Python POST request does not return data from the backend

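Originally I scraped this web page the normal way, but it was updated recently, so now my request returns JavaScript in the HTML body. Because of that, I decided to change my code to pull the data from the backend with a POST request instead.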

The page I am trying to scrape is https://www.tesco.ie/groceries/en-IE/shop/fresh-food/fresh-fruit/all?page=1&count=48, and my code looks like this:

import json
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/112.0.0.0 Safari/537.36 Edg/112.0.1722.64',
    'Content-Type': 'application/json; charset=UTF-8',
    'Accept-Language': 'en-US,en;q=0.9',
    'X-Requested-With': 'XMLHttpRequest',
}


payload = {
    "resources": [
        {"type": "appState",
        "params": {},
        "hash": "8608229003782371"},

        {"type": "trolleyContents",
        "params": {},
        "hash": "2574718136506441"},

        {"type": "productsByCategory",
        "params": {
            "aisle": "all",
            "department": "fresh-fruit",
            "query": {
                "count": "48",
                "page": "1"},
            "superdepartment": "fresh-food"
        },
        "hash": "4571228679394986"}
    ],

    "sharedParams": {
        "superdepartment": "fresh-food",
        "department": "fresh-fruit",
        "aisle": "all",
        "referer": "/groceries/en-IE/shop/fresh-food/fresh-fruit/all?page=5&count=48",
        "query": {
            "count": "48",
            "page": "1"
        }
    },
    "requiresAuthentication": "false"
}

server_url = 'https://www.tesco.ie/groceries/en-IE/resources'

with requests.Session() as s:

    data = s.post(
        server_url,
        headers=headers,
        data=json.dumps(payload)
    )
    print(data)

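This code returns a response which I know just means the server is not fulfilling my request, but I am not sure why. My payload is a direct copy of the request payload from the page's XHR when it loads. I am not very experienced with this, so I am wondering whether I have missed something.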
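To narrow this down, it may help to look at more than the bare `Response` object that `print(data)` shows. Below is a minimal sketch, assuming a requests-style session and response; `describe_response` and `post_and_describe` are hypothetical helper names introduced here, not part of `requests`. It reports the status code, the content type, and the start of the body, and parses the body only when the server labels it as JSON.

```python
import json


def describe_response(resp, body_preview=300):
    """Summarise a requests-style response for debugging.

    `resp` is assumed to expose `.status_code`, `.headers` and `.text`,
    as `requests.Response` does.
    """
    content_type = resp.headers.get("Content-Type", "")
    summary = {
        "status_code": resp.status_code,
        "content_type": content_type,
        "body_preview": resp.text[:body_preview],
        "json": None,
    }
    # Only try to parse when the server says the body is JSON.
    if "json" in content_type.lower():
        try:
            summary["json"] = json.loads(resp.text)
        except ValueError:
            # Labelled as JSON but not parseable; keep the raw preview only.
            pass
    return summary


def post_and_describe(session, url, headers, payload):
    """POST the payload exactly as in the question, then summarise the reply.

    `session` is assumed to be a `requests.Session` (or anything with a
    compatible `post` method).
    """
    resp = session.post(url, headers=headers, data=json.dumps(payload))
    return describe_response(resp)
```

Called as `post_and_describe(s, server_url, headers, payload)` inside the `with requests.Session() as s:` block, this would show whether the server answered with an error status, an HTML page, or actual JSON.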

Sorry if I am just missing something simple. Thanks for your help!

python python-3.x post python-requests