Incomplete content when fetching with requests.get()

Question  Votes: 0  Answers: 1

Every time, my code ends up with a URL like this: https://www.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Fq%3Dyellow%2520car%2520on%2520T-shirt%26rs%3Dtyped%26term_meta%5B%5D%3Dyellow%257Ctyped%26term_meta%5B%5D%3Dcar%257Ctyped%26term_meta%5B%5D%3Don%257Ctyped%26term_meta%5B%5D%3DT-shirt%257Ctyped&data=%7B%22options%22%3A%20%7B%22isPrefetch%22%3A%20false%2C%20%22auto_correction_disabled%22%3A%20false%2C%20%22query%22%3A%20%22yellow%20car%20on%20T-shirt%22%2C%20%22redux_normalize_feed%22%3A%20true%2C%20%22rs%22%3A%20%22typed%22%2C%20%22scope%22%3A%20%22pins%22%2C%20%22page_size%22%3A%2050%2C%20%22bookmarks%22%3A%20%5Bnull%5D%7D%2C%20%22context%22%3A%20null%7D&_=1729500369393

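It contains the results of a query I ran on Pinterest. The link returns a dictionary (JSON), but when I fetch it with requests.get(), the content is incomplete and many of the images are missing.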

import json
import requests

url='https://www.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Fq%3Dyellow%2520car%2520on%2520T-shirt%26rs%3Dtyped%26term_meta%5B%5D%3Dyellow%257Ctyped%26term_meta%5B%5D%3Dcar%257Ctyped%26term_meta%5B%5D%3Don%257Ctyped%26term_meta%5B%5D%3DT-shirt%257Ctyped&data=%7B%22options%22%3A%20%7B%22isPrefetch%22%3A%20false%2C%20%22auto_correction_disabled%22%3A%20false%2C%20%22query%22%3A%20%22yellow%20car%20on%20T-shirt%22%2C%20%22redux_normalize_feed%22%3A%20true%2C%20%22rs%22%3A%20%22typed%22%2C%20%22scope%22%3A%20%22pins%22%2C%20%22page_size%22%3A%2050%2C%20%22bookmarks%22%3A%20%5Bnull%5D%7D%2C%20%22context%22%3A%20null%7D&_=1729500369393'

response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON data
    data = json.loads(response.text)

When I compare the contents of the data variable with what the browser shows, I find many differences and a lot of missing data.
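
One way to make that comparison concrete is to save the JSON shown in the browser and check which image URLs are absent from the script's payload. The following is only a minimal sketch: the helper names collect_image_urls and compare_payloads are hypothetical, and it assumes nothing about Pinterest's response layout beyond it being nested JSON in which image URLs appear as plain strings.

```python
import re

# Strings that look like image URLs (an assumption about how images appear in the payload).
IMAGE_URL = re.compile(
    r"^https?://\S+\.(?:jpg|jpeg|png|gif|webp)(?:\?\S*)?$", re.IGNORECASE
)


def collect_image_urls(node):
    """Recursively gather every string in a nested JSON value that looks like an image URL."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= collect_image_urls(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_image_urls(item)
    elif isinstance(node, str) and IMAGE_URL.match(node):
        found.add(node)
    return found


def compare_payloads(browser_data, script_data):
    """Return image URLs present in the browser payload but absent from the script payload."""
    return collect_image_urls(browser_data) - collect_image_urls(script_data)
```

Calling compare_payloads with the parsed browser JSON and the data variable would then list exactly which images the script did not receive.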

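I tried adding timeout=10, thinking the problem was caused by the large amount of content behind the link, but the result was the same. I also tried other libraries such as urllib.request and http.client and ran into the same problem, so perhaps I am using them incorrectly.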

http https python-requests urllib
1 Answer
0 votes

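I think this happens because the web service can tell that you are fetching the data from code rather than from a browser.

Try adding headers to your request that mimic a browser, including a User-Agent and any relevant cookies. That makes the request look to the web service as if it came from a browser.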

import json
import requests

url = 'https://www.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Fq%3Dyellow%2520car%2520on%2520T-shirt%26rs%3Dtyped%26term_meta%5B%5D%3Dyellow%257Ctyped%26term_meta%5B%5D%3Dcar%257Ctyped%26term_meta%5B%5D%3Don%257Ctyped%26term_meta%5B%5D%3DT-shirt%257Ctyped&data=%7B%22options%22%3A%20%7B%22isPrefetch%22%3A%20false%2C%20%22auto_correction_disabled%22%3A%20false%2C%20%22query%22%3A%20%22yellow%20car%20on%20T-shirt%22%2C%20%22redux_normalize_feed%22%3A%20true%2C%20%22rs%22%3A%20%22typed%22%2C%20%22scope%22%3A%20%22pins%22%2C%20%22page_size%22%3A%2050%2C%20%22bookmarks%22%3A%20%5Bnull%5D%7D%2C%20%22context%22%3A%20null%7D&_=1729500369393'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Referer': 'https://www.pinterest.com/',
}

response = requests.get(url, headers=headers)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON data
    data = json.loads(response.text)
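
If headers alone are not enough, the cookie part of this suggestion could be sketched with requests.Session, which keeps cookies from an earlier response and sends them on later requests. The sketch below also builds the query string from plain values instead of a hand-encoded URL. The helper names build_search_url and fetch_search are hypothetical, the option values are copied from the URL above, and the term_meta[] entries and the trailing _ timestamp are left out for brevity. Whether Pinterest returns the full result set to such a request is an assumption that has not been verified.

```python
import json
from urllib.parse import quote, urlencode

import requests

BASE_URL = "https://www.pinterest.com/resource/BaseSearchResource/get/"

# Browser-like headers, the same values as in the snippet above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Referer": "https://www.pinterest.com/",
}


def build_search_url(query, page_size=50):
    """Build the search URL from plain values (options mirror the URL in the question)."""
    options = {
        "isPrefetch": False,
        "auto_correction_disabled": False,
        "query": query,
        "redux_normalize_feed": True,
        "rs": "typed",
        "scope": "pins",
        "page_size": page_size,
        "bookmarks": [None],
    }
    params = {
        # The inner query is percent-encoded once here and again by urlencode,
        # which matches the double encoding (%2520) seen in the original URL.
        "source_url": f"/search/pins/?q={quote(query)}&rs=typed",
        "data": json.dumps({"options": options, "context": None}),
    }
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


def fetch_search(query, timeout=10):
    """Visit the home page first so the session collects cookies, then request the search data."""
    with requests.Session() as session:
        session.headers.update(HEADERS)
        session.get("https://www.pinterest.com/", timeout=timeout)
        response = session.get(build_search_url(query), timeout=timeout)
        response.raise_for_status()
        return response.json()
```

With this in place, data = fetch_search("yellow car on T-shirt") would replace the manual requests.get() call. If the result is still incomplete, copying the cookies and headers of the real request from the browser's developer tools is the next thing to compare.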