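I'm using Proxy Broker to scrape some proxies. Sometimes a proxy is already dead by the time it has been scraped, so I want to check them before using them. I wrote a program with Python requests to do that. Here it is: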
import time
import random
import requests
lines = open('not_checked.txt').read().splitlines()
check = random.choice(lines)
yaya = {
    check
}
for x in range(0, 10):
    requests.get('https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F', proxies=yaya)
    r.status_code
    print(status_code)
    if status_code == 200:
        f=open("checked_proxies.txt", "a+")
        f.write(proxies)
    else:
        time.sleep(.001)
But this throws "'set' object has no attribute 'get'". I looked the error up online, and it said this happens because I used a comma instead of a colon. So I tried:
requests.get('https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F': proxies=yaya)
and got a syntax error. What is going on?
proxies needs to be a dictionary. This is right there in the docs:
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
Your yaya is a set, not a dict.
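To make the difference concrete, here is a minimal sketch. The address is a placeholder taken from the docs example, and the assumption that the proxy speaks plain HTTP (hence the http:// prefix on both keys) is mine, not something stated in the question. Braces around a bare value build a set, while braces around key: value pairs build a dict, and only the dict has the .get() method named in the error message.

```python
# Placeholder standing in for one line of not_checked.txt (assumed "host:port").
check = "10.10.1.10:3128"

# Braces around a bare value create a set, as in the question.
as_set = {check}

# Braces around key: value pairs create a dict, which is what proxies= expects.
# Assumption: the proxy speaks plain HTTP, so both schemes use an http:// URL.
as_dict = {
    "http": "http://" + check,
    "https": "http://" + check,
}

print(type(as_set).__name__)    # set
print(type(as_dict).__name__)   # dict
print(hasattr(as_set, "get"))   # False
print(hasattr(as_dict, "get"))  # True
```

The colon belongs inside the braces, between each key and its value. Between the arguments of requests.get() the separator is still a comma.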
It definitely needs to be a comma there, not a colon. Example snippet:
import time
import requests

URL = 'https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F'

# Each entry maps a URL scheme to a proxy URL; replace the placeholders.
proxies = [
    {
        "http": "http://XXX.XXX.XXX.XXX:XXXX",
        "https": "http://XXX.XXX.XXX.XXX:XXXX",
    },
    {
        "http": "http://XXX.XXX.XXX.XXX:XXXX",
        "https": "http://XXX.XXX.XXX.XXX:XXXX",
    },
    {
        "http": "http://XXX.XXX.XXX.XXX:XXXX",
        "https": "http://XXX.XXX.XXX.XXX:XXXX",
    }
]

for proxy in proxies:
    print("Requesting with %s and %s" % (proxy['http'], proxy['https']))
    try:
        # A timeout keeps a dead proxy from hanging the loop.
        r = requests.get(URL, proxies=proxy, timeout=10)
    except requests.RequestException as exc:
        # Dead or refused proxies raise instead of returning a status code.
        print("Failed: %s" % exc)
        continue
    print("Loaded")
    print(r.status_code)
    if r.status_code == 200:
        # f.write() needs a string, so write the proxy URL rather than the dict.
        with open("checked_proxies.txt", "a+") as f:
            f.write(proxy['http'] + "\n")
    else:
        time.sleep(.001)
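The snippet hardcodes its proxy list, whereas the question reads candidates from a file. A possible way to bridge the two is sketched below. It assumes each line of the file is a bare host:port and that the proxies speak plain HTTP. The helper names to_proxies and load_proxies are made up for illustration, and the demo writes a throwaway file with documentation-range addresses rather than real proxies.

```python
import os
import tempfile


def to_proxies(line, scheme="http"):
    """Turn a 'host:port' line into a requests-style proxies dict.

    Assumption: the proxy itself speaks plain HTTP, so both keys
    point at the same http:// URL.
    """
    proxy_url = "%s://%s" % (scheme, line.strip())
    return {"http": proxy_url, "https": proxy_url}


def load_proxies(path):
    """Read one proxy per line, skipping blank lines."""
    with open(path) as fh:
        return [to_proxies(line) for line in fh if line.strip()]


# Demo with a throwaway file holding two documentation-range addresses.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("192.0.2.10:3128\n\n192.0.2.11:8080\n")

proxies = load_proxies(tmp.name)
os.remove(tmp.name)

for proxy in proxies:
    print(proxy["http"])
```

Each dict returned by load_proxies can be passed as proxies= in the checking loop, so pointing it at not_checked.txt would replace the hand-written list.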