My function is executed by threads:
def getdata(self, page, ...):
    tries = 10
    for n in range(tries):
        try:
            ...
            datarALL = []
            url = 'http://website/...'.format(...)
            responsedata = requests.get(url, data=data, headers=self.hed, verify=False)
            responsedata.raise_for_status()
            if responsedata.status_code == 200:  # 200 for successful call
                ...
                if ...
                    break
        except (ChunkedEncodingError, requests.exceptions.HTTPError) as e:
            print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
            if n == tries - 1:
                print("page {0} could not be imported. Max retried reached.".format(page))
                os._exit(1)  # One thread max retried - close all threads and
    return datarALL
It is called as follows:
with ThreadPoolExecutor(max_workers=num_of_workers) as executor:
    futh = [(executor.submit(self.getdata, page,...)) for page in pages]
    for data in as_completed(futh):
        datarALL.extend(data.result())
print("Finished generateing data.")
return datarALL
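Sometimes I get an unexpected exception: ConnectionResetError: [Errno 104] Connection reset by peer
which shuts down my program. I want to change my code so that, whichever exception occurs, the thread retries until if n == tries - 1: is reached. I don't want my threads to shut down because of a random exception.
I read the Requests exceptions info page, but I don't see how to catch all exceptions without listing each of them manually. Is there a generic way to do this?
Basically, I want something like: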
        except (ALL EXCEPTIONS from Requests) as e:
            print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
            if n == tries - 1:
                print("page {0} could not be imported. Max retried reached.".format(page))
                os._exit(1)  # One thread max retried - close all threads and
    return datarALL
How can I do that?
Edit: using
        except Exception as e:
            print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
            if n == tries - 1:
                print("page {0} could not be imported. Max retried reached.".format(page))
                os._exit(1)  # One thread max retried - close all threads and
    return datarALL
does not catch it. It gives me this:
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.5/site-packages/urllib3/response.py", line 331, in _error_catcher
    yield
  File "/home/ubuntu/.local/lib/python3.5/site-packages/urllib3/response.py", line 640, in read_chunked
    chunk = self._handle_chunk(amt)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/urllib3/response.py", line 595, in _handle_chunk
    returned_chunk = self._fp._safe_read(self.chunk_left)
  File "/usr/lib/python3.5/http/client.py", line 607, in _safe_read
    chunk = self.fp.read(min(amt, MAXAMOUNT))
  File "/usr/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer
The loop does not retry. The run is terminated...
Edit 2:
        except requests.exceptions.RequestException as e:
            print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
            if n == tries - 1:
                print("page {0} could not be imported. Max retried reached.".format(page))
                os._exit(1)  # One thread max retried - close all threads and
    return datarALL
also does not catch the ConnectionResetError: [Errno 104] Connection reset by peer listed above.
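Catching all exceptions is generally considered bad practice, because it can hide problems.
That said, Python exceptions use inheritance: catching a base exception catches every exception that inherits from it.
See the Python standard exception hierarchy for details.
You can see that the root exception is BaseException, but it should never be caught directly, because doing so also catches Ctrl+C interrupts and generator exits. If you want to catch every exception type, catch Exception instead.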
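As a quick check of the hierarchy described above, here is a minimal sketch that uses only built-in exception classes. The helper name caught_by is made up for this illustration and is not part of any library.

```python
def caught_by(exc_type, handler_type):
    """Return True if `except handler_type:` would catch an exc_type instance."""
    return issubclass(exc_type, handler_type)


# ConnectionResetError -> ConnectionError -> OSError -> Exception -> BaseException
print(caught_by(ConnectionResetError, ConnectionError))
print(caught_by(ConnectionResetError, Exception))

# Ctrl+C and generator exit derive from BaseException, not from Exception,
# so `except Exception:` leaves them alone.
print(caught_by(KeyboardInterrupt, Exception))
print(caught_by(GeneratorExit, Exception))
print(caught_by(KeyboardInterrupt, BaseException))
```

The first two checks and the last one are true, while the two checks against Exception for KeyboardInterrupt and GeneratorExit are false, which is why Exception is the usual choice for a broad handler.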
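You may also want to catch only the exceptions raised by requests. In that case, according to the doc, this can be done by catching the base exception of the requests module: RequestException.
If you want to catch both requests exceptions and ConnectionResetError (which is a Python standard exception), you have to specify both in the except clause: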
except (requests.exceptions.RequestException,
        ConnectionResetError) as err:
    # some code
Or, if you want to be less specific and catch all possible connection errors, you can use ConnectionError instead of ConnectionResetError. (See the exceptions hierarchy.)
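To illustrate the broader handler, the sketch below raises a few built-in exceptions and reports which except clause catches each one. The function name raise_and_catch is hypothetical. Note that this is the built-in ConnectionError; requests also defines requests.exceptions.ConnectionError, which is a separate class deriving from RequestException.

```python
def raise_and_catch(exc):
    """Raise exc and report which handler caught it."""
    try:
        raise exc
    except ConnectionError as err:
        # Built-in ConnectionError covers ConnectionResetError, BrokenPipeError, etc.
        return "ConnectionError handler: " + type(err).__name__
    except Exception as err:
        # Everything else that is an ordinary error ends up here.
        return "generic handler: " + type(err).__name__


for exc in (ConnectionResetError(104, "Connection reset by peer"),
            BrokenPipeError(32, "Broken pipe"),
            ValueError("not a connection problem")):
    print(raise_and_catch(exc))
```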
Finally, you may want to react differently to each exception type. In that case you can do this:
try:
    # something
    pass
except ConnectionError as err:
    # manage connection errors
    pass
except requests.exceptions.RequestException as err:
    # manage requests errors
    pass
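Putting this together with the retry loop from the question, a minimal sketch could look like the following. The names call_with_retries, retry_on, delay and flaky are assumptions made for this example, and flaky only stands in for the real HTTP call so that the block runs without a network.

```python
import time


def call_with_retries(func, tries=10, delay=0.0, retry_on=(ConnectionError,)):
    """Call func() up to `tries` times, retrying on the given exception types."""
    for n in range(tries):
        try:
            return func()
        except retry_on as err:
            # Use only `err` here: a response object may not exist
            # if the call itself raised before returning one.
            print("run #{0} failed: {1!r}. Retry.".format(n, err))
            if n == tries - 1:
                raise  # out of attempts: let the caller decide what to do
            time.sleep(delay)


# Stand-in for the real request: fails twice, then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionResetError(104, "Connection reset by peer")
    return ["row"]


result = call_with_retries(flaky, tries=5)
print(result, len(attempts))
```

With requests, one option would be to pass retry_on=(requests.exceptions.RequestException, ConnectionError). Re-raising on the last attempt instead of calling os._exit(1) is a design choice of this sketch: when the function runs in a ThreadPoolExecutor, the exception then surfaces from future.result() in the calling thread, which can shut things down in one place.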