Pickling a subclass that adds a new attribute does not seem to include the new attribute. For example:
import pickle
from pprint import pprint

import requests


class MySession(requests.Session):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.details = False


data = MySession()
file = "/tmp/pickle"
with open(file, "wb") as f:
    print(f"Saving session {data} to {file} with {data.details}")
    pprint(data.__dict__)
    pickle.dump(data, f)
print("Validating")
with open(file, "rb") as f:
    newdata = pickle.load(f)
    pprint(newdata.__dict__)
    print(data.details)
    print(newdata.details)
This produces the following output, ending in an exception:
Saving session <__main__.MySession object at 0x7f0976cb9400> to /tmp/pickle with False
{'adapters': OrderedDict([('https://',
<requests.adapters.HTTPAdapter object at 0x7f0975dca4e0>),
('http://',
<requests.adapters.HTTPAdapter object at 0x7f0976be9ca0>)]),
'auth': None,
'cert': None,
'cookies': <RequestsCookieJar[]>,
'details': False,
'headers': {'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate, br', 'Accept': '*/*', 'Connection': 'keep-alive'},
'hooks': {'response': []},
'max_redirects': 30,
'params': {},
'proxies': {},
'stream': False,
'trust_env': True,
'verify': True}
Validating
{'adapters': OrderedDict([('https://',
<requests.adapters.HTTPAdapter object at 0x7f09759c1010>),
('http://',
<requests.adapters.HTTPAdapter object at 0x7f0975a4e960>)]),
'auth': None,
'cert': None,
'cookies': <RequestsCookieJar[]>,
'headers': {'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate, br', 'Accept': '*/*', 'Connection': 'keep-alive'},
'hooks': {'response': []},
'max_redirects': 30,
'params': {},
'proxies': {},
'stream': False,
'trust_env': True,
'verify': True}
False
Traceback (most recent call last):
  File "/tmp/repro.py", line 22, in <module>
    print(newdata.details)
          ^^^^^^^^^^^^^^^
AttributeError: 'MySession' object has no attribute 'details'
The details attribute of MySession is not saved in the pickle file. But why?
This happens because the requests.Session class defines custom __getstate__ and __setstate__ methods:
    def __getstate__(self):
        state = {attr: getattr(self, attr, None) for attr in self.__attrs__}
        return state

    def __setstate__(self, state):
        for attr, value in state.items():
            setattr(self, attr, value)
Here self.__attrs__ is a class variable that lists the only attributes included in the pickled state. It is defined as follows:
class Session(SessionRedirectMixin):
    """A Requests session.

    Provides cookie persistence, connection-pooling, and configuration.

    Basic Usage::

      >>> import requests
      >>> s = requests.Session()
      >>> s.get('https://httpbin.org/get')
      <Response [200]>

    Or as a context manager::

      >>> with requests.Session() as s:
      ...     s.get('https://httpbin.org/get')
      <Response [200]>
    """

    __attrs__ = [
        "headers",
        "cookies",
        "auth",
        "proxies",
        "hooks",
        "params",
        "verify",
        "cert",
        "adapters",
        "stream",
        "trust_env",
        "max_redirects",
    ]
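To see the effect in isolation, here is a minimal sketch that runs without requests installed. Base is a hypothetical stand-in that copies the __getstate__/__setstate__ pattern quoted above with a shortened __attrs__ list; Child plays the role of MySession. It illustrates the mechanism only and is not the real Session class.

```python
import pickle


class Base:
    # Hypothetical stand-in for requests.Session: only the names listed
    # in __attrs__ end up in the pickled state.
    __attrs__ = ["headers", "verify"]

    def __init__(self):
        self.headers = {}
        self.verify = True

    def __getstate__(self):
        return {attr: getattr(self, attr, None) for attr in self.__attrs__}

    def __setstate__(self, state):
        for attr, value in state.items():
            setattr(self, attr, value)


class Child(Base):
    def __init__(self):
        super().__init__()
        self.details = False  # not listed in __attrs__, so never pickled


original = Child()
restored = pickle.loads(pickle.dumps(original))

pickled_keys = sorted(original.__getstate__())
has_details = hasattr(restored, "details")
print(pickled_keys)  # ['headers', 'verify']
print(has_details)   # False
```

Unpickling does not call __init__, so nothing recreates details on the restored object, which is why the AttributeError appears.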
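So one solution is to override __getstate__ in the subclass as shown below. The inherited __setstate__ already restores every key it finds in the state dict, so it does not need to be overridden: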
    def __getstate__(self):
        state = super().__getstate__()
        state['details'] = self.details
        return state
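A round-trip check of this fix, sketched with a hypothetical stand-in Base instead of the real requests.Session so that it runs on its own. FixedChild applies the __getstate__ override; AttrsChild shows an alternative that should work with the same pattern, assuming the base __getstate__ reads self.__attrs__ as quoted earlier: extend the class-level __attrs__ list so the inherited method picks up the new name.

```python
import pickle


class Base:
    # Hypothetical stand-in for requests.Session (same pickling pattern).
    __attrs__ = ["headers", "verify"]

    def __init__(self):
        self.headers = {}
        self.verify = True

    def __getstate__(self):
        return {attr: getattr(self, attr, None) for attr in self.__attrs__}

    def __setstate__(self, state):
        for attr, value in state.items():
            setattr(self, attr, value)


class FixedChild(Base):
    def __init__(self):
        super().__init__()
        self.details = False

    def __getstate__(self):
        # Add the extra attribute to the state built by the base class.
        state = super().__getstate__()
        state["details"] = self.details
        return state


class AttrsChild(Base):
    # Alternative: extend the attribute list instead of overriding a method.
    __attrs__ = Base.__attrs__ + ["details"]

    def __init__(self):
        super().__init__()
        self.details = False


results = {}
for cls in (FixedChild, AttrsChild):
    obj = cls()
    obj.details = True  # differ from the default to show the value is carried over
    restored = pickle.loads(pickle.dumps(obj))
    results[cls.__name__] = restored.details

print(results)  # {'FixedChild': True, 'AttrsChild': True}
```

With the real library, the second variant would presumably read __attrs__ = requests.Session.__attrs__ + ["details"]; which of the two to prefer is mostly a matter of taste.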