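My code generally works, but I am having trouble writing a tar file to a remote filesystem. The code below is supposed to serialize a large dictionary to JSON and write it to a compressed file object. The named temporary file is optional, since I could also write to a permanent file on the filesystem. fs is a gcsfs.GCSFileSystem object. It supports a put method that copies a file to Google Cloud Storage.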
import io
import json
import tarfile
from tempfile import NamedTemporaryFile

def write_main(fs, remote_fp, data):
    """
    input -
        fs      filesystem object
        fp      filepath or path object
        data    object
    output - bool
    ref: https://stackoverflow.com/questions/39109180/dumping-json-directly-into-a-tarfile
    """
    tmp_file = NamedTemporaryFile()
    filename = tmp_file.name
    with io.BytesIO() as out_stream, tarfile.open(filename, 'w|gz', out_stream) as tar_file:
        out_stream.write(json.dumps(data).encode())
        tar_file.size = out_stream.tell()
        out_stream.seek(0)
        tar_file.addfile(tar_file, out_stream)
    fs.put(filename, remote_fp)
When I try to test the function, I get the following error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-23-020281a8b588> in <module>
      3     tar_file.size = out_stream.tell()
      4     out_stream.seek(0)
----> 5     tar_file.addfile(tar_file, out_stream)
      6
      7 fs.put(filename, remote_fp)

~/anaconda3/lib/python3.7/tarfile.py in addfile(self, tarinfo, fileobj)
   1964         tarinfo = copy.copy(tarinfo)
   1965
-> 1966         buf = tarinfo.tobuf(self.format, self.encoding, self.errors)
   1967         self.fileobj.write(buf)
   1968         self.offset += len(buf)

AttributeError: 'TarFile' object has no attribute 'tobuf'
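I think you are passing the wrong first argument to addfile: you pass tar_file, which is the TarFile itself, but it should be a TarInfo object. That is why the "no attribute 'tobuf'" error is raised. The docstring of addfile in tarfile.py says: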
def addfile(self, tarinfo, fileobj=None):
    """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is
       given, tarinfo.size bytes are read from it and added to the archive.
       You can create TarInfo objects using gettarinfo().
       On Windows platforms, `fileobj' should always be opened with mode
       'rb' to avoid irritation about the file size.
    """
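To illustrate the point about TarInfo, here is a minimal sketch of one way the JSON could be added as an archive member. The function name dump_json_to_tar and the member_name parameter are hypothetical names chosen for this example, not part of the question's code. It also assumes the archive is written straight to a local path with mode 'w:gz' rather than streaming through the BytesIO buffer, and it builds the TarInfo directly with tarfile.TarInfo, since there is no file on disk for gettarinfo() to inspect.

```python
import io
import json
import tarfile


def dump_json_to_tar(data, archive_path, member_name="data.json"):
    """Serialize `data` to JSON and store it as a single member of a gzipped tar.

    `archive_path` is a local path; `member_name` is the (hypothetical) name
    the JSON payload gets inside the archive.
    """
    payload = json.dumps(data).encode("utf-8")

    # addfile() needs a TarInfo describing the member, not the TarFile itself.
    info = tarfile.TarInfo(name=member_name)
    info.size = len(payload)  # addfile() reads exactly info.size bytes

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.addfile(info, io.BytesIO(payload))
    return archive_path
```

Once the local archive exists, it could be uploaded with fs.put(archive_path, remote_fp) as in the question.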