Extracting a Google Drive zip from a Google Colab notebook

Question (votes: 5, answers: 5)

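I already have a zip of a dataset (2K images) on Google Drive, and I need to use it in an ML training algorithm. The code below retrieves the content as a string: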

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import io
import zipfile
# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download a file based on its file ID.
#
# A file ID looks like: laggVyWshwcyP6kEI-y_W3P8D26sz
file_id = '1T80o3Jh3tHPO7hI5FBxcX-jFnxEuUE9K' #-- Updated File ID for my zip
downloaded = drive.CreateFile({'id': file_id})
#print('Downloaded content "{}"'.format(downloaded.GetContentString(encoding='cp862')))

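But I need to extract it and store it in a separate directory, because that makes the dataset easier to work with (and to understand).

I tried to extract it further, but got a "not a zip file" error: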

dataset = io.BytesIO(downloaded.encode('cp862'))
zip_ref = zipfile.ZipFile(dataset, "r")
zip_ref.extractall()
zip_ref.close()

Google Drive Dataset

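Note: the dataset link is for reference only. I have already downloaded this zip file to my Google Drive, and I am referring only to the file in my Drive.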

python machine-learning google-drive-api google-colaboratory zipfile
5 Answers
8
votes

You can simply use this:

!unzip file_location

4
votes

To extract a Google Drive zip from a Google Colab notebook:

import zipfile
from google.colab import drive

drive.mount('/content/drive/')

zip_ref = zipfile.ZipFile("/content/drive/My Drive/ML/DataSet.zip", 'r')
zip_ref.extractall("/tmp")
zip_ref.close()
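
The same extraction can also be written with a context manager, so the archive is closed even if extraction fails partway. This is only a minimal sketch: the helper name `extract_zip` and the destination `/content/dataset` are hypothetical, and the Drive path is the one assumed in the answer above.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the archive's member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target directory if needed
    # The with-block closes the archive even if extraction raises.
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest)
        return archive.namelist()


# Hypothetical Colab usage after drive.mount('/content/drive/'):
# names = extract_zip("/content/drive/My Drive/ML/DataSet.zip", "/content/dataset")
```

Extracting into a directory on the Colab VM rather than back into the mounted Drive keeps the dataset in its own folder, which is what the question asks for.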

2
votes

Mount GDrive:

from google.colab import drive
drive.mount('/content/gdrive')

Open the link -> copy the authorization code -> paste it into the prompt and press "Enter"

Check GDrive access:

!ls "/content/gdrive/My Drive"

Unzip the file from GDrive (-q means quiet):

!unzip -q "/content/gdrive/My Drive/dataset.zip"
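
If you would rather stay in Python than call the shell, the standard library's `shutil.unpack_archive` does roughly the same job and lets you pick a separate target directory. A rough sketch, where `unpack_and_count` and the target directory are hypothetical names introduced for illustration:

```python
import shutil
from pathlib import Path


def unpack_and_count(zip_path, dest_dir):
    """Unpack an archive into dest_dir and return how many files dest_dir now holds."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format from the file extension (".zip" here).
    shutil.unpack_archive(str(zip_path), str(dest))
    # Count every regular file under dest, including any that were already there.
    return sum(1 for p in dest.rglob("*") if p.is_file())


# Hypothetical Colab usage after mounting:
# n = unpack_and_count("/content/gdrive/My Drive/dataset.zip", "/content/dataset")
```

Comparing the returned count with the expected number of images is a quick sanity check that the whole dataset was unpacked.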

1
votes

Use GetContentFile() instead of GetContentString(). It saves the file to disk instead of returning a string.

downloaded.GetContentFile('images.zip') 

Then you can extract it with unzip.
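
Once the file is on disk, it can be worth confirming that it really is a zip before extracting, since that is the error reported in the question. A small sketch using only the standard library; `extract_if_zip` is a hypothetical helper, and `images.zip` is the filename assumed in the line above:

```python
import zipfile


def extract_if_zip(path, dest_dir="."):
    """Extract path into dest_dir, or raise ValueError if it is not a zip archive."""
    # is_zipfile inspects the file's contents rather than trusting its name.
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path!r} is not a zip archive; was it saved as text?")
    with zipfile.ZipFile(path) as archive:
        archive.extractall(dest_dir)
        return len(archive.namelist())


# Hypothetical usage after downloaded.GetContentFile('images.zip'):
# count = extract_if_zip("images.zip", "dataset")
```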


1
votes

A simple way to connect

1) Authenticate first

from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()

2) Install the FUSE wrapper for Google Drive

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

3) Verify the credentials

import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

4) Create a directory ('gdrive') to mount the drive on in Colab, and check that it works

!mkdir gdrive
!google-drive-ocamlfuse gdrive
!ls gdrive
%cd gdrive