How to download a file from Google Drive using Python and Drive API v3

Question (votes: 0, answers: 2)

I am trying to download a file from Google Drive to my local system with a Python script, but I get a "Forbidden" error when I run it. The script is as follows:

import requests

url = "https://www.googleapis.com/drive/v3/files/1wPxpQwvEEOu9whmVVJA9PzGPM2XvZvhj?alt=media&export=download"

querystring = {"alt":"media","export":"download"}

headers = {
    'Authorization': "Bearer TOKEN",

    'Host': "www.googleapis.com",
    'Accept-Encoding': "gzip, deflate",
    'Connection': "keep-alive",
    }

response = requests.request("GET", url, headers=headers, params=querystring)

print(response.url)
#
import wget
import os
from os.path import expanduser


myhome = expanduser("/home/sunarcgautam/Music")
### set working dir
os.chdir(myhome)

url = "https://www.googleapis.com/drive/v3/files/1wPxpQwvEEOu9whmVVJA9PzGPM2XvZvhj?alt=media&export=download"
print('downloading ...')
wget.download(response.url)

This script fails with a Forbidden error. Am I doing something wrong in it?

I also tried another script that I found on the Google Developers page:

import auth
import httplib2
SCOPES = "https://www.googleapis.com/auth/drive.scripts"
CLIENT_SECRET_FILE = "client_secret.json"
APPLICATION_NAME = "test_Download"
authInst = auth.auth(SCOPES, CLIENT_SECRET_FILE, APPLICATION_NAME)
credentials = authInst.getCredentials()
http = credentials.authorize(httplib2.Http())
drive_serivce = discovery.build('drive', 'v3', http=http)

file_id = '1Af6vN0uXj8_qgqac6f23QSAiKYCTu9cA'
request = drive_serivce.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print ("Download %d%%." % int(status.progress() * 100))

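This script gives me a URL mismatch error.

So what should the redirect URL in the Google console credentials be? Or is there another way to solve this? Do I have to authorize my Google console app with Google for both scripts? If so, what is the process for authorizing the app? I have not found any documentation on it.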

python google-drive-api
2 Answers
26 votes

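To make requests to a Google API, the workflow is essentially as follows:

  1. Go to the developer console and log in if you have not already.
  2. Create a Cloud Platform project.
  3. Enable for your project the APIs you want your application to use (for example, the Google Drive API).
  4. Create and download OAuth 2.0 Client ID credentials, which authorize your application to use the enabled APIs.
  5. Go to the OAuth consent screen, edit your app, and add your scope (scope: https://www.googleapis.com/auth/drive.readonly). Choose Internal or External according to your needs, and ignore the warnings for now, if any.
  6. To obtain a valid token for making API requests, the application goes through an OAuth flow to receive an authorization token (because user consent is required).
  7. During the OAuth flow, the user is redirected to your OAuth consent screen and asked to approve or deny access to the scopes your application requests.
  8. If consent is granted, your application receives an authorization token.
  9. Pass the token in your requests to the API endpoints you are authorized for. [2]
  10. Build a Drive service to make API requests (you will need a valid token). [1]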
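
As a rough illustration of step 9, the sketch below assembles an authorized download request for the `files` endpoint with `alt=media`. The helper name `build_download_request` and the placeholder token are assumptions for illustration; a real token has to come from the OAuth flow described above.

```python
from urllib.parse import urlencode

DRIVE_FILES_ENDPOINT = "https://www.googleapis.com/drive/v3/files"


def build_download_request(file_id, access_token):
    """Return (url, headers) for downloading a file's content.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    # alt=media asks the Drive API for the file content instead of metadata.
    query = urlencode({"alt": "media"})
    url = f"{DRIVE_FILES_ENDPOINT}/{file_id}?{query}"
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers


# File ID from the question; "TOKEN" is a placeholder, not a usable token.
url, headers = build_download_request("1wPxpQwvEEOu9whmVVJA9PzGPM2XvZvhj", "TOKEN")
print(url)
```

The resulting pair could be passed to `requests.get(url, headers=headers)`. A literal placeholder such as `TOKEN`, or a token granted for an unsuitable scope, will not be accepted by the API.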

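Note:

The available methods for the Files resource of Drive API v3 are listed in the Drive API reference.

When using the Google API Client for Python, you can use export_media() or get_media(), as described in the Google API Client for Python documentation.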
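
Which of the two applies depends on the file type: Google Workspace documents (Docs, Sheets, Slides) have to be exported to another format, while ordinary uploaded files are fetched directly. A minimal sketch of that decision, assuming the hypothetical helper name `pick_download_method` and a small, non-exhaustive MIME type table:

```python
# Non-exhaustive mapping from Google Workspace MIME types to an export format.
# The export formats chosen here are examples, not the only valid options.
EXPORT_FORMATS = {
    "application/vnd.google-apps.document": "application/pdf",
    "application/vnd.google-apps.spreadsheet": "text/csv",
    "application/vnd.google-apps.presentation": "application/pdf",
}


def pick_download_method(mime_type):
    """Return ("export_media", export_mime) or ("get_media", None)."""
    export_mime = EXPORT_FORMATS.get(mime_type)
    if export_mime is not None:
        return "export_media", export_mime
    return "get_media", None


print(pick_download_method("application/vnd.google-apps.document"))
print(pick_download_method("image/png"))
```

The file's MIME type can be read beforehand with `files().get(fileId=..., fields='mimeType')`.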


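Important:

Also check that the scope you are using actually allows what you want to do (downloading files from a user's Drive) and set it accordingly. At the moment, the scope you are targeting is incorrect. See OAuth 2.0 Scopes for Google APIs.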
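
One way to catch a scope problem early is to compare the scopes the credentials were granted with the ones that allow reading file content. The sketch below is only an illustration: `can_download` is a hypothetical helper, and the set lists just the two broad Drive scopes. Narrower scopes such as `drive.file` allow downloads only for some files and are left out.

```python
# Scopes assumed here to permit reading the content of any file in a Drive.
READ_SCOPES = {
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/drive.readonly",
}


def can_download(granted_scopes):
    """Return True if any granted scope is one of the broad read scopes."""
    return any(scope in READ_SCOPES for scope in granted_scopes)


# The scope used in the question does not cover file downloads.
print(can_download(["https://www.googleapis.com/auth/drive.scripts"]))   # False
print(can_download(["https://www.googleapis.com/auth/drive.readonly"]))  # True
```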


Sample code for reference:

  1. Build the Drive service:
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build


class Auth:

    def __init__(self, client_secret_filename, scopes):
        self.client_secret = client_secret_filename
        self.scopes = scopes
        self.creds = None

    def get_credentials(self):
        # Run the installed-app OAuth flow: this opens a browser for consent
        # and listens on http://localhost:8080/ for the redirect.
        flow = InstalledAppFlow.from_client_secrets_file(self.client_secret, self.scopes)
        self.creds = flow.run_local_server(port=8080)
        return self.creds


# The scopes your app will use.
# (They NEED to be among those enabled in your OAuth consent screen.)
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
CLIENT_SECRET_FILE = "credentials.json"

credentials = Auth(client_secret_filename=CLIENT_SECRET_FILE, scopes=SCOPES).get_credentials()

drive_service = build('drive', 'v3', credentials=credentials)
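
The question also asks about the redirect URL. `run_local_server(port=8080)` listens on `http://localhost:8080/`, so the OAuth client has to permit that redirect. A small sketch for checking what a downloaded client secrets file declares, assuming the usual layout with a top-level `installed` (Desktop app) or `web` key; `describe_client` is a hypothetical helper and the sample data is made up:

```python
import json


def describe_client(client_secrets):
    """Return (client_type, redirect_uris) from parsed client secrets."""
    for client_type in ("installed", "web"):
        if client_type in client_secrets:
            uris = client_secrets[client_type].get("redirect_uris", [])
            return client_type, list(uris)
    raise ValueError("Unrecognized client secrets layout")


# Inline sample data; in practice the dict would be loaded from the
# downloaded credentials.json file.
sample = json.loads(
    '{"installed": {"client_id": "example", "redirect_uris": ["http://localhost"]}}'
)
print(describe_client(sample))
```

For a `web` client, the exact URI generally has to be listed among the authorized redirect URIs in the console. Desktop app clients are more lenient about loopback addresses.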
  2. Make the request to export or get the file:
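import io
import shutil

from googleapiclient.http import MediaIoBaseDownload

# File ID taken from the question; replace it with your own.
file_id = '1Af6vN0uXj8_qgqac6f23QSAiKYCTu9cA'

# export_media() is for Google Workspace files (Docs, Sheets, Slides);
# use get_media(fileId=file_id) for files stored as-is.
request = drive_service.files().export_media(fileId=file_id, mimeType='application/pdf')

fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while not done:
    status, done = downloader.next_chunk()
    print("Download %d%%" % int(status.progress() * 100))

# The file has been downloaded into RAM, now save it in a file
fh.seek(0)
with open('your_filename.pdf', 'wb') as f:
    shutil.copyfileobj(fh, f, length=131072)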
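
For large files, buffering everything in a `BytesIO` object may be wasteful. A possible variant writes chunks straight to disk, since `MediaIoBaseDownload` accepts a writable binary file object. In this sketch, `download_to_path` and its `downloader_cls` parameter are hypothetical; the parameter exists only so the loop can be exercised without network access.

```python
def download_to_path(request, path, downloader_cls=None):
    """Stream a Drive media request into the file at `path`."""
    if downloader_cls is None:
        # Imported lazily so the function can be tried with a stand-in class.
        from googleapiclient.http import MediaIoBaseDownload
        downloader_cls = MediaIoBaseDownload

    with open(path, "wb") as f:
        downloader = downloader_cls(f, request)
        done = False
        while not done:
            # Each call fetches one chunk and writes it to the open file.
            status, done = downloader.next_chunk()
            print("Download %d%%" % int(status.progress() * 100))
    return path
```

With a request built as above, a call might look like `download_to_path(request, 'your_filename.pdf')`.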

0 votes

I usually use two files for modularity:

gdrive_credentials.py

from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
import os

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/drive.readonly'] # Or drive if you need write access

def get_credentials(credentials_json = "credentials.json", token_json = "token.json"):
    """Gets or creates Google Drive API credentials.

    Args:
        credentials_json: The filename of the credentials file.
        token_json: The filename of the token file

    Returns:
        A Credentials object, or None if an error occurred.
    """
    creds = None

    if os.path.exists(token_json):
        creds = Credentials.from_authorized_user_file(token_json, SCOPES)

    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            try:
                creds.refresh(Request())
            except Exception as e: # Catch exceptions during refresh
                print(f"Error refreshing credentials: {e}")
                return None
        else:
            if not os.path.exists(credentials_json):
                print(f"Credentials file '{credentials_json}' not found. Please download it from Google Cloud Console.")
                return None
            flow = InstalledAppFlow.from_client_secrets_file(credentials_json, SCOPES)
            try:
                creds = flow.run_local_server(port=0)
            except Exception as e: # Catch exceptions during auth flow
                print(f"Error during authorization flow: {e}")
                return None
        with open(token_json, 'w') as token_file:
            token_file.write(creds.to_json())
    return creds

if __name__ == '__main__':
    _ = get_credentials()
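
Because `token.json` holds a refresh token, it may be worth creating it with owner-only permissions. The sketch below is one way to do that on POSIX systems; `save_token` is a hypothetical helper, and on Windows the mode argument has little effect.

```python
import os


def save_token(path, token_json):
    """Write token JSON to `path`, creating the file with mode 0o600."""
    # The mode only applies when the file is newly created; an existing
    # file keeps its current permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as token_file:
        token_file.write(token_json)
```

Inside `get_credentials`, the final write could then become `save_token(token_json, creds.to_json())`.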

gdrive_wget.py

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from googleapiclient.http import MediaIoBaseDownload
import io
import shutil

def download_file_from_drive(file_id, output_filename, drive_service):
    try:
        file_metadata = drive_service.files().get(fileId=file_id, fields='mimeType, name').execute()
        mime_type = file_metadata.get('mimeType')
        file_name = file_metadata.get('name')
        print(f"Downloading file: {file_name} (Mime Type: {mime_type})")

        if mime_type == 'application/vnd.google-apps.document':
            request = drive_service.files().export(fileId=file_id, mimeType='text/plain')
        elif mime_type == 'application/vnd.google-apps.spreadsheet':
            request = drive_service.files().export(fileId=file_id, mimeType='text/csv') # Example for Sheets
        elif mime_type == 'application/vnd.google-apps.presentation':
            request = drive_service.files().export(fileId=file_id, mimeType='application/pdf') # Example for Slides
        else:
            request = drive_service.files().get_media(fileId=file_id)

        # Download the file into RAM
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while done is False:
            status, done = downloader.next_chunk()
            print("Download %d%%" % int(status.progress() * 100))

        # The file has been downloaded into RAM, now save it in a file
        fh.seek(0)
        with open(output_filename, 'wb') as f:
            shutil.copyfileobj(fh, f, length=131072)

    except HttpError as error:
        print(f'An HTTP error occurred: {error}')
    except Exception as e:
        print(f'A general error occurred: {e}')
    return None

if __name__ == '__main__': 
    import gdrive_credentials
    credentials = gdrive_credentials.get_credentials()

    drive_service = build('drive', 'v3', credentials=credentials)

    import sys
    file_id = sys.argv[1]
    output_target = sys.argv[2]

    download_file_from_drive(file_id, output_target, drive_service)
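
Indexing `sys.argv` directly raises an `IndexError` when an argument is missing. A small `argparse` sketch could replace it; the function name `parse_args` is an illustrative assumption, and the argument names mirror the variables used above.

```python
import argparse


def parse_args(argv=None):
    """Parse the file ID and output path from the command line."""
    parser = argparse.ArgumentParser(
        description="Download a file from Google Drive by ID."
    )
    parser.add_argument("file_id", help="ID of the Drive file to download")
    parser.add_argument("output_target", help="Local path to write the file to")
    return parser.parse_args(argv)


# Example with explicit arguments; pass no argument to read sys.argv instead.
args = parse_args(["1Af6vN0uXj8_qgqac6f23QSAiKYCTu9cA", "report.pdf"])
print(args.file_id, args.output_target)
```

With this in place, a missing argument produces a usage message instead of a traceback.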