无法将文件读取流从Google Cloud Storage传输到Google Drive API

问题描述 投票:3回答:1

我正在一个项目中,正在从Google云端硬盘读取使用SimpleMind创建的思维导图文件,修改这些文件,然后将其上传回Google云端硬盘。

SimpleMind创建的SMMX文件是包含XML文件和媒体文件的zip文件。

我的程序在本地运行时运行良好,对思维导图所做的更改显示在SimpleMind中。

我现在想使用App Engine在Google Cloud Platform上运行程序。

由于安全限制,我不能只是将从Google云端硬盘下载的文件写入云中的应用服务器的文件系统。相反,我创建了一个存储桶来存储文件。

但是,当我运行程序后,文件被损坏,而不是zip文件内容,而是JSON文件,显然是读取流的字符串表示形式。

本地运行–运行中

这是我的代码的简化版本,没有实际修改zip文件,我将其省略,因为它与问题以及任何错误处理均无关紧要-永远不会有任何错误。

在本地运行代码时,我使用写入流和读取流将文件保存并加载到本地文件系统上:

#!/usr/bin/env node

const { readFileSync, createReadStream, createWriteStream } = require('fs');
const { google } = require('googleapis');

const tokenPath = 'google-drive-token.json';
const clientId = 'xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com';
const redirectUri = 'urn:ietf:wg:oauth:2.0:oob';
const clientSecret = 'xxxxxxxxxxxxxxxxxxxxxxxx';
const fileId = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
const fileName = 'deleteme.smmx';

(async () => {
  const auth = new google.auth.OAuth2(clientId, clientSecret, redirectUri);
  const token = JSON.parse(readFileSync(tokenPath));
  auth.setCredentials(token);
  const writeStream = createWriteStream(fileName);
  const drive = google.drive({ version: 'v3', auth });
  let progress = 0;
  const res = await drive.files.get({ fileId, alt: 'media' }, { responseType: 'stream' });
  await new Promise(resolve => {
    res.data.on('data', d => (progress += d.length)).pipe(writeStream);
    writeStream.on('finish', () => {
      console.log(`Done downloading file ${fileName} from Google Drive to local file system (${progress} bytes)`);
      resolve();
    });
  });
  const readStream = createReadStream(fileName);
  progress = 0;
  const media = {
    mimeType: 'application/x-zip',
    body: readStream
      .on('data', d => {
        progress += d.length;
      })
      .on('end', () => console.log(`${progress} bytes read from local file system`))
  };
  await drive.files.update({
    fileId,
    media
  });
  console.log(`File ${fileName} successfully uploaded to Google Drive`);
})();

当我在本地运行此脚本时,它运行良好,程序输出始终为:

已完成将文件deleteme.smmx从Google云端硬盘下载到本地文件系统(371字节)

从本地文件系统读取的371字节

文件deleteme.smmx已成功上传到Google云端硬盘

我可以运行任意次,每次都会在Google云端硬盘上创建文件的新版本,每个文件的大小为371字节。

在Google云中运行-无法运行

这里是上面脚本的一个版本,我用来尝试执行相同的操作,在App Engine上运行的Google Cloud中从Google Drive下载文件并将文件上传到Google Drive,]

const { readFileSync } = require('fs');
const { google } = require('googleapis');
const { Storage } = require('@google-cloud/storage');

const tokenPath = 'google-drive-token.json';
const clientId = 'xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com';
const redirectUri = 'urn:ietf:wg:oauth:2.0:oob';
const clientSecret = 'xxxxxxxxxxxxxxxxxxxxxxxx';
const fileId = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
const fileName = 'deleteme.smmx';
const storageBucketId = 'xxxxxxxxxxx';

module.exports = async () => {
  const auth = new google.auth.OAuth2(clientId, clientSecret, redirectUri);
  const token = JSON.parse(readFileSync(tokenPath));
  auth.setCredentials(token);
  const storage = new Storage();
  const bucket = storage.bucket(storageBucketId);
  const file = bucket.file(fileName);
  const writeStream = file.createWriteStream({ resumable: false });
  const drive = google.drive({ version: 'v3', auth });
  let progress = 0;
  const res = await drive.files.get({ fileId, alt: 'media' }, { responseType: 'stream' });
  await new Promise(resolve => {
    res.data.on('data', d => (progress += d.length)).pipe(writeStream);
    writeStream.on('finish', () => {
      console.log(`Done downloading file ${fileName} from Google Drive to Cloud bucket (${progress} bytes)`);
      resolve();
    });
  });
  const readStream = file.createReadStream();
  progress = 0;
  const media = {
    mimeType: 'application/x-zip',
    body: readStream
      .on('data', d => {
        progress += d.length;
      })
      .on('end', () => console.log(`${progress} bytes read from storage`))
  };
  await drive.files.update({
    fileId,
    media
  });
  console.log(`File ${fileName} successfully uploaded to Google Drive`);
  return 0;
};

这里唯一的区别是,我不是使用Node.js createWriteStream模块中的createReadStreamfs,而是使用Google Cloud Storage库中的相应方法file.createWriteStreamfile.createReadStream。] >

[当我第一次在云中的App Engine上运行此代码时,一切似乎都正常,输出与在本地运行时相同:

已完成将文件deleteme.smmx从Google云端硬盘下载到云存储桶(371字节)

从存储中读取的371字节

文件deleteme.smmx已成功上传到Google云端硬盘

但是,当我在Google云端硬盘网络前端上查看文件的最新版本时,它不再是我的smmx文件,而是一个JSON文件,它看起来像读取流的字符串表示形式:

{
  "_readableState": {
    "objectMode": false,
    "highWaterMark": 16384,
    "buffer": { "head": null, "tail": null, "length": 0 },
    "length": 0,
    "pipes": null,
    "pipesCount": 0,
    "flowing": true,
    "ended": false,
    "endEmitted": false,
    "reading": false,
    "sync": false,
    "needReadable": true,
    "emittedReadable": false,
    "readableListening": false,
    "resumeScheduled": true,
    "paused": false,
    "emitClose": true,
    "destroyed": false,
    "defaultEncoding": "utf8",
    "awaitDrain": 0,
    "readingMore": false,
    "decoder": null,
    "encoding": null
  },
  "readable": true,
  "_events": {},
  "_eventsCount": 4,
  "_writableState": {
    "objectMode": false,
    "highWaterMark": 16384,
    "finalCalled": false,
    "needDrain": false,
    "ending": false,
    "ended": false,
    "finished": false,
    "destroyed": false,
    "decodeStrings": true,
    "defaultEncoding": "utf8",
    "length": 0,
    "writing": false,
    "corked": 0,
    "sync": true,
    "bufferProcessing": false,
    "writecb": null,
    "writelen": 0,
    "bufferedRequest": null,
    "lastBufferedRequest": null,
    "pendingcb": 0,
    "prefinished": false,
    "errorEmitted": false,
    "emitClose": true,
    "bufferedRequestCount": 0,
    "corkedRequestsFree": { "next": null, "entry": null }
  },
  "writable": true,
  "allowHalfOpen": true,
  "_transformState": {
    "needTransform": false,
    "transforming": false,
    "writecb": null,
    "writechunk": null,
    "writeencoding": null
  },
  "_destroyed": false
}

[似乎从云端存储桶将读取流传输到写入流以上传到Google云端硬盘,但无法按照我希望的方式工作。

我在做什么错?我需要更改什么才能使我的代码在云中正确运行?

如果您有兴趣,请输入full source code of my project can be found on GitHub

更新:解决方法

我已经找到解决此问题的方法:

  • 将数据从云存储桶的读取流中读取到缓冲区中
  • 从此缓冲区as described in this tutorial创建可读流
  • 将此“缓冲流”传递给drive.files.update方法
  • 这样,Google云端硬盘上的zip文件不会损坏,新版本的存储内容与以前相同,符合预期。

    但是,我觉得这很丑。对于大型思维导图文件,例如其中包含许多图像的文件将给服务器造成压力,因为文件的全部内容都必须存储在内存中。

我希望使从云存储桶到Google Drive API的直接管道有效。

我正在研究一个项目,该项目是从Google云端硬盘中读取使用SimpleMind创建的思维导图文件,修改这些文件,然后将其上传回Google云端硬盘。由...

node.js google-app-engine google-cloud-platform google-drive-api
1个回答
© www.soinside.com 2019 - 2024. All rights reserved.