How to handle the response body of Workspace documents exported as HTML (MIME type application/zip) from a Google Drive API batch request

Question

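With each Google Drive batch request, I am able to export 100 Google Workspace documents to my server in plain-text format. I followed Kanshi Tanaike's excellent example, "Efficient File Management using Batch Requests with Google Apps Script".

However, for Google Docs exported as HTML (rather than plain text), I don't know how to process the response body. (Incidentally, for HTML export the MIME type is application/zip.)

I am hoping someone can provide some basic information on how to handle the response body. Or is there an example in Google Apps Script or Elixir that I could follow?

I tried splitting the response body into the 100 parts, one per request, using the batch boundary marker (as I did for the plain-text case). What I am left with is presumably either one or 100 zip files, but everything I have tried to unzip them has given an error. I think I am splitting response.body incorrectly. I am not an experienced programmer and have no experience working with zip files. I even tried opening the split response.body with an unzip utility, without success.

Note that I can process the response.body of a single Workspace document exported as HTML, outside a batch request, using: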

url = "https://www.googleapis.com/drive/v3/files/#{doc_id}/export?mimeType=application/zip"

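This works because the response is very clearly a set of tuples: {item_name, text_string} (for the HTML part of the document, which can be processed directly) or {item_name, byte_sequence} (for example, an image in the document).

In fact, I am at the moment only interested in the HTML part rather than images, even for batch exports. (Is there a way to export only the HTML in a batch request?)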

Answer 1

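I believe your goal is as follows.

You want to export multiple Google Docs as `application/zip` using Google Apps Script and a batch request.

Modification points:

At the current stage, the response value of a batch request looks like the following: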

  --batch_###
  Content-Type: application/http
  Content-ID: response-1

  HTTP/1.1 200 OK
  Content-Disposition: attachment
  Content-Type: application/zip
  Date: ###
  Expires: ###
  Cache-Control: private, max-age=0
  Content-Length: 1000

  ### data ###
  --batch_###--
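
As a side note on the `--batch_###` delimiter shown above: in a multipart response, the boundary token is normally also announced in the `boundary` parameter of the response's own `Content-Type` header. The following is only a minimal sketch of reading that parameter; the function name `boundaryFromContentType` and the sample header value are assumptions made for illustration, and whether your HTTP client exposes the header this way should be checked.

```javascript
// Extract the multipart boundary token from a Content-Type header value.
// Returns null when no boundary parameter is present.
// The function name is hypothetical, not part of any Google API.
function boundaryFromContentType(contentType) {
  const m = /boundary=(?:"([^"]+)"|([^;\s]+))/i.exec(contentType || "");
  return m ? (m[1] || m[2]) : null;
}

// Sample header value (made up for this example).
const header = "multipart/mixed; boundary=batch_abc123";

// Each part in the body is introduced by "--" followed by the token.
const delimiter = "--" + boundaryFromContentType(header);
console.log(delimiter); // --batch_abc123
```

Knowing the delimiter up front avoids decoding the whole body as text just to discover it.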

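I think `### data ###` is the zip file as binary data. In this case, when the value is retrieved with `res.getContentText()`, the response is converted to a string. As a result, the binary data is corrupted on retrieval. I think this is the reason for your current issue, "but everything I have tried to unzip them has given an error."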

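To decode the retrieved data correctly, the response needs to be processed at the binary level, that is, handled as a byte array.

In this answer, I would like to propose a simple sample script for decoding the response data from a batch request (in this case, Google Docs files exported as `application/zip`).

Sample script:

Please copy and paste the following script into the script editor of a Google Apps Script project, and set your folder ID and document IDs.

Also, please enable the Drive API in Advanced Google services.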
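
To illustrate why text decoding damages binary data, here is a small sketch that round-trips bytes through a UTF-8 string. It uses Node.js `Buffer`, which is an assumption for the sake of a runnable demonstration (Apps Script has no `Buffer`), and the helper name `survivesTextRoundTrip` is hypothetical. The idea is the same in any environment: bytes that are not valid UTF-8 are replaced during decoding and cannot be recovered.

```javascript
// Returns true when the bytes are unchanged after being decoded to a
// UTF-8 string and encoded back to bytes.
function survivesTextRoundTrip(bytes) {
  const original = Buffer.from(bytes);
  const roundTripped = Buffer.from(original.toString("utf8"), "utf8");
  return original.equals(roundTripped);
}

// Plain ASCII text survives the round trip.
console.log(survivesTextRoundTrip([0x48, 0x54, 0x4d, 0x4c])); // true

// A zip signature ("PK", 3, 4) followed by bytes that are invalid UTF-8 does not.
console.log(survivesTextRoundTrip([0x50, 0x4b, 0x03, 0x04, 0xff, 0xfe, 0x80])); // false
```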

/**
 * Ref: https://tanaikech.github.io/2023/03/08/split-binary-data-with-search-data-using-google-apps-script/
 * Split byteArray by a search data.
 * @param {Array} baseData Input byteArray of base data.
 * @param {Array} searchData Input byteArray of search data using split.
 * @return {Array} An array including byteArray.
 */
function splitByteArrayBySearchData_(baseData, searchData) {
  if (!Array.isArray(baseData) || !Array.isArray(searchData)) {
    throw new Error("Please give byte array.");
  }
  const search = searchData.join("");
  const bLen = searchData.length;
  const res = [];
  let idx = 0;
  do {
    idx = baseData.findIndex((_, i, a) => [...Array(bLen)].map((_, j) => a[j + i]).join("") == search);
    if (idx != -1) {
      res.push(baseData.splice(0, idx));
      baseData.splice(0, bLen);
    } else {
      res.push(baseData.splice(0));
    }
  } while (idx != -1);
  return res;
}

/**
 * Ref: https://cloud.google.com/blog/topics/developers-practitioners/efficient-file-management-using-batch-requests-google-apps-script
 * Create a request body of batch requests and request it.
 * 
 * @param {Object} object Object for creating request body of batch requests.
 * @returns {Object} UrlFetchApp.HTTPResponse
 */
function batchRequests_(object) {
  const { batchPath, requests } = object;
  const boundary = "sampleBoundary12345";
  const lb = "\r\n";
  const payload = requests.reduce((r, e, i, a) => {
    r += `Content-Type: application/http${lb}`;
    r += `Content-ID: ${i + 1}${lb}${lb}`;
    r += `${e.method} ${e.endpoint}${lb}`;
    r += e.requestBody ? `Content-Type: application/json; charset=utf-8${lb}${lb}` : lb;
    r += e.requestBody ? `${JSON.stringify(e.requestBody)}${lb}` : "";
    r += `--${boundary}${i == a.length - 1 ? "--" : ""}${lb}`;
    return r;
  }, `--${boundary}${lb}`);
  const params = {
    muteHttpExceptions: true,
    method: "post",
    contentType: `multipart/mixed; boundary=${boundary}`,
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
    payload,
  };
  return UrlFetchApp.fetch(`https://www.googleapis.com/${batchPath}`, params);
}

// Please run this function.
function main() {
  const folderId = "###"; // Please set folder ID you want to put the files.
  // Please set your document Ids.
  const documentIds = [
    "### Document ID1 ###",
    "### Document ID2 ###",
    "### Document ID3 ###",
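    // ...add more document IDs here. Do not leave empty slots in the array:
    // holes are skipped by reduce(), so the closing boundary would never be written.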
  ];

  // Run batch requests.
  const requests = documentIds.map((id) => ({
    method: "GET",
    endpoint: `https://www.googleapis.com/drive/v3/files/${id}/export?mimeType=application/zip`,
  }));
  const object = { batchPath: "batch/drive/v3", requests };
  const res = batchRequests_(object);
  if (res.getResponseCode() != 200) {
    throw new Error(res.getContentText());
  }

  // Parse data as binary data, and create the data as Blob.
  const check = res.getContentText().match(/--batch.*/);
  if (!check) {
    throw new Error("Valid response value is not returned.");
  }
  const search = check[0];
  const baseData = res.getContent();
  const searchData = Utilities.newBlob(search).getBytes();
  const res1 = splitByteArrayBySearchData_(baseData, searchData);
  res1.shift();
  res1.pop();
  const blobs = res1.map((e, i) => {
    const rrr = splitByteArrayBySearchData_(e, [13, 10, 13, 10]);
    const data = rrr.pop();
    const metadata = Utilities.newBlob(rrr.flat()).getDataAsString();
    const dataSize = Number(metadata.match(/Content-Length:(.*)/)[1]);
    return Utilities.newBlob(data.splice(0, dataSize)).setName(`sampleName${i + 1}.zip`);
  });

  // Create blobs as the files in Google Drive.
  const folder = DriveApp.getFolderById(folderId);
  blobs.forEach(b => {
    if (b) {
      console.log({ filename: b.getName(), fileSize: b.getBytes().length })
      folder.createFile(b);
    }
  });
}
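
One caveat about the script above, as far as I can tell: each part is split on every occurrence of the bytes 13, 10, 13, 10 and only the last fragment is kept as the body, so a zip payload that happens to contain that byte sequence could be truncated. The following is only a sketch of a more defensive alternative that locates the first two blank lines and then cuts the body by `Content-Length`. The helper names `indexOfBytes`, `extractPartBody`, and `toBytes` are hypothetical, and the sample part is made up for the demonstration.

```javascript
// Find the first index >= from where `needle` occurs in `bytes`, or -1.
function indexOfBytes(bytes, needle, from = 0) {
  outer: for (let i = from; i <= bytes.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (bytes[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// One batch part is: outer headers, blank line, inner status line and
// headers, blank line, body. Only the first two blank lines are used.
function extractPartBody(part) {
  const CRLF2 = [13, 10, 13, 10];
  const outerEnd = indexOfBytes(part, CRLF2);
  if (outerEnd === -1) throw new Error("End of outer headers not found.");
  const innerEnd = indexOfBytes(part, CRLF2, outerEnd + 4);
  if (innerEnd === -1) throw new Error("End of inner headers not found.");
  // "& 255" maps signed bytes (as Apps Script returns them) to 0..255.
  const headers = String.fromCharCode(
    ...part.slice(outerEnd + 4, innerEnd).map((b) => b & 255)
  );
  const m = /Content-Length:\s*(\d+)/i.exec(headers);
  const start = innerEnd + 4;
  const end = m ? start + Number(m[1]) : part.length;
  return { headers, body: part.slice(start, end) };
}

// Demonstration with a body that itself contains 13, 10, 13, 10.
const toBytes = (s) => Array.from(s, (c) => c.charCodeAt(0));
const body = [80, 75, 3, 4, 13, 10, 13, 10, 255, 0];
const part = [
  ...toBytes("\r\nContent-Type: application/http\r\nContent-ID: response-1\r\n\r\n"),
  ...toBytes("HTTP/1.1 200 OK\r\nContent-Type: application/zip\r\nContent-Length: " + body.length + "\r\n\r\n"),
  ...body,
  ...toBytes("\r\n"),
];
console.log(extractPartBody(part).body.length); // 10
```

In Apps Script, the returned `body` array could then be passed to `Utilities.newBlob` in the same way the script above does.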

When this script is run, zip files containing the HTML data converted from the Google Docs are created in the folder. The sample file names are `sampleName1.zip`, `sampleName2.zip`, `sampleName3.zip`, and so on.

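Note:

Important: I am not sure whether this method can be used for a batch of 100 requests, because an error may occur when the response size exceeds 50 MB. I am worried about this. So, when you test this script, please use a small number of sample Google Docs.
I just noticed "I am at the moment only interested in the HTML part rather than images". As another approach, when `mimeType=application/zip` is changed to `mimeType=text/html`, it seems that only the HTML data is included in the response as a string. In that case, the response data can be parsed as a string.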
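
For the `text/html` case, where every part is text, the whole response can be split as a string. The following is only a sketch under that assumption. The function name `parseTextBatchResponse` and the sample response are made up for illustration, and in Apps Script the input would be the value of `res.getContentText()`.

```javascript
// Parse a multipart batch response whose parts contain only text.
// `text` is the whole response body; returns an array of { status, body }.
function parseTextBatchResponse(text) {
  const m = /^--[^\r\n]+/m.exec(text);
  if (!m) throw new Error("No boundary found.");
  const delimiter = m[0];
  return text
    .split(delimiter)
    .slice(1, -1) // drop the preamble and the trailing "--" terminator
    .map((part) => {
      // sections[0]: outer part headers
      // sections[1]: inner status line and headers
      // the rest: the body (rejoined in case the HTML contains blank lines)
      const sections = part.split("\r\n\r\n");
      const statusMatch = /^HTTP\/[\d.]+ (\d{3})/m.exec(sections[1] || "");
      const status = Number((statusMatch || [])[1]);
      const body = sections.slice(2).join("\r\n\r\n").replace(/\r\n$/, "");
      return { status, body };
    });
}

// Made-up sample response with two parts.
const sample =
  "--batch_x\r\nContent-Type: application/http\r\nContent-ID: response-1\r\n\r\n" +
  "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>A</html>\r\n" +
  "--batch_x\r\nContent-Type: application/http\r\nContent-ID: response-2\r\n\r\n" +
  "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>B</html>\r\n" +
  "--batch_x--\r\n";

console.log(parseTextBatchResponse(sample).map((p) => p.body));
// [ '<html>A</html>', '<html>B</html>' ]
```

Checking `status` per part is worthwhile because a batch can return 200 overall while individual exports fail.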

References:

Efficient File Management using Batch Requests with Google Apps Script (Author: me)
