How to use .findElement(DocumentApp.ElementType.TABLE_OF_CONTENTS) to get and parse a document's table of contents element

Question

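My goal is to parse the TableOfContents element in a Google Document and write it to another one. I want to do this for every document in a folder. I went to the trouble of converting each document to the type generated by DocsList just so I could use this method. (A document generated by DocumentApp does not have it. I don't understand why, because otherwise the two kinds of "document" are similar when it comes to finding parts.) What I get back is a SearchResult. How is this elusive construction used? I tried converting it to a TableOfContents element [ele = searchResult.asTableOfContents()], which does not raise an error, but nothing I do lets me walk its child elements to recover their text. Interestingly, if you reach the TableOfContents element by iterating through the document's paragraphs instead, you can parse the TOC.

Could someone speak to this question? I would really appreciate a code snippet, because I'm getting nowhere and have already put several hours into this.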

Answer 1

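The asTableOfContents() method exists only to help the editor's autocomplete. It has no effect at run time and cannot be used to cast to a different type. (See the ContainerElement documentation.)

To parse the table of contents, first retrieve the element from the SearchResult. Below is an example that goes through the items in a document's table of contents and produces an array of item information.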

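Sample document

[Screenshot of the sample document]

Parsing result

In a simple document with a few headings and a table of contents, this is what it produced: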

[13-08-20 16:31:56:415 EDT] 
[
  {text=Heading 1.0, linkUrl=#heading=h.50tkhklducwk, indentFirstLine=18.0, indentStart=18.0},
  {text=Heading 1.1, linkUrl=#heading=h.ugj69zpoikat, indentFirstLine=36.0, indentStart=36.0},
  {text=Heading 1.2, linkUrl=#heading=h.xb0y0mu59rag, indentFirstLine=36.0, indentStart=36.0},
  {text=Heading 2.0, linkUrl=#heading=h.gebx44eft4kq, indentFirstLine=18.0, indentStart=18.0}
]

Code

function test_parseTOC() {
  var fileId = '--Doc-ID--';
  Logger.log( parseTOC( fileId ) );
}

function parseTOC( docId ) {
  var contents = [];
  var doc = DocumentApp.openById(docId);

  // Define the search parameters.
  var searchElement  = doc.getBody();
  var searchType = DocumentApp.ElementType.TABLE_OF_CONTENTS;

  // Search for TOC. Assume there's only one.
  var searchResult = searchElement.findElement(searchType);

  if (searchResult) {
    // TOC was found
    var toc = searchResult.getElement().asTableOfContents();

    // Parse all entries in TOC. The TOC contains child Paragraph elements,
    // and each of those has a child Text element. The attributes of both
    // the Paragraph and Text combine to make the TOC item functional.
    var numChildren = toc.getNumChildren();
    for (var i=0; i < numChildren; i++) {
      var itemInfo = {};
      var tocItem = toc.getChild(i).asParagraph();
      var tocItemAttrs = tocItem.getAttributes();
      var tocItemText = tocItem.getChild(0).asText();

      // Set itemInfo attributes for this TOC item, first from Paragraph
      itemInfo.text = tocItem.getText();                // Displayed text
      itemInfo.indentStart = tocItem.getIndentStart();  // TOC Indentation
      itemInfo.indentFirstLine = tocItem.getIndentFirstLine();
      // ... then from child Text
      itemInfo.linkUrl = tocItemText.getLinkUrl();      // URL Link in document
      contents.push(itemInfo);
    }
  }

  // Return array of objects containing TOC info
  return contents;
}
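
The code above assumes the document holds a single table of contents. As a minimal sketch for documents that might hold more than one, the two-argument form of findElement can resume the search from the previous SearchResult. The helper name findAllTocs is hypothetical and not part of the original answer.

```javascript
// Sketch: collect every table of contents in a body, not just the first.
// Assumes the Apps Script global DocumentApp is available;
// findAllTocs is a hypothetical helper name.
function findAllTocs(body) {
  var tocs = [];
  var type = DocumentApp.ElementType.TABLE_OF_CONTENTS;
  var result = body.findElement(type);
  while (result) {
    // Unwrap the SearchResult first; asTableOfContents() only helps autocomplete.
    tocs.push(result.getElement().asTableOfContents());
    // Passing the previous SearchResult resumes the search after it.
    result = body.findElement(type, result);
  }
  return tocs;
}
```

Each returned element can then be walked with getNumChildren() and getChild(i), exactly as parseTOC does for the first match.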

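Bad news

The bad news is that a script can do only limited things with a table of contents. You cannot insert a table of contents or add new items to an existing one.

See Issue 2502 in the issue tracker, and star it to receive updates.

If you can post your code or explain the problem you are having with DocsList versus DocumentApp, it could be looked at. The elements of a Google Document can only be manipulated through DocumentApp.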
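
For the "every document in a folder" part of the question, one possible approach is to list the folder's files with DriveApp and hand each file ID to a parser such as parseTOC above, which opens it with DocumentApp. This is only a sketch: parseFolder and its parameters are hypothetical names, and it assumes the Apps Script globals DriveApp and MimeType.

```javascript
// Sketch: run a TOC parser over every Google Doc in one Drive folder.
// Assumes the Apps Script globals DriveApp and MimeType;
// parseFolder, folderId and parseFn are hypothetical names.
function parseFolder(folderId, parseFn) {
  var results = {};
  var files = DriveApp.getFolderById(folderId)
      .getFilesByType(MimeType.GOOGLE_DOCS);
  while (files.hasNext()) {
    var file = files.next();
    // parseFn receives a document ID, e.g. parseTOC from the answer above.
    results[file.getId()] = {
      name: file.getName(),
      toc: parseFn(file.getId())
    };
  }
  return results;
}
```

A call such as parseFolder('--Folder-ID--', parseTOC) would return one entry per document, keyed by file ID.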

Answer 2

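I modified the code above to re-create the TOC inside a table, keeping only the desired heading levels (e.g. h1, h2). The only caveat is that the TOC must already exist and be up to date before you run this.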

function findToc(body, level = 2) {
  const indent = 18;
  let contents = [];

  const tocType = DocumentApp.ElementType.TABLE_OF_CONTENTS;
  const tocContainer = body.findElement(tocType);

  if (tocContainer) {
    // TOC was found
    const toc = tocContainer.getElement().asTableOfContents();
    const totalLines = toc.getNumChildren();

    for (let lineIndex = 0; lineIndex < totalLines; lineIndex++) {
      const tocItem = toc.getChild(lineIndex).asParagraph();
      const { INDENT_START } = tocItem.getAttributes();

      const isDesiredLevel = Number(INDENT_START) <= indent * (level - 1);

      if (isDesiredLevel) {
        contents.push(tocItem.copy());
      }
    }

  }

  return contents;
}

function addToTable(body, paragraphs) {
  // Append a one-cell table to the body and copy the TOC paragraphs into it.
  const table = body.appendTable();
  const tr = table.insertTableRow(0);
  const td = tr.insertTableCell(0);

  paragraphs.forEach(paragraph => {
    td.appendParagraph(paragraph);
  });
}

function parseTOC(docId) {
  const body = DocumentApp.openById(docId).getBody();
  const contents = findToc(body);
  addToTable(body, contents);
}
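
The level filter in findToc compares the raw indent against a fixed multiple of 18 points, so the result depends on how far the top-level entries are indented. In the sample output from Answer 1, the top-level entries sit at 18 and the next level at 36. As a hedged alternative, heading levels can be derived relative to the smallest indent found. The helper name relativeLevels and the default step of 18 points are assumptions taken from the values shown in this thread, not a documented rule.

```javascript
// Sketch: turn TOC indents (in points) into heading levels relative to the
// smallest indent. relativeLevels is a hypothetical helper; the 18-point
// default step is an assumption based on the sample output in this thread.
function relativeLevels(indents, step) {
  step = step || 18;
  // Treat a missing (null) indent as 0 points.
  var known = indents.map(function (v) { return Number(v) || 0; });
  var min = Math.min.apply(null, known);
  return known.map(function (v) {
    return Math.round((v - min) / step) + 1;
  });
}

// With the indentStart values from the sample output:
// relativeLevels([18, 36, 36, 18]) returns [1, 2, 2, 1]
```

Filtering on these relative levels instead of the raw indent would keep the same entries whether the top level starts at 0 or at 18 points.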
