Reading an epub with epubjs or epub-parser always returns empty

Question

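I've been trying to create a very simple web app that reads an epub and displays each chapter section by section. Simple enough. I've tried both epubjs and epub-parser, but with every .epub file I try, nothing useful comes back (errors, empty results, and so on). I ran the files through validators to make sure they are well formed, and everything checks out. When I open the epubs with WinZip, the contents look as they should.

I'm desperate because I don't understand what is going wrong, or why I can't simply read an epub.

Below are the two functions I'm using. The problem starts at item.load(), which does not return a proper value/object. My current error is: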

Error extracting chapter content: TypeError: rawContent?.trim is not a function

Clearly rawContent is not in the right format, but I don't know why or how to fix it.
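For context, optional chaining (`?.`) only short-circuits on `null` or `undefined`, so a value that is defined but not a string still reaches `.trim` and throws this TypeError. One way to see what the loader actually resolved with is to log its runtime type before using it. A minimal sketch; `describeLoaded` is a hypothetical helper name, not part of epubjs:

```typescript
// Hypothetical diagnostic helper: reports the runtime type of a loaded value.
function describeLoaded(value: unknown): string {
  if (value === null) return 'null';
  // Primitives (and undefined) are fully described by typeof.
  if (typeof value !== 'object') return typeof value;
  // For objects, the constructor name is usually more telling than 'object'.
  const ctor = (value as object).constructor;
  return ctor && ctor.name ? ctor.name : 'object';
}
```

Calling `logger.info(describeLoaded(content))` right after the load would show whether `content` is a string or some other kind of object.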

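I've also attached the logs for book, spine, and metadata, though I couldn't spot anything wrong in them.

Any help or suggestions would be greatly appreciated. Thanks!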

[Screenshot: logs of book, spine, and metadata]

export class EpubService {
  private static async parseEpubContent(arrayBuffer: ArrayBuffer): Promise<{
    chapters: Chapter[];
    title: string;
  }> {
    try {
      const book = ePub(arrayBuffer);
      await book.ready;

      const spine = await book.loaded.spine;
      const metadata = await book.loaded.metadata;

      logger.info('book: ', book);
      logger.info('spine: ', spine);
      logger.info('metadata: ', metadata);

      if (!spine || spine.length === 0) {
        throw new ProcessingError('No chapters found in EPUB');
      }

      logger.info('Processing EPUB with spine length:', spine.length);

      const chapters: Chapter[] = [];
      const maxChapters = Math.min(spine.length, 5);

      for (let i = 0; i < maxChapters; i++) {
        const item = spine.get(i);
        if (!item) {
          logger.warn(`No spine item found at index ${i}`);
          continue;
        }

        try {
          logger.info(`Processing chapter ${i + 1}/${maxChapters}`);

          const content = await item.load();
          if (!content) {
            logger.warn(`No content loaded for chapter ${i + 1}`);
            continue;
          }

          const { title, content: extractedContent } =
            extractChapterContent(content);

          chapters.push({
            id: item.idref || String(i + 1),
            title: title || `Chapter ${i + 1}`,
            content: extractedContent,
            summary: '',
            status: 'pending',
          });

          logger.info(`Successfully processed chapter ${i + 1}`);
        } catch (error) {
          logger.error(`Error processing chapter ${i}:`, error);
          // Continue with next chapter
          continue;
        }
      }

      if (chapters.length === 0) {
        throw new ProcessingError('No valid chapters found in EPUB');
      }

      logger.info(`Successfully extracted ${chapters.length} chapters`);

      return {
        chapters,
        title: metadata?.title || 'Untitled Book',
      };
    } catch (error) {
      logger.error('Error parsing EPUB content:', error);
      throw error instanceof ProcessingError
        ? error
        : new ProcessingError('Failed to parse EPUB content');
    }
  }

  static async processEpubFile(
    file: File,
    signal?: AbortSignal
  ): Promise<{ book: Book; cleanup: () => Promise<void> }> {
    let filePath: string | undefined;

    try {
      logger.info('Starting EPUB file processing');

      // Upload to Supabase
      filePath = await StorageService.uploadFile(file);
      logger.info('File uploaded to Supabase');

      // Download for processing
      const arrayBuffer = await StorageService.downloadFile(filePath);
      logger.info('File downloaded from Supabase');

      // Parse EPUB content
      const { chapters, title } = await this.parseEpubContent(arrayBuffer);

      const cleanup = async () => {
        if (filePath) {
          await StorageService.deleteFile(filePath).catch((error) => {
            logger.error('Error cleaning up file:', error);
          });
        }
      };

      return {
        book: { title, chapters },
        cleanup,
      };
    } catch (error) {
      // Clean up on error
      if (filePath) {
        await StorageService.deleteFile(filePath).catch((error) => {
          logger.error('Error cleaning up file:', error);
        });
      }

      logger.error('Error processing EPUB:', error);
      throw error instanceof ProcessingError
        ? error
        : new ProcessingError('Failed to process EPUB file');
    }
  }
}

export const extractChapterContent = (rawContent: string): ExtractedContent => {
  try {
    if (!rawContent?.trim()) {
      throw new ProcessingError('Empty raw content provided');
    }

    const parser = new DOMParser();
    const doc = parser.parseFromString(rawContent, 'text/html');

    // Check for parsing errors
    const parserError = doc.querySelector('parsererror');
    if (parserError) {
      throw new ProcessingError('Failed to parse HTML content');
    }

    const title = findTitle(doc);
    const content = findContent(doc);
    const cleanedContent = cleanContent(content);

    validateContent(cleanedContent);

    return {
      title: title || 'Untitled Chapter',
      content: cleanedContent,
    };
  } catch (error) {
    logger.error('Error extracting chapter content:', error);
    throw error instanceof ProcessingError
      ? error
      : new ProcessingError('Failed to extract chapter content');
  }
};

Answer 1

In the end, it looks like you need to call load() on the book itself, passing the section you want, like so: const content = await book.load(item.href);

Hope this helps someone.
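Since extractChapterContent expects a string, it may also be worth normalising whatever the loader resolves with before calling trim(). The sketch below assumes the loaded value could be a string, a Document-like object, or an Element-like object; that is an assumption about the loader's behaviour, not something confirmed here, and `toMarkup` is a hypothetical helper name.

```typescript
// Hypothetical helper: turn a loaded chapter into an HTML/XHTML string.
// Uses duck typing so it does not depend on DOM globals being present.
function toMarkup(loaded: unknown): string {
  if (typeof loaded === 'string') return loaded;

  if (loaded && typeof loaded === 'object') {
    const node = loaded as {
      outerHTML?: unknown;
      documentElement?: { outerHTML?: unknown } | null;
    };

    // Document-like value: serialise its root element.
    const root = node.documentElement;
    if (root && typeof root.outerHTML === 'string') return root.outerHTML;

    // Element-like value: serialise the element itself.
    if (typeof node.outerHTML === 'string') return node.outerHTML;
  }

  // Anything else is treated as empty so the caller can skip the chapter.
  return '';
}
```

The call site could then read `extractChapterContent(toMarkup(await book.load(item.href)))`, and an empty string falls through to the existing "Empty raw content provided" check.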
