Is there a way to count the pages of a PDF file uploaded with multer?

Question description  Votes: 0  Answers: 2

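I am working on a feature that limits the size of uploaded PDF files. That part works, but I also need to limit the number of pages in a PDF, and I cannot find any answer on how to do that. I am using NestJS, and so far my code looks like this: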

@UseInterceptors(
  FilesInterceptor('files', 10, {
    limits: { fileSize: 10242880 },
    fileFilter: (req, file, callback) => {
      if (!file.originalname.match(/\.(jpg|jpeg|png|pdf)$/)) {
        req.fileValidationError =
          'Invalid file type provided. Valid types: [jpg, jpeg, png, pdf]';
        //TODO: Add validation to check if the file is pdf, and if so, define a limit of pages for the file. (max * pages)
        callback(null, false);
      }

      const fileSize = parseInt(req.headers['content-length']);
      if (fileSize > 10242880) {
        req.fileValidationError = 'File size exceeds the maximum limit [10mb]';
        callback(null, false);
      }
      callback(null, true);
      if (!file.originalname.match(/\.(pdf)$/)) {
        //what to do here??
      }
    },
  }),
)

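The multer options available in the interface are listed below. Maybe one of them can give me this information. I would really appreciate any help. Thanks.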

export interface MulterOptions {
  dest?: string;
  /** The storage engine to use for uploaded files. */
  storage?: any;
  /**
   * An object specifying the size limits of the following optional properties. This object is passed to busboy
   * directly, and the details of properties can be found on https://github.com/mscdex/busboy#busboy-methods
   */
  limits?: {
    /** Max field name size (Default: 100 bytes) */
    fieldNameSize?: number;
    /** Max field value size (Default: 1MB) */
    fieldSize?: number;
    /** Max number of non-file fields (Default: Infinity) */
    fields?: number;
    /** For multipart forms, the max file size (in bytes) (Default: Infinity) */
    fileSize?: number;
    /** For multipart forms, the max number of file fields (Default: Infinity) */
    files?: number;
    /** For multipart forms, the max number of parts (fields + files) (Default: Infinity) */
    parts?: number;
    /** For multipart forms, the max number of header key => value pairs to parse. Default: 2000 (same as node's http). */
    headerPairs?: number;
  };
  /** Keep the full path of files instead of just the base name (Default: false) */
  preservePath?: boolean;
  fileFilter?(
    req: any,
    file: {
      /** Field name specified in the form */
      fieldname: string;
      /** Name of the file on the user's computer */
      originalname: string;
      /** Encoding type of the file */
      encoding: string;
      /** Mime type of the file */
      mimetype: string;
      /** Size of the file in bytes */
      size: number;
      /** The folder to which the file has been saved (DiskStorage) */
      destination: string;
      /** The name of the file within the destination (DiskStorage) */
      filename: string;
      /** Location of the uploaded file (DiskStorage) */
      path: string;
      /** A Buffer of the entire file (MemoryStorage) */
      buffer: Buffer;
    },
    callback: (error: Error | null, acceptFile: boolean) => void,
  ): void;
}

node.js typescript nestjs multer
2 Answers

2 votes

No, multer alone cannot count the number of pages in a PDF, because it is only a "multipart/form-data" parser.

You need a PDF parser to count the pages, such as pdf-lib; see the pdf-lib examples.


By the way, I have some suggestions for your multer validation code.

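  1. You should validate the file's MIME type (file.mimetype) rather than its file extension.
  2. Determining the file size from the content-length header is a really bad idea, because the end user can manipulate it (with curl or netcat), and it describes the whole request body rather than a single file. The file.size property is the value to rely on instead, although multer only fills it in once the file has been stored, so it is not yet set inside fileFilter. Since you have already set the limits.fileSize option, validating the file size in fileFilter is pointless anyway.
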
// MIME types matching the extensions the question accepts (jpg, jpeg, png, pdf)
const ALLOWED_MIME_TYPES = ['image/jpeg', 'image/png', 'application/pdf']
const INVALID_TYPE_ERROR_MSG = 'Invalid file type provided. Valid types: [jpg, jpeg, png, pdf]'

FilesInterceptor('files', 10, {
  limits: {
    fileSize: 10 * 1024 * 1024, // 10 MB in bytes
  },
  fileFilter: (req, file, callback) => {
    // multer exposes the MIME type as `file.mimetype`
    if (!ALLOWED_MIME_TYPES.includes(file.mimetype)) {
      req.fileValidationError = INVALID_TYPE_ERROR_MSG
      // callback(new Error(INVALID_TYPE_ERROR_MSG), false) is the right way to report errors, but I leave that to you
      callback(null, false)
      return
    }
    // No size check here: it is pointless since limits.fileSize is provided.

    // I recommend not counting pages here; do it inside the route handler instead.
    callback(null, true)
  },
})
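
The MIME type that multer reports is taken from the client's multipart headers, so it can be spoofed just like the extension. As a rough sketch of an extra check, assuming memory storage so that the file bytes are available in the route handler, one could verify that the data begins with the "%PDF-" signature. The name `looksLikePdf` is hypothetical and not part of multer or NestJS, and the check only inspects the first bytes; it does not prove the file is a valid PDF.

```typescript
// Sketch only: `looksLikePdf` is a hypothetical helper, not a multer or NestJS API.
// PDF files start with the ASCII signature "%PDF-".
const PDF_SIGNATURE: number[] = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

function looksLikePdf(data: Uint8Array): boolean {
  // Too short to even contain the signature
  if (data.length < PDF_SIGNATURE.length) {
    return false;
  }
  // Compare the leading bytes against the signature, one by one
  return PDF_SIGNATURE.every((byte, i) => data[i] === byte);
}
```

In a handler this could be combined with the MIME check, for example rejecting a file whose `mimetype` is 'application/pdf' but for which `looksLikePdf(file.buffer)` returns false. A Node `Buffer` is a `Uint8Array`, so it can be passed directly.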



0 votes

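The best solution I could come up with uses pdf-lib. I moved the page-limit check out of multer's fileFilter, because at that point the file has no buffer yet. As @bogdanoff suggested, the check lives in a separate function: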
// Assumes these imports at the top of the controller file:
//   import { PDFDocument } from 'pdf-lib';
//   import { BadRequestException } from '@nestjs/common';
try {
  for (const file of files) {
    const isPdf = file.mimetype === 'application/pdf';
    if (isPdf) {
      // PDFDocument is the class imported from pdf-lib; load() parses the in-memory buffer
      const pdf = await PDFDocument.load(file.buffer);
      const pagesCount = pdf.getPageCount();
      if (pagesCount > 5) {
        throw new BadRequestException({
          statusCode: 400,
          message: 'PDF file must have a maximum of 5 pages.',
        });
      }
    }
  }
} catch (error) {
  // Log the failure, then report it to the client as a 400
  this.commonLogger.error({
    error,
    serviceName: this.controllerName,
    method: 'createCredential',
    params: { vendorId, userId, credentialDto },
  });
  throw new BadRequestException({ statusCode: 400, message: error.message });
}
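
The check above could also be pulled into a small helper, so that the page limit and the PDF parser are not hard-coded in the controller. This is a minimal sketch rather than a definitive implementation: it assumes memory storage (so each file has a buffer), and `assertPdfPageLimit`, `PageCounter`, `UploadedFileLike` and `PageLimitExceededError` are hypothetical names. The page counter is passed in as a function, which keeps the helper independent of any particular PDF library.

```typescript
// Sketch only: every name in this block is hypothetical.
type PageCounter = (data: Uint8Array) => Promise<number>;

interface UploadedFileLike {
  originalname: string;
  mimetype: string;
  buffer: Uint8Array; // filled in by multer's memory storage
}

class PageLimitExceededError extends Error {
  constructor(fileName: string, pages: number, maxPages: number) {
    super(`${fileName} has ${pages} pages; the maximum is ${maxPages}.`);
    this.name = 'PageLimitExceededError';
  }
}

// Rejects with PageLimitExceededError if any PDF exceeds maxPages.
async function assertPdfPageLimit(
  files: UploadedFileLike[],
  maxPages: number,
  countPages: PageCounter,
): Promise<void> {
  for (const file of files) {
    // Images and other non-PDF uploads have no page count to check
    if (file.mimetype !== 'application/pdf') {
      continue;
    }
    const pages = await countPages(file.buffer);
    if (pages > maxPages) {
      throw new PageLimitExceededError(file.originalname, pages, maxPages);
    }
  }
}

// With pdf-lib, the counter could be written as:
//   const countWithPdfLib: PageCounter = async (data) =>
//     (await PDFDocument.load(data)).getPageCount();
```

In a NestJS handler, the call would be something like `await assertPdfPageLimit(files, 5, countWithPdfLib)`, with a `PageLimitExceededError` caught and rethrown as a `BadRequestException`. Injecting the counter also makes the limit logic easy to unit test with a stub.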
