Reduce spawn process execution time in Node.js from seconds to milliseconds

Question

I want to execute Python code from a Node.js application, so I used the following:

const { spawn } = require('node:child_process');
// Run inline Python code; naming the handle `child` avoids shadowing Node's global `process`.
const child = spawn('python', ['-c', "python code content will come here"]);
child.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

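It works as expected, but it takes a few seconds before the output is printed to stdout. Some of the Python scripts return a lot of data, which is why I read the result from the child's stdout stream.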

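My main concern now is the time. How can I reduce it? It should be in the range of milliseconds, but executing the code and printing the output takes seconds.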

Answer 1

Here are some tips for improving performance when running Python scripts from a Node.js application:

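Use a Python script instead of inline code: rather than passing the Python code inline (that is, in the `spawn` command), save it as a `.py` file and run that file with the Python interpreter. This can let Python apply its built-in optimizations when loading and running the script, which are not available when running inline code.
Use a faster Python interpreter: if you are using standard CPython, you could try a faster interpreter such as PyPy, which can often run Python code faster thanks to its just-in-time (JIT) compiler. You can read more about it here.
Parallelize your Python code: if your Python code is CPU-bound and the work can be split into independent tasks, you can use Python's multiprocessing module to run the code in parallel and make use of all CPU cores.
Use streams: if the Python script produces a lot of output, consider using streams to process the data as it arrives rather than waiting for the whole dataset to be generated.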
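
As a rough illustration of the first and last tips, the sketch below runs a command and handles its output one line at a time instead of waiting for everything to finish. The helper name `streamLines` and the file name `script.py` are assumptions made for this example; they do not come from the answer.

```javascript
const { spawn } = require('node:child_process');
const readline = require('node:readline');

// Run a command and hand each stdout line to `onLine` as soon as it arrives.
// Resolves with the child's exit code once the process and its streams have closed.
function streamLines(command, args, onLine) {
  return new Promise((resolve, reject) => {
    // stdin is not needed here; stderr is passed straight through to the parent.
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });
    const lines = readline.createInterface({ input: child.stdout });
    lines.on('line', onLine);
    child.on('error', reject); // e.g. the interpreter is not on PATH
    child.on('close', (code) => resolve(code));
  });
}

// Assumed usage: `script.py` is a hypothetical file; `-u` asks Python not to buffer stdout.
// streamLines('python', ['-u', 'script.py'], (line) => console.log(line));
```

Note that streaming mainly helps when the script produces output gradually. It does not shorten interpreter start-up or import time.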

Answer 2

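I ran into the same problem recently: I have a trained model with Python inference, plus a Node.js application that uses the model and acts as an API. The Python process recognises image content.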

I realised it could be related to one of the following:

Spawning a terminal from the Node.js process
The Python code itself (including imports)

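Regarding the first: your machine needs to allocate some memory for this process and warm it up. The good news is that this only has to happen once, so you can spawn the process a single time and then keep reusing its stdout output. I created a sample gist here, but I will also copy it below: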

import path from 'path';
import express from 'express';
import fileUpload from 'express-fileupload';

import {
  pythonRunningProcess,
  spawnPythonProcess
} from './spawn-python-process';
import {
  useRunningPythonServer
} from './use-running-python-server';

const downloadsFolder = path.resolve(__dirname, './downloads');

let isProcessing = false;
const requestQueue: Array<{
  filePath: string;
  res: any; // Express response
}> = [];

async function processQueue() {
  if (requestQueue.length === 0) {
    isProcessing = false;
    return;
  }

  isProcessing = true;
  const firstEntry = requestQueue.shift();
  if (!firstEntry) {
    return;
  }

  const {
    filePath,
    res
  } = firstEntry;

  try {
    const solution = await useRunningPythonServer(filePath);
    console.log('solution: ', solution);
    res.status(200).send(solution);
    processQueue();
  } catch (error) {
    res.status(500).send(error);
    console.error(error);
    processQueue();
  }
}

async function handleFileUpload(req, res) {
  try {
    if (!pythonRunningProcess) {
      return res.status(500).send('python process is not running');
    }
    if (!req.files || Object.keys(req.files).length === 0) {
      return res.status(400).send('No files were uploaded');
    }
    if (Object.keys(req.files).length > 1) {
      return res.status(400).send('Provide a single file');
    }

    const imageFile = req.files.imageFile;
    const fileName = `image-${Date.now()}.png`;
    const filePath = path.join(downloadsFolder, fileName);
    console.log('filePath: ', filePath);

    // Use the mv() method to place the file somewhere on your server
    imageFile.mv(filePath, async function(err) {
      if (err) {
        return res.status(500).send(err);
      }

      requestQueue.push({
        filePath,
        res
      });

      if (isProcessing) {
        return;
      }

      processQueue();
    });
  } catch (err) {
    console.error(err);
    return res.status(500).send(err);
  }
}

export function runNodeListenerServer() {
  const app = express();
  app.use(
    fileUpload({
      limits: {
        fileSize: 20 * 1024
      }, // 20 kb
    })
  );

  console.log('>>> runNodeListenerServer');
  const PORT = 3456;
  app.listen(PORT, () => {
    console.log(`Server listening on port ${PORT}`);
  });

  // Spawn the Python process only once, alongside the server,
  // and reuse it for every later request
  spawnPythonProcess();

  app.get('/recognise-image', (req, res) => res.status(200).send('python process is running'));
  app.post('/recognise-image', handleFileUpload);
}

runNodeListenerServer();
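
The two imported modules (`./spawn-python-process` and `./use-running-python-server`) are not shown above. The following is only a minimal sketch of the general idea of one long-lived child that is reused for every request, not the author's implementation. It assumes a simple protocol in which the child answers each input line with exactly one output line; the names `startWorker`, `request`, `stop` and `worker.py` are hypothetical.

```javascript
const { spawn } = require('node:child_process');
const readline = require('node:readline');

// Start one long-lived child and return helpers for talking to it.
// ASSUMPTION: the child replies to every input line with exactly one output line.
function startWorker(command, args) {
  const child = spawn(command, args, { stdio: ['pipe', 'pipe', 'inherit'] });
  const pending = []; // callers waiting for a reply, oldest first

  // Each reply line settles the oldest outstanding request.
  readline.createInterface({ input: child.stdout }).on('line', (line) => {
    const waiter = pending.shift();
    if (waiter) waiter.resolve(line);
  });

  // If the child fails to start or exits, reject everything still waiting.
  const failAll = (err) => {
    while (pending.length > 0) pending.shift().reject(err);
  };
  child.on('error', failAll);
  child.stdin.on('error', failAll);
  child.on('close', () => failAll(new Error('worker exited')));

  return {
    // Send one line to the child and resolve with its one-line reply.
    request(line) {
      return new Promise((resolve, reject) => {
        pending.push({ resolve, reject });
        child.stdin.write(line + '\n');
      });
    },
    // Closing stdin lets a well-behaved child finish its read loop and exit.
    stop() {
      child.stdin.end();
    },
  };
}

// Assumed usage: a hypothetical `worker.py` imports its heavy modules once,
// then loops over sys.stdin and prints one result line per file path.
// const worker = startWorker('python', ['-u', 'worker.py']);
// worker.request('/path/to/image.png').then(console.log);
```

With this shape, the start-up and import cost is paid once when the server boots, and each later request only pays for the work itself. Replies are matched to requests purely by order, which is why the server code above processes its queue one item at a time.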

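Regarding the second: Python is an interpreted language, so every time you run a Python script it goes through all the lines from the top. In my case, I decided to comment out the whole file and temporarily leave only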

print("someText")

in it, to see how long it takes to run. The call was almost instant. Then I started uncommenting the code line by line and realised that

import torch

was the slowest part, taking about 2 seconds.

You may also find it helpful to benchmark how long the script takes to execute. In Windows PowerShell, use:

Write-Output "$(Get-Date)"; npx ts-node --transpileOnly src/recognise-image.ts; Write-Output "$(Get-Date)"

In bash, use:

date && npx ts-node --transpileOnly src/recognise-image.ts && date
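
The same kind of measurement can be taken from inside Node.js with millisecond resolution. This is a small sketch rather than part of the original answer; the helper name `timeCommand` is an assumption, and the commented usage presumes that `python` is on the PATH and `torch` is installed.

```javascript
const { spawnSync } = require('node:child_process');
const { performance } = require('node:perf_hooks');

// Run a command to completion and report how long it took, in milliseconds.
function timeCommand(command, args) {
  const start = performance.now();
  const result = spawnSync(command, args, { encoding: 'utf8' });
  const elapsedMs = performance.now() - start;
  if (result.error) throw result.error; // e.g. command not found
  return { elapsedMs, status: result.status, stdout: result.stdout };
}

// Assumed usage: compare a bare interpreter start with one that also imports torch.
// console.log(timeCommand('python', ['-c', 'pass']).elapsedMs);
// console.log(timeCommand('python', ['-c', 'import torch']).elapsedMs);
```

The difference between the two numbers approximates the cost of the import alone. CPython 3.7 and later also accept `-X importtime`, which prints a per-module breakdown of import times to stderr.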

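My device specs, for reference: Win 11 Pro, Ryzen 7 5800H, 3.20 GHz, 32 GB RAM. On another device I use (a MacBook Pro M1) the numbers are completely different: even with the Python imports, the child process is invoked almost instantly.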

Good luck putting it all together!
