How to remove <!DOCTYPE html> and <?xml ...> from a document using cheerio.js

Question (votes: 0, answers: 1)

I am trying to remove the <!DOCTYPE html> and <?xml ...> declarations from an HTML document parsed with cheerio.js. Is this possible?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html>
  <head></head>
  <body>
    <div>text</div>
  </body>
</html>
javascript node.js web-scraping cheerio
1 Answer
1 vote

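You can simply extract the contents of the html element. The XML declaration and the DOCTYPE sit outside that element, so they are left behind. All you need to do afterwards is add the <html> tags back.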

const cheerio = require('cheerio');

const html = `
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html>
  <head></head>
  <body>
    <div>text</div>
  </body>
</html>
`;
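// Parse the document, then print only the inner markup of <html>.
// The XML declaration and DOCTYPE are outside <html>, so they are not printed;
// wrap the result in <html>...</html> if the outer tags are needed.
const $ = cheerio.load(html);
console.log($('html').html());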
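If the goal is only to drop the two leading declarations, a plain string pass is another option. This is a different technique from the answer above, not something cheerio provides: the sketch below strips them with regular expressions and keeps the <html> tags, so nothing has to be added back. The helper name stripProlog is hypothetical, and the patterns assume the declarations appear at the very start of the input and that the DOCTYPE has no internal subset containing ">".

```javascript
// Hypothetical helper (not part of cheerio): removes a leading XML declaration
// and a leading DOCTYPE from a markup string, leaving the rest untouched.
function stripProlog(markup) {
  return markup
    .replace(/^\s*<\?xml[^>]*\?>/i, '')   // drop <?xml ... ?> at the start
    .replace(/^\s*<!DOCTYPE[^>]*>/i, '')  // drop <!DOCTYPE ...> at the start
    .trimStart();                         // remove the blank lines left behind
}

const source = `<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html>
  <head></head>
  <body>
    <div>text</div>
  </body>
</html>`;

const cleaned = stripProlog(source);
console.log(cleaned);
```

The cleaned string can then be passed to cheerio.load as usual. Staying within cheerio, $.html($('html')) should return the outer markup of the selected element, which also avoids re-adding the tags by hand.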