How can I extract the same element from multiple URLs and save it to a file?


I have a list of many URLs, and I want to extract a specific element from each of them. The query selector is the same for every URL.

For example, I opened the URL "http://www.nationalregisterofhistoricplaces.com/al/autauga/state.html" in Chrome and then ran this command:

document.querySelector("body > div:nth-child(7) > div.listheader")

The Chrome console printed the content I wanted. (If I should run the JavaScript in a different tool instead, please tell me which one.)

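I'm new to JavaScript. What I'm trying to do is extract the div element from URL_1 and save it to out.html, then extract the div from URL_2 and append it below the previous one, and so on through URL_N.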

So far I have tried the following code, but it only opens the first URL in a new tab and does not print its div.

var urls = [
"http://www.nationalregisterofhistoricplaces.com/al/autauga/state.html",
"http://www.nationalregisterofhistoricplaces.com/al/barbour/state.html",
"http://www.nationalregisterofhistoricplaces.com/ca/fresno/state.html"
];

for (let i = 0; i < urls.length; i++) {
    window.open(urls[i]);
    document.querySelector("body > div:nth-child(7) > div.listheader");
}

My goal is to combine the div from each URL and save them in an out.html that looks like this:

<!-- div from URL_1 -->

<div class="listheader">
<img src="/nr-images/flag.gif" width="33" height="28">
<div class="listname">Bell House                                                                                                              
<span class="added">(added 1999 -  - #99000150)</span>
</div>
<div class="listaka">Also known as Biggs House</div>
<div class="listaddress">550 Upper Kingston Rd.                                                                                                  , Prattville
</div>
</div>

<!-- div from URL_2 -->

<div class="listheader">
<img src="/nr-images/flag.gif" width="33" height="28">
<div class="listname">Bray-Barron House                                                                                                       
<span class="added">(added 1971 -  - #71000093)</span>
</div>
<div class="listaddress">N. Eufaula Ave.                                                                                                         , Eufaula
</div>
</div>

<!-- div from URL_3 -->

<div class="listheader">
<img src="/nr-images/flag.gif" width="33" height="28">
<div class="listname">Bank of Italy                                                                                                           
<span class="added">(added 1982 -  - #82000963)</span>
</div>
<div class="listaka">Also known as Bank of America on Fulton Mall</div>
<div class="listaddress">1015 Fulton Mall                                                                                                        , Fresno
</div>
</div>

How can I do this? Thanks in advance.
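One possible approach, offered as a sketch and not a verified solution: in the attempt above, `document` still refers to the page the console is attached to, not to the tab opened by `window.open`, so the selector never runs against the other pages. The sketch below instead downloads each page with `fetch`, parses it off-screen with `DOMParser`, and offers the combined markup as a download. The helper names `buildOutput`, `collectDivs`, and `saveAsFile` are hypothetical. It assumes the console is open on a page of the same site, so the requests are same-origin. It also assumes the selector from the question still matches the raw HTML. Because `DOMParser` does not run page scripts, `nth-child(7)` could point at a different element than it does in the live page, in which case a looser selector such as `div.listheader` may be worth trying.

```javascript
// Selector taken from the question.
const SELECTOR = "body > div:nth-child(7) > div.listheader";

// Join the extracted fragments, labelling each one like the desired out.html.
function buildOutput(fragments) {
  return fragments
    .map((html, i) => `<!-- div from URL_${i + 1} -->\n\n${html}\n`)
    .join("\n");
}

// Fetch each page in order, parse it off-screen, and pull out the matching element.
// Browser only: relies on fetch and DOMParser.
async function collectDivs(urls, selector = SELECTOR) {
  const parser = new DOMParser();
  const fragments = [];
  for (const url of urls) {
    const response = await fetch(url);
    if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
    const doc = parser.parseFromString(await response.text(), "text/html");
    const el = doc.querySelector(selector);
    // Keep a placeholder so the URL numbering stays aligned when nothing matches.
    fragments.push(el ? el.outerHTML : `<!-- no match in ${url} -->`);
  }
  return fragments;
}

// Offer the combined markup as a download (browser only).
function saveAsFile(text, filename = "out.html") {
  const blob = new Blob([text], { type: "text/html" });
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  link.click();
  // Release the object URL a little later so the download can start first.
  setTimeout(() => URL.revokeObjectURL(link.href), 1000);
}

// Usage in the console, with the urls array from the question:
// collectDivs(urls).then(buildOutput).then(saveAsFile);
```

The pages are fetched one after another so that the fragments come out in the same order as the URL list. If the requests are blocked (for example, when the console is open on a different site), the same logic would have to run outside the browser instead.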


javascript html extract