PHP cURL / XPath - links not working


I am using the following code to scrape some divs from an external page, http://psnc.org.uk/our-latest-news-category/psnc-news/

I want to scrape the latest-news section of the PSNC News page.

$ch = curl_init("http://psnc.org.uk/our-latest-news-category/psnc-news/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
curl_close($ch);

$document = new DOMDocument;
libxml_use_internal_errors(true);
$document->loadHTML($output);
$xpath = new DOMXPath($document);

$tweets = $xpath->query("//article[@class='news-template-box']");

echo "<html><body>";
foreach ($tweets as $tweet) {
    echo "\n<p>".$tweet->nodeValue."</p>\n";
}
echo "</body></html>";

It scrapes the text successfully, but the links, hrefs and images (in fact, all child elements) do not appear.

Am I missing something?

php web-scraping curl xpath domdocument
1 Answer

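For an element node, DOMNode::nodeValue is the same as DOMNode::textContent: it returns only the text content, with all child tags stripped. To keep the links and images, serialize the node itself with DOMDocument::saveHTML():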

http://php.net/manual/en/class.domnode.php#domnode.props.nodevalue

$tweets = $xpath->query("//article[@class='news-template-box']");

foreach ($tweets as $tweet) {
    echo $document->saveHTML($tweet);
}
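
The difference can be checked without fetching the live page. The following is a minimal sketch, assuming a small hypothetical HTML string in place of the cURL response; the headline, summary and href in it are invented for illustration and are not taken from the PSNC site.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body><article class="news-template-box">'
      . '<h2><a href="/news/example/">Example headline</a></h2>'
      . '<p>Summary text.</p></article></body></html>';

// Collect libxml warnings (e.g. about HTML5 tags such as <article>) instead of printing them.
libxml_use_internal_errors(true);
$document = new DOMDocument;
$document->loadHTML($html);
libxml_clear_errors();

$xpath   = new DOMXPath($document);
$article = $xpath->query("//article[@class='news-template-box']")->item(0);

// nodeValue: concatenated text of all descendants, markup stripped.
$textOnly = $article->nodeValue;

// saveHTML($node): the element serialized with its child tags and attributes.
$markup = $document->saveHTML($article);

echo $textOnly, "\n";
echo $markup, "\n";
```

The first echo prints plain text with no anchor tag, while the second keeps the `<a href="...">` element. Any relative href values are emitted exactly as they appear in the source markup.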