Simple HTML DOM does not fetch the Amazon page

Question · Votes: -2 · Answers: 1

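Hi, I want to scrape the price of an Amazon product, but when I request the page through Simple HTML DOM it shows a blank page. If I use an AliExpress link instead, it works fine.

For example:

$value = "https://www.amazon.com/Apple-iPhone-Plus-Unlocked-32GB/dp/B01N6ZAR0D/";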
$html = file_get_html($value);

echo $html;
php dom
1 Answer
0
votes

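Requesting the page directly through the HTML DOM client is not recommended, especially for large sites like Amazon. Such sites inspect the client's user agent, cookies, and header information to verify security and to detect bots.

So,

You should use curl or Guzzle to request the page, supplying the necessary request headers. Take the response string returned by the request and parse it with str_get_html.

Example: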

// $client is assumed here to be a GuzzleHttp\Client instance
$response = $client->request('GET', $url);
$html = str_get_html((string) $response->getBody());
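
For readers who have neither Guzzle nor another HTTP client installed, the same idea can be sketched with PHP's built-in cURL extension. This is only a minimal sketch, not the answerer's code: `browserHeaders()` and `fetchPage()` are hypothetical helper names, the header values are illustrative, and a site may still block the request.

```php
<?php
// Hypothetical helper: browser-like headers in the "Name: value" form
// that CURLOPT_HTTPHEADER expects. The values are illustrative assumptions.
function browserHeaders(): array
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.9',
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
    ];
}

// Hypothetical helper: fetch a page and return its body, or null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_ENCODING       => '',    // empty string: accept every encoding this cURL build can decode
        CURLOPT_HTTPHEADER     => browserHeaders(),
        CURLOPT_COOKIEFILE     => '',    // empty string: keep cookies in memory for this handle
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors, non-200 responses, and empty bodies as failures.
    if ($body === false || $status !== 200 || $body === '') {
        return null;
    }
    return $body;
}
```

A caller would check the result before parsing, for example by passing the string to str_get_html only when `fetchPage($url)` returns something other than null.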

A real working example for your question (the original answer linked to the code on GitHub):

require __DIR__ . '/vendor/autoload.php';
require 'simple_html_dom.php';
use Curl\Curl;

// initialize curl
// you can install via "composer require php-curl-class/php-curl-class"
$curl = new Curl();

// set cookies
$curl->setCookieFile(__DIR__ . '/cookies.txt');
$curl->setCookieJar(__DIR__ . '/cookies.txt');

// decode gzip encoded because amazon is using gzip
$curl->setOpt(CURLOPT_ENCODING , "gzip");

// set request header like a browser
$curl->setHeaders([
    'accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'accept-encoding' => 'gzip', // advertise only gzip, matching CURLOPT_ENCODING above
    'accept-language' => 'en,tr;q=0.9',
    'user-agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
]);

// request
$curl->get('https://www.amazon.com/Apple-iPhone-Plus-Unlocked-32GB/dp/B01N6ZAR0D/');

// get raw response
$response = $curl->getRawResponse();

// parser
$html = new simple_html_dom();

// load from string html
$html->load($response);

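// find price and print (the element may be missing, e.g. on a blocked or error page)
$priceNode = $html->find('#price', 0);
echo $priceNode ? $priceNode->plaintext : 'Price element not found';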
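
The `#price` lookup above depends on that element actually being present; a CAPTCHA or error page would not contain it. The same defensive idea can be sketched with PHP's bundled DOM extension instead of simple_html_dom. This is a sketch under assumptions: `extractTextById()` is a hypothetical helper, and the sample markup is invented for illustration rather than taken from a real Amazon page.

```php
<?php
// Hypothetical helper: return the trimmed text of the element with the
// given id, or null when the HTML is empty or has no such element.
// Assumes $id is a trusted value that contains no double quotes.
function extractTextById(string $html, string $id): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse, e.g. a blank response
    }

    // Collect libxml warnings internally; real-world markup is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null; // element not present on this page
    }
    return trim($nodes->item(0)->textContent);
}

// Invented sample markup, standing in for a fetched page.
$sample = '<html><body><div id="price"> $199.99 </div></body></html>';

$price = extractTextById($sample, 'price');
echo $price !== null ? $price : 'price element not found', PHP_EOL;
```

Returning null rather than throwing lets the caller decide whether a missing element means "retry later" or "the page layout changed".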