403 when downloading a file through PHP, but not from the browser


I get a 403 error when downloading a JSON file from a URL through PHP. I can open the file from the browser without any problem (no errors in the dev tools).

This is the script (minus the actual URL), which I have verified works with other websites:

$ch = curl_init($url);
$dir = '../sources/';
$file_name = basename($url);
$save_file_loc = $dir . $file_name;
$fp = fopen($save_file_loc, 'wb');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
// curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_CAINFO, 'C:/Program Files/php/cacert.pem');
curl_exec($ch);
echo curl_errno($ch)."\n";
print_r(curl_getinfo($ch));
curl_close($ch);
fclose($fp);

The certificate bundle comes from https://curl.se/docs/caextract.html. This is the info I get:

    [url] => the/url/path/to/json/file
    [content_type] => text/html
    [http_code] => 403
    [header_size] => 183
    [request_size] => 83
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 0.516273
    [namelookup_time] => 0.065622
    [connect_time] => 0.091481
    [pretransfer_time] => 0.483309
    [size_upload] => 0
    [size_download] => 418
    [speed_download] => 809
    [speed_upload] => 0
    [download_content_length] => 418
    [upload_content_length] => 0
    [starttransfer_time] => 0.516216
    [redirect_time] => 0
    [redirect_url] =>
    [primary_ip] => (the.url.primary.ip.address)
    [certinfo] => Array
        (
        )

    [primary_port] => 443
    [local_ip] => (my.local.ip)
    [local_port] => 65046
    [http_version] => 3
    [protocol] => 2
    [ssl_verifyresult] => 0
    [scheme] => HTTPS
    [appconnect_time_us] => 483121
    [connect_time_us] => 91481
    [namelookup_time_us] => 65622
    [pretransfer_time_us] => 483309
    [redirect_time_us] => 0
    [starttransfer_time_us] => 516216
    [total_time_us] => 516273
    [effective_method] => GET

Of course I also tried setting the user agent, but no luck.

What am I missing?

php curl

1 Answer

403 Forbidden.

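Something on the server is probably blocking you from downloading the file. I would contact them first to find out whether you are allowed to fetch the file with a script at all, and if so, how.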
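Before contacting them, it may help to look at both sides of the exchange. In the question's output the request was only 83 bytes and the server answered with a 418-byte text/html body, which was written into the target file and may say why the request was refused. The following is a minimal diagnostic sketch, not a fix: the function names dumpExchange and splitHeadersAndBody are hypothetical, and it assumes the cURL extension is loaded.

```php
<?php
// Split a raw response (headers followed by body) at the header size reported by cURL.
function splitHeadersAndBody(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    ];
}

// Hypothetical helper: fetch $url and return the request that was sent
// together with the status, response headers and body that came back.
function dumpExchange(string $url, array $extraOptions = []): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, $extraOptions);          // e.g. CURLOPT_CAINFO, CURLOPT_USERAGENT
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, true);         // include response headers in the returned string
    curl_setopt($ch, CURLINFO_HEADER_OUT, true);    // record the request headers cURL sends

    $raw = curl_exec($ch);
    if ($raw === false) {
        return ['error' => curl_error($ch)];
    }

    $parts = splitHeadersAndBody($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));

    return [
        'request'          => curl_getinfo($ch, CURLINFO_HEADER_OUT),
        'status'           => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'response_headers' => $parts['headers'],
        'body'             => $parts['body'],
    ];
}

// Usage (needs network access, so it is left commented out):
// print_r(dumpExchange($url));
```

Comparing the 'request' entry with the request headers shown in the browser's dev tools shows which headers the script is not sending.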

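If their checks are particularly weak, the following might work. If not, you are in for a lot of pain trying to fool their server into thinking you are a legitimate client: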

$url = ''; //url to scrape
$referrer = ''; //sensible url such as the homepage of their site
$dir = '../sources/'; //directory to store file

$curl = curl_init();

$file_name = basename($url);
$save_file_loc = $dir . $file_name;
$fp = fopen($save_file_loc, 'wb');
curl_setopt($curl, CURLOPT_FILE, $fp);

curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0');

$header = array();
$header[] = "Accept: */*";
$header[] = "Connection: keep-alive";

curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, $referrer);
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
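// Note: the next two lines turn off TLS certificate checks. They are unrelated to the 403;
// prefer CURLOPT_CAINFO with a CA bundle (as in the question) and leave verification on.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);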
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);

curl_exec($curl);
echo curl_errno($curl)."\n";
print_r(curl_getinfo($curl));
curl_close($curl);
fclose($fp);
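Both scripts in this thread write whatever the server returns straight into the target file, so a 403 leaves an HTML error page saved under the JSON file's name. A possible guard, sketched here under the assumption that the file is small enough to hold in memory, is to fetch the body into a string and save it only when the status is 2xx and the body parses as JSON. The helper names isUsableJsonResponse and downloadJson are hypothetical.

```php
<?php
// Hypothetical helper: accept only 2xx responses whose body is valid JSON.
function isUsableJsonResponse(int $httpCode, string $body): bool
{
    if ($httpCode < 200 || $httpCode >= 300) {
        return false;
    }
    json_decode($body);
    return json_last_error() === JSON_ERROR_NONE;
}

// Hypothetical wrapper: download $url and write it to $saveFileLoc only if it is usable.
function downloadJson(string $url, string $saveFileLoc, array $extraOptions = []): bool
{
    $curl = curl_init($url);
    curl_setopt_array($curl, $extraOptions);          // e.g. user agent, referer, CA bundle
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);

    $body = curl_exec($curl);
    $httpCode = (int) curl_getinfo($curl, CURLINFO_HTTP_CODE);

    if (!is_string($body) || !isUsableJsonResponse($httpCode, $body)) {
        return false; // nothing is written, so no error page ends up on disk
    }

    return file_put_contents($saveFileLoc, $body) !== false;
}

// Usage (needs network access, so it is left commented out):
// $ok = downloadJson($url, '../sources/' . basename($url));
```

For large files, streaming to a temporary file and renaming it after the status check would avoid holding the whole body in memory.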