How can I download an XML file protected by Incapsula? I'm using cron on Ubuntu Server 22


I need to download this XML with cron on Ubuntu Server 22: https://www.sbs.gob.pe/app/xmltipocambio/TC_TI_Portal_xml.xml

[Screenshot: Postman]

I tried using PHP with cookies and headers:

<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://www.sbs.gob.pe/app/xmltipocambio/TC_TI_Portal_xml.xml',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36',
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language: es-ES,es;q=0.9,en;q=0.8',
    'Connection: keep-alive',
    'Cookie: incap_ses_1619_2355492=YY1MbHjuL29nrYNZGNh3FgwXMGcAAAAAKiPwXriZvD1CjH20JCrT4Q==; visid_incap_2355492=JXwUES73Q0iJUC7yP8fz7yqML2cAAAAAQUIPAAAAAACnDL7+W+pMrttP8iSVofb9; TS01fc2e41=019955ae164e95aa0ff72e02668fc97902a191a5240bc7a84d3b8766d1ca6464a1f5743a921ca51bef417b65cb4ad5de3b1ce9ec19'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

But the response is:

<html style="height:100%">

<head>
    <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
    <meta name="format-detection" content="telephone=no">
    <meta name="viewport" content="initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
</head>

<body style="margin:0px;height:100%"><iframe id="main-iframe"
        src="/_Incapsula_Resource?CWUDNSAI=27&xinfo=9-25703392-0%200NNN%20RT%281731205587990%2095%29%20q%280%20-1%20-1%20-1%29%20r%280%20-1%29&incident_id=0-134309180844737737&edet=9&cinfo=ffffffff&rpinfo=0&mth=GET"
        frameborder=0 width="100%" height="100%" marginheight="0px" marginwidth="0px">Request unsuccessful. Incapsula
        incident ID: 0-134309180844737737</iframe></body>

</html>

[Screenshot: response]

The correct response should be:

<?xml version="1.0" encoding="utf-8"?>
<tipocambio>
    <linktc>http://www.sbs.gob.pe/principal/categoria/tipo-de-cambio/147/c-147</linktc>
    <linktilegal>http://www.sbs.gob.pe/principal/categoria/tasa-de-interes-legal/155/c-155</linktilegal>
    <linktipromedio>http://www.sbs.gob.pe/principal/categoria/tasa-de-interes-promedio/154/c-154</linktipromedio>
    <fecha>08/11/2024</fecha>
    <moneda>$</moneda>
    <compra>3.764</compra>
    <venta>3.769</venta>
</tipocambio>
php xml cron postman incapsula
1 Answer

0 votes

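Hey, I looked into this and did some research on Incapsula specifically. The problem you are seeing is most likely caused by their web security service: your script is probably being blocked because that service has flagged it as a threat.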
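The blocked response in the question is an HTML page whose iframe points at /_Incapsula_Resource, so a script can tell it apart from the real XML. Here is a minimal sketch, assuming that marker stays stable. The function name looksLikeIncapsulaBlock is hypothetical and is not part of any Incapsula API:

```php
<?php
// Hypothetical helper (an assumption, not an Incapsula API): guesses whether
// a response body is the block page instead of the expected XML.
function looksLikeIncapsulaBlock(string $body): bool
{
    // Marker taken from the iframe URL in the blocked response shown in the question.
    return stripos($body, '_Incapsula_Resource') !== false;
}
```

A cron script could call this on the downloaded body and exit with a non-zero status when it returns true, so the failure shows up in the cron logs or mail.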

You can try adding a User-Agent header to your script:

<?php
$url = 'https://www.sbs.gob.pe/app/xmltipocambio/TC_TI_Portal_xml.xml';
$userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
$output = curl_exec($ch);

// curl_exec() returns false on failure; report it instead of writing an empty file.
if ($output === false) {
    fwrite(STDERR, 'Download failed: ' . curl_error($ch) . PHP_EOL);
    curl_close($ch);
    exit(1);
}
curl_close($ch);

file_put_contents('downloaded_xml.xml', $output);

echo 'Downloaded XML saved to downloaded_xml.xml';
?>

Save this script as download_xml.php, run it from the command line with php download_xml.php, and call php download_xml.php directly in your cron job.
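Because cron runs unattended, it may be worth checking that the body really is the expected XML before overwriting the previous file. Here is a minimal sketch, assuming the root element is tipocambio as in the expected output from the question and that the SimpleXML extension is installed. The function name saveIfExpectedXml and the temporary-file approach are illustrative choices, not part of the original script:

```php
<?php
// Hypothetical helper: writes $body to $path only if it parses as XML
// with the expected root element. Returns true when the file was written.
function saveIfExpectedXml(string $body, string $path, string $root = 'tipocambio'): bool
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false || $xml->getName() !== $root) {
        return false; // e.g. an HTML block page came back instead of the XML
    }

    // Write to a temporary file first, then rename it, so other readers
    // never see a half-written file.
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $body) === false) {
        return false;
    }
    return rename($tmp, $path);
}
```

In the cron script this could be called as saveIfExpectedXml($output, 'downloaded_xml.xml'), with the script exiting non-zero when it returns false so the last good file is kept.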
