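I am trying to scrape the contents of a table from a canada.ca web page. The retrieved content seems to be missing some parts, as the following code shows: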
<?php
$url="https://www.canada.ca/en/immigration-refugees-citizenship/corporate/mandate/policies-operational-instructions-agreements/ministerial-instructions/express-entry-rounds.html";
$base = $url;
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36');
$str = curl_exec($curl);
curl_close($curl);
echo $str; // the output shows that the table header ("# Date Round type ...") is present, but the table rows are missing
The "tr" elements in "table/tbody" appear to be missing. It must be something in the cURL parameters. What could it be?
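The cURL options are probably not the cause. The table rows appear to be filled in by JavaScript after the page loads, so cURL only receives the static HTML with the header. You can load the data directly from the .json URL instead: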
https://www.canada.ca/content/dam/ircc/documents/json/ee_rounds_123_en.json
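Fetching and decoding that file could look roughly like the sketch below. The helper names `fetchBody` and `decodeJsonPayload` are hypothetical, and nothing is assumed about the keys inside the JSON, so inspect the decoded array before relying on any field. `JSON_THROW_ON_ERROR` requires PHP 7.3 or later.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the response body.
function fetchBody(string $url): string
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); // follow redirects, as in the question
    curl_setopt($curl, CURLOPT_TIMEOUT, 30);          // assumed timeout, adjust as needed
    $body = curl_exec($curl);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($curl));
    }
    return $body;
}

// Hypothetical helper: decode a JSON string into a PHP array, failing loudly on bad input.
function decodeJsonPayload(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object or array at the top level.');
    }
    return $data;
}

// Usage (performs a network request, so it is left commented out here):
// $url  = 'https://www.canada.ca/content/dam/ircc/documents/json/ee_rounds_123_en.json';
// $data = decodeJsonPayload(fetchBody($url));
// print_r(array_keys($data)); // inspect the top-level structure before using it
```

Keeping the decoding step separate from the HTTP request makes it easy to check the parsing against a saved copy of the file without hitting the server each time.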