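I'm writing a script that looks up certain forms on SEC EDGAR through an RSS feed. The feed URL itself is fine, but when I parse it with simplexml_load_file() and print_r the result, no object is shown.
The failing code: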
// Parse the RSS feed
$feed = simplexml_load_file($rss_feed_url);
:-(
I've been tearing my hair out over this for hours. Here is my complete PHP code:
<?php
function get_sec_filings_with_phrase($phrase, $days_back = 2) {
    // Define the RSS feed URL
    $rss_feed_url = "https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=8-K&count=100&output=atom";
    // Calculate date range
    $end_date = new DateTime();
    $start_date = (new DateTime())->sub(new DateInterval('P' . $days_back . 'D'))->setTime(0, 0);
    // Parse the RSS feed
    $feed = simplexml_load_file($rss_feed_url);
    print_r($feed); // debug output: no object is shown here
    $filings = [];
    // Iterate over entries in the feed
    foreach ($feed->entry as $entry) {
        // Entries in this feed carry <updated>, not <published>
        $entry_date = new DateTime((string)$entry->updated);
        // Check if the entry is within the desired date range
        if ($entry_date >= $start_date && $entry_date <= $end_date) {
            // Fetch the content of the filing
            $filing_content = (string)$entry->summary;
            // Check if the phrase is present in the filing content
            if (stripos($filing_content, $phrase) !== false) {
                $filings[] = [
                    "title" => (string)$entry->title,
                    "link" => (string)$entry->link['href'],
                    "date" => $entry_date->format("Y-m-d H:i:s")
                ];
            }
        }
    }
    return $filings;
}
// Keyword
$phrase = "bank";
$filings = get_sec_filings_with_phrase($phrase);
// show results as HTML
if (!empty($filings)) {
    echo "<table border='1'>";
    echo "<tr><th>Title</th><th>Date</th><th>Link</th></tr>";
    foreach ($filings as $filing) {
        echo "<tr>";
        echo "<td>".$filing['title']."</td>";
        echo "<td>".$filing['date']."</td>";
        echo "<td><a href='".$filing['link']."'>".$filing['link']."</a></td>";
        echo "</tr>";
    }
    echo "</table>";
} else {
    echo "No filings found in the last 48 hours containing the keyword '". $phrase. "'.";
}
?>
Sample of the RSS feed:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Latest Filings - Thu, 02 May 2024 15:27:03 EDT</title>
<link rel="alternate" href="/cgi-bin/browse-edgar?action=getcurrent"/>
<link rel="self" href="/cgi-bin/browse-edgar?action=getcurrent"/>
<id>https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent</id>
<author><name>Webmaster</name><email>[email protected]</email></author>
<updated>2024-05-02T15:27:03-04:00</updated>
<entry>
<title>8-K - KKR FS Income Trust (0001930679) (Filer)</title>
<link rel="alternate" type="text/html" href="https://www.sec.gov/Archives/edgar/data/1930679/000110465924056303/0001104659-24-056303-index.htm"/>
<summary type="html">
<b>Filed:</b> 2024-05-02 <b>AccNo:</b> 0001104659-24-056303 <b>Size:</b> 193 KB
<br>Item 8.01: Other Events
<br>Item 9.01: Financial Statements and Exhibits
</summary>
<updated>2024-05-02T15:19:50-04:00</updated>
<category scheme="https://www.sec.gov/" label="form type" term="8-K"/>
<id>urn:tag:sec.gov,2008:accession-number=0001104659-24-056303</id>
</entry>
<entry>
<title>8-K - Catalyst Bancorp, Inc. (0001849867) (Filer)</title>
<link rel="alternate" type="text/html" href="https://www.sec.gov/Archives/edgar/data/1849867/000184986724000015/0001849867-24-000015-index.htm"/>
<summary type="html">
<b>Filed:</b> 2024-05-02 <b>AccNo:</b> 0001849867-24-000015 <b>Size:</b> 1 MB
<br>Item 2.02: Results of Operations and Financial Condition
<br>Item 5.02: Departure of Directors or Certain Officers; Election of Directors; Appointment of Certain Officers: Compensatory Arrangements of Certain Officers
<br>Item 7.01: Regulation FD Disclosure
<br>Item 9.01: Financial Statements and Exhibits
</summary>
<updated>2024-05-02T15:19:06-04:00</updated>
<category scheme="https://www.sec.gov/" label="form type" term="8-K"/>
<id>urn:tag:sec.gov,2008:accession-number=0001849867-24-000015</id>
</entry>
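I can't figure out why simplexml_load_file() isn't returning an object. Any clues?
The SEC has ways of detecting automated tools, and a request that does not identify itself is likely to be refused. PHP's HTTP stream wrapper sends no User-Agent header unless the user_agent ini setting is set, which is probably why simplexml_load_file() gets nothing it can parse and returns false. If you want to pretend to be a browser, you can set the UA and fetch the feed with the cURL library, then parse the response with simplexml_load_string():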
<?php
// Define the RSS feed URL
$rss_feed_url = "https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=8-K&count=10&output=atom";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $rss_feed_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'My Custom User-Agent/1.0');
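$xml = curl_exec($ch);
if ($xml === false) {
    // Transport-level failure (DNS, TLS, timeout, ...)
    die('cURL error: ' . curl_error($ch));
}
curl_close($ch);
$feed = simplexml_load_string($xml);
print_r($feed);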
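Once the body comes back, it can still be something other than XML (an HTML block page, for instance), so it may help to make the parse step report why it failed before filtering. The following is only a sketch under assumptions: parse_feed_or_fail() and filter_entries() are hypothetical helper names, not part of the original script, the inline sample is a trimmed stand-in for the real feed, and the code assumes PHP 7.4 or later. It reads the entry date from `<updated>`, since that is the element the sample entries carry, and matches the keyword against both title and summary.

```php
<?php
// Hypothetical helper: parse a feed body that was already downloaded
// (e.g. with cURL) and surface libxml's messages when it is not XML.
function parse_feed_or_fail(string $body): SimpleXMLElement
{
    // Collect parse errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($body);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false) {
        $messages = array_map(fn($e) => trim($e->message), $errors);
        throw new RuntimeException('Feed is not valid XML: ' . implode('; ', $messages));
    }
    return $feed;
}

// Hypothetical helper: keep entries updated within the last $days_back days
// (counted from midnight) whose title or summary contains $phrase.
function filter_entries(SimpleXMLElement $feed, string $phrase, DateTimeImmutable $now, int $days_back = 2): array
{
    $start = $now->sub(new DateInterval('P' . $days_back . 'D'))->setTime(0, 0);
    $matches = [];
    foreach ($feed->entry as $entry) {
        $date = new DateTimeImmutable((string) $entry->updated);
        if ($date < $start || $date > $now) {
            continue; // outside the date window
        }
        $text = (string) $entry->title . ' ' . (string) $entry->summary;
        if (stripos($text, $phrase) === false) {
            continue; // keyword not present
        }
        $matches[] = [
            'title' => (string) $entry->title,
            'link'  => (string) $entry->link['href'],
            'date'  => $date->format('Y-m-d H:i:s'),
        ];
    }
    return $matches;
}

// Demo with a trimmed, made-up sample instead of a live request.
$sample = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>8-K - Catalyst Bancorp, Inc. (0001849867) (Filer)</title>
    <link rel="alternate" type="text/html" href="https://www.sec.gov/Archives/edgar/data/1849867/000184986724000015/0001849867-24-000015-index.htm"/>
    <summary type="html">Item 2.02: Results of Operations and Financial Condition</summary>
    <updated>2024-05-02T15:19:06-04:00</updated>
  </entry>
</feed>
XML;

$now = new DateTimeImmutable('2024-05-02T16:00:00-04:00');
print_r(filter_entries(parse_feed_or_fail($sample), 'bancorp', $now));
```

In the real script the string passed to parse_feed_or_fail() would be the `$xml` returned by curl_exec(), and `$now` would be `new DateTimeImmutable()`. Passing the clock in as a parameter is a design choice that keeps the date filter easy to check against fixed sample data.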