How can I scrape these two tables using Simple HTML DOM?


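I've been trying to figure out how to use PHP Simple HTML DOM to scrape each <td class="job"> cell and its respective salary. I can find and scrape divs by id or class without any trouble, but I don't know how to approach tables like these. Any help would be greatly appreciated!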

<table cellpadding="0" cellspacing="0" border="0" class="table01">
<tr>
    <td class="head">Test</td>
    <td class="job">
    <a href="/Illustrator" id="UniqueID1">Illustrator</a><br/>
    $23,729 - $95,429
    </td>
</tr>
<tr>
    <td class="head">Test</td>
    <td class="job">
    <a href="/Small_Business_Owner_%2f_Operator" id="UniqueID2">Small Business Owner / Operator</a><br/>
    $24,369 - $174,991
    </td>
</tr>
<tr>
    <td class="head">Test</td>
    <td class="job">
    <a href="/Waiter%2fWaitress" id="UniqueID3">Waiter/Waitress</a><br/>
    $7,483 - $34,188
    </td>
</tr>
</table>

<table cellpadding="0" cellspacing="0" border="0" class="table02">
<tr>
    <td class="head">Test</td>
    <td class="job" style="padding-right: 20px">
    <a href="/Graphic_Artist_%2f_Designer" id="UniqueID1">Graphic Artist / Designer</a><br/>
    $23,789 - $55,409
    </td>
</tr>
<tr>
    <td class="head">Test</td>
    <td class="job" style="padding-right: 20px">
    <a href="/Illustrator" id="UniqueID2">Illustrator</a><br/>
    $23,729 - $95,429
    </td>
</tr>    
<tr>
    <td class="head">Test</td>
    <td class="job" style="padding-right: 20px">
    <a href="/Art_Director" id="UniqueID3">Art Director</a><br/>
    $34,160 - $85,943
    </td>
</tr>
</table>
php html web-scraping dom
1 Answer
2 votes
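    $html = "your html data"; // replace with the page's HTML source

    // Keep libxml from emitting warnings on imperfect real-world HTML.
    libxml_use_internal_errors(true);

    // Load the HTML into a DOM tree.
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // Select every <td> inside a table whose class attribute contains "job".
    // Note: contains() also matches class names that merely include "job".
    $my_xpath_query = "//table//td[contains(@class, 'job')]";
    $result_rows = $xpath->query($my_xpath_query);

    // Iterate over the matching cells and print each cell's text
    // (the job title followed by its salary range).
    foreach ($result_rows as $result_object) {
        echo trim($result_object->nodeValue), PHP_EOL;
    }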
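
The loop above prints each cell's title and salary as one block of text. If the two values are needed separately, a minimal sketch along the following lines may help. It assumes every job cell holds a single link followed by the salary as plain text, as in the question's markup. The trimmed sample HTML, the $jobs array and its keys are illustrative choices, not part of the original answer, and the query uses an exact class match rather than contains().

```php
<?php
// Sample markup trimmed from the question; replace with the real page source.
$html = <<<'HTML'
<table class="table01">
<tr>
    <td class="head">Test</td>
    <td class="job">
    <a href="/Illustrator" id="UniqueID1">Illustrator</a><br/>
    $23,729 - $95,429
    </td>
</tr>
</table>
<table class="table02">
<tr>
    <td class="head">Test</td>
    <td class="job" style="padding-right: 20px">
    <a href="/Art_Director" id="UniqueID3">Art Director</a><br/>
    $34,160 - $85,943
    </td>
</tr>
</table>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$jobs = []; // hypothetical result structure: one entry per job cell
foreach ($xpath->query("//table//td[@class='job']") as $cell) {
    // The job title is the text of the first link inside the cell.
    $link = $xpath->query('.//a', $cell)->item(0);
    if ($link === null) {
        continue; // skip cells that do not follow the assumed layout
    }

    // The salary is the text sitting directly inside the <td>, outside the link.
    $salary = '';
    foreach ($xpath->query('./text()', $cell) as $textNode) {
        $salary .= $textNode->nodeValue;
    }

    $jobs[] = [
        'title'  => trim($link->textContent),
        'url'    => $link->getAttribute('href'),
        'salary' => trim($salary),
    ];
}

print_r($jobs);
```

Because the query starts at //table, cells from both table01 and table02 end up in the same array. To handle the tables separately, the expression could be narrowed to something like //table[@class='table01']//td[@class='job'].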