Unable to get all the links in a div


I'm trying to get all the links from this page. I have actually almost achieved this with the following code:

public function getLinks()
{
    $html = file_get_html("http://it.soccerway.com/national/italy/serie-a/20152016/regular-season/r31554/");

    foreach($html->find("div.block_competition_left_tree-wrapper") as $div)
    {
        foreach ($div->find('a') as $li)
        {
            echo $li->href . "<br>";
        }
    }

}

This is the result:

/national/italy/serie-a/c13/
/national/italy/serie-a/20152016/s11663/
/national/italy/serie-b/c14/
/national/italy/serie-c1/c53/
/national/italy/serie-c2/c358/
/national/italy/serie-d/c659/
/national/italy/coppa-italia/c135/
/national/italy/super-cup/c171/
/national/italy/coppa-italia-serie-c/c684/
/national/italy/campionato-nazionale-primavera/c952/
/national/italy/coppa-italia-primavera/c1070/
/national/italy/super-coppa-primavera/c1171/
/national/italy/dante-berretti/c1092/
/national/italy/serie-a-women/c293/
/national/italy/serie-a2/c457/
/national/italy/coppa-italia-women/c852/
/national/italy/super-cup-women/c851/
/national/italy/club-friendlies/

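The problem is that I only need to grab the links nested inside the list items (`<li>`). As you can see in the HTML, these items have different classes: `expanded`, `odd`, `even`. Essentially, I don't want the links of the top-level items themselves (Serie A, Serie B, etc.), but the links contained inside them. Specifically, the result should look like this: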

/national/italy/serie-a/20152016/s11663/
/national/italy/serie-b/20152016/regular-season/r31798/
/national/italy/serie-c1/20152016/girone-c/r31861/

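Now, if you look at the first result above, only `/national/italy/serie-a/20152016/s11663/` from my last example is correct. This is because in the HTML page the `Serie A` item has the class `expanded`, so the code sees the link nested inside it. How can I fix my code to achieve this?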

php html web-scraping simple-html-dom
1 answer (3 votes)

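I hope I understood you correctly. You need to get all the links as before, then open each of those links and collect the links inside the item with the `expanded` class.

An example: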

public function getLinks()
{
    // Base URL without a trailing slash: every href on the page already starts with "/"
    $base = "http://it.soccerway.com";
    $html = file_get_html($base . "/national/italy/serie-a/20152016/regular-season/r31554/");

    foreach ($html->find("div.block_competition_left_tree-wrapper") as $div)
    {
        // Get all competition links, as before
        foreach ($div->find('a') as $li)
        {
            // Open each competition page; skip it if the download fails
            $openLink = file_get_html($base . $li->href);
            if (!$openLink) {
                continue;
            }

            foreach ($openLink->find("div.block_competition_left_tree-wrapper") as $divOfNewLink)
            {
                foreach ($divOfNewLink->find('li') as $liOfNewDiv)
                {
                    // Only the <li> carrying the "expanded" class holds the links we want
                    if (preg_match("/expanded/i", $liOfNewDiv->class))
                    {
                        foreach ($liOfNewDiv->find('a') as $link)
                        {
                            echo $link->href . "<br>";
                        }
                    }
                }
            }
        }
    }
}
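
The filtering step can also be tried offline, without downloading anything. The following is only a minimal sketch, and it swaps in PHP's built-in `DOMDocument` and `DOMXPath` in place of Simple HTML DOM. The `$sample` markup is a guess at how the competition tree might be structured (it is not copied from the real page), and `expandedLinks` is a hypothetical helper name.

```php
<?php
// Hypothetical markup: an assumption about how the competition tree might look.
$sample = <<<HTML
<div class="block_competition_left_tree-wrapper">
  <ul>
    <li class="expanded odd">
      <a href="/national/italy/serie-a/c13/">Serie A</a>
      <ul>
        <li><a href="/national/italy/serie-a/20152016/s11663/">2015/2016</a></li>
      </ul>
    </li>
    <li class="even"><a href="/national/italy/serie-b/c14/">Serie B</a></li>
  </ul>
</div>
HTML;

// Hypothetical helper: returns the hrefs nested inside <li class="... expanded ...">.
function expandedLinks(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match "expanded" as a whole class token, then take only the links
    // inside the nested <ul>, which skips the competition's own link.
    $query = '//div[contains(@class, "block_competition_left_tree-wrapper")]'
           . '//li[contains(concat(" ", normalize-space(@class), " "), " expanded ")]'
           . '/ul//a[@href]';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

print_r(expandedLinks($sample));
```

If the real page nests its lists differently, the `/ul//a[@href]` part of the query would need adjusting. Matching the class as a whole token avoids accidental hits on any class name that merely contains the text "expanded", which a plain `preg_match("/expanded/i", ...)` would also accept.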