Unique elements in an array

Question. Votes: 0, Answers: 2

I have a feed that I pull data from, and sometimes it contains very similar entries. https://dl4.joxi.net/drive/2020/01/17/0028/2950/1842054/54/5abb738180.jpg

I want to make sure the array contains only sufficiently unique entries (judged by title).

Code:


$new = array();
$goodFeed = array();

$itemlimit=0;
$itemlimit2=0;


foreach ($feed->get_items() as $item) {
    if ($itemlimit==50) { break; };
    $new[] = strtolower(trim($item->get_title()));
    $itemlimit = $itemlimit + 1;
}

foreach ($feed->get_items() as $item) {
    if ($itemlimit2==50) { break; };
    $itemTitle = strtolower(trim($item->get_title()));

    foreach($new as $item2) {
        similar_text($item2, $itemTitle, $percent);

        if ($percent < 78 && !in_array($item, $goodFeed)) {
                $goodFeed[] = $item;
                echo 'added: ' . $item->get_title() . '<br>Procent: ' . $percent . '<hr>';

        }
    }

    $itemlimit2 = $itemlimit2 + 1;
}

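I only want unique values (at least 80%) to remain in the $goodFeed array. Right now it contains elements that are very similar to each other. The original feed has items with these titles:
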
1. Metro Redux on Nintendo Switch™ Announce Trailer; 
2. Metro Redux on Nintendo Switch™ Announce Trailer [NA]; 
3. Metro Redux für Nintendo Switch™ Ankündigungs-Trailer [DE]; 
4. Metro Redux on Nintendo Switch™ Announce Trailer [ANZ]; 
5. The Elder Scrolls Online: The Dark Heart of Skyrim Announcement Cinematic;
6. The Elder Scrolls Online - The Dark Heart of Skyrim Cinematic Announcement Trailer

They all end up in $goodFeed, but I only need these:

1. Metro Redux on Nintendo Switch™ Announce Trailer
5. The Elder Scrolls Online: The Dark Heart of Skyrim Announcement Cinematic 

Thanks!

php arrays loops similarity
2 Answers
0
votes
I have not tested this, but I think one of these should work for you.

foreach ($feed->get_items() as $item) { 
    if (!in_array(strtolower(trim($item->get_title())), $new)) {
        if ($itemlimit==50) { break; };
        $new[] = strtolower(trim($item->get_title()));
        $goodFeed[] = $item;
        $itemlimit = $itemlimit + 1;
    }
}

-------OR-------

foreach ($feed->get_items() as $item) { 
    if (!in_array(strtolower(trim($item->get_title())), $new)) {
        if(count($new)>0){
            $percent=0;
            foreach($new as $n){
                similar_text($n, strtolower(trim($item->get_title())), $percent);
                if($percent>78){
                    break;
                }
            }
            if($percent>78){
                    continue;
            }

            if ($itemlimit==50) { break; };
            $new[] = strtolower(trim($item->get_title()));
            $goodFeed[] = $item;
            $itemlimit = $itemlimit + 1;
        }
        else{
            $new[] = strtolower(trim($item->get_title()));
            $goodFeed[] = $item;
            $itemlimit = $itemlimit + 1;
        }
    }
}
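
The second variant boils down to a greedy filter: keep a title only if it is not too similar to any title already kept. Below is a minimal, self-contained sketch of that idea, assuming plain title strings instead of feed item objects. The function name `filterSimilarTitles` and the `$threshold` parameter are hypothetical names chosen for illustration; only `similar_text`, `strtolower` and `trim` are PHP built-ins.

```php
<?php
/**
 * Greedy near-duplicate filter (illustrative sketch, not tested against a real feed).
 * A title is kept only if its similarity to every previously kept title
 * is at or below $threshold (a percentage between 0 and 100).
 */
function filterSimilarTitles(array $titles, float $threshold = 78.0): array
{
    $kept = [];           // original titles that survived the filter
    $keptNormalized = []; // lowercased, trimmed copies used for comparison

    foreach ($titles as $title) {
        $normalized = strtolower(trim($title));
        $isDuplicate = false;

        foreach ($keptNormalized as $seen) {
            // similar_text() writes the similarity percentage into $percent
            similar_text($seen, $normalized, $percent);
            if ($percent > $threshold) {
                $isDuplicate = true; // too close to something we already have
                break;
            }
        }

        if (!$isDuplicate) {
            $kept[] = $title;
            $keptNormalized[] = $normalized;
        }
    }

    return $kept;
}

// Usage with some of the titles from the question
$titles = [
    'Metro Redux on Nintendo Switch™ Announce Trailer',
    'Metro Redux on Nintendo Switch™ Announce Trailer [NA]',
    'Metro Redux on Nintendo Switch™ Announce Trailer [ANZ]',
    'The Elder Scrolls Online: The Dark Heart of Skyrim Announcement Cinematic',
];
print_r(filterSimilarTitles($titles));
```

The key difference from the code in the question is that an item is rejected as soon as it resembles any kept title, rather than accepted as soon as it differs from any title in the list. Titles that sit close to the threshold may fall on either side, so the cutoff probably needs tuning against the real feed.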

0
votes

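The problem was that the parser was not delivering the feed correctly. I reworked the array structure, and now it works properly. I also took some ideas from here: Similarity algorithm advice, using two dimensional associative array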

If anyone knows of a good, still-maintained RSS parser (Node.js, PHP) that can merge several feeds into one, I would be grateful for a link.
