array_diff() does not return the correct filtered result

Question. Votes: 0, answers: 2

I have two arrays in my code, as shown below:

$kalimat = "I just want to search something like visual odometry, dude";
$kata = array();
$eliminasi = " \n . ,;:-()?!";
$tokenizing = strtok($kalimat, $eliminasi);

while ($tokenizing !== false) {
    $kata[] = $tokenizing;
    $tokenizing = strtok($eliminasi);
}
$sumkata = count($kata);
print "<pre>";
print_r($kata);
print "</pre>";


//stop list
$file = fopen("stoplist.txt","r") or die("fail to open file");
$stoplist = array();
$i = 0;
while($row = fgets($file)){
    $data = explode(",", $row);
    $stoplist[$i] = $data;
    $i++;
}
fclose($file);
$count = count($stoplist);

//Change the 2-dimensional array into a 1-dimensional one
for($i=0;$i<$count;$i++){
    for($j=0; $j<1; $j++){
        $stopword[$i] = $stoplist[$i][$j];
    }
}   

//Filtering process
$hasilfilter = array_diff($kata,$stopword);
var_dump($hasilfilter);

$stopword contains a number of stop words, like the list at http://xpo6.com/list-of-english-stop-words/

What I want to do is keep the elements that exist in the array $kata and do not exist in the array $stopword.

So I want to remove from $kata every element that also exists in $stopword. I have read some suggestions to use array_diff(), but somehow it does not work for me.

php arrays filter array-difference
2 Answers

0 votes

array_diff is exactly what you need, you are right. Here is a simplified version of what you are trying to do:

<?php

// Your string $kalimat as an array of words, this already works in your example.
$kata = ['I', 'just', 'want', 'to', '...'];

// I can't test $stopword code, because I don't have your file.
// So let's say it's an array with the word 'just'
$stopword = ['just'];

// array_diff gives you what you want 
var_dump(array_diff($kata,$stopword));

// It will display your array minus "just": ['I', 'want', 'to', '...']

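You should also double-check the value of $stopword. I could not test that part because I don't have your file. If the filtering does not work for you, my guess is that the problem lies in that variable ($stopword).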
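One way to do that check is sketched below. Since the real stoplist.txt is not shown, this assumes the file holds several comma-separated words per line. The two strings in $rows are made-up stand-ins for what fgets() would return. Under that assumption, the flattening loop from the question keeps only the first word of each line, and the last word of every line still carries its trailing newline.

```php
<?php
// Hypothetical file content: two lines as fgets() would return them,
// each still ending in "\n". The real stoplist.txt may look different.
$rows = ["a,just,to\n", "like,dude\n"];

$stoplist = [];
foreach ($rows as $row) {
    $stoplist[] = explode(",", $row);
}

// Same flattening as in the question: only index 0 of each row is kept.
$stopword = [];
foreach ($stoplist as $i => $data) {
    $stopword[$i] = $data[0];
}

// var_dump() prints string lengths, so stray whitespace is easy to spot.
var_dump($stopword);     // only "a" and "like" made it into the list
var_dump($stoplist[0]);  // the last element is "to\n" (length 3), not "to"

$kata = ['I', 'just', 'want', 'to', 'search'];
$hasilfilter = array_diff($kata, $stopword);
// Nothing is removed here, because neither "just" nor "to" is in $stopword.
```

If the dump of your own $stopword looks like this, the array is what needs fixing, not the call to array_diff.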


0 votes

Your $stopword array is the problem. Use var_dump on it to see the issue. array_diff itself works correctly.
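As a minimal sketch of why the content of $stopword matters: array_diff() treats two elements as equal only when their string representations are identical, so a stray newline or a difference in letter case is enough for a word to survive the filter. The sample words below are made up for illustration.

```php
<?php
$kata = ['I', 'just', 'want', 'to'];

// Illustrative stop words: "just" carries a trailing newline, "i" is lower case.
$stopword = ["just\n", 'i', 'to'];

// Only "to" matches exactly, so only "to" is removed.
$raw = array_diff($kata, $stopword);

// Trimming the stop words makes "just" match as well.
$trimmed = array_diff($kata, array_map('trim', $stopword));

// Optional, and an assumption about what is wanted: compare case-insensitively,
// so that "I" is also treated as the stop word "i".
$caseInsensitive = array_udiff($kata, array_map('trim', $stopword), 'strcasecmp');

print_r($raw);
print_r($trimmed);
print_r($caseInsensitive);
```

Note that array_diff() and array_udiff() both preserve the original keys of $kata, so the results can have gaps in their indexes.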

Try the following code, which I wrote to build your $stopword array correctly:

 <?php

    $kalimat = "I just want to search something like visual odometry, dude";
    $kata = array();
    $eliminasi = " \n . ,;:-()?!";
    $tokenizing = strtok($kalimat, $eliminasi);

    while ($tokenizing !== false) {
        $kata[] = $tokenizing;
        $tokenizing = strtok($eliminasi);
    }
    $sumkata = count($kata);
    print "<pre>";
    print_r($kata);
    print "</pre>";

    //stop list
    $file = fopen("stoplist.txt","r") or die("fail to open file");
    $stoplist = array();
    $i = 0;
    while($row = fgets($file)){
        $data = explode(",", $row);
        $stoplist[$i] = $data;
        $i++;
    }
    fclose($file);
    $count = count($stoplist);
    //Change the 2-dimensional array into a 1-dimensional one
    $stopword= call_user_func_array('array_merge', $stoplist);
    $new = array();
    foreach($stopword as $st){
        $new[] = explode(' ', $st);
    }
    $new2= call_user_func_array('array_merge', $new);
    foreach($new2 as &$n){
        $n = trim($n);
    }
    unset($n); // break the reference left over from the foreach
    $new3 = array_unique($new2);
    unset($stopword,$new,$new2);
    $stopword = $new3;
    unset($new3);

    //Filtering process
    $hasilfilter = array_diff($kata,$stopword);
    print "<pre>";
    var_dump($hasilfilter);
    print "</pre>";
    ?>
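The same clean-up can be written more compactly. This is only a sketch, assuming the words in stoplist.txt are separated by commas, spaces, or line breaks. A temporary file with made-up content stands in for the real one, and $path and $content are names introduced here for illustration.

```php
<?php
// Stand-in for stoplist.txt so the sketch runs on its own; the real file's
// layout is an assumption (words separated by commas, spaces, or newlines).
$path = tempnam(sys_get_temp_dir(), 'stop');
file_put_contents($path, "a, just ,to\nlike,dude\n");

// Read the whole file and split on any run of commas and whitespace.
// PREG_SPLIT_NO_EMPTY drops the empty pieces, so no trim() pass is needed.
$content  = file_get_contents($path);
$stopword = preg_split('/[\s,]+/', $content, -1, PREG_SPLIT_NO_EMPTY);
$stopword = array_unique($stopword);
unlink($path);

$kata = ['I', 'just', 'want', 'to', 'search', 'something',
         'like', 'visual', 'odometry', 'dude'];

// array_values() renumbers the keys, which array_diff() otherwise preserves.
$hasilfilter = array_values(array_diff($kata, $stopword));
print_r($hasilfilter);
```

Splitting on a character class handles the line breaks and the separators in one step, which is what the two array_merge passes and the trim loop above achieve together.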

Hope this helps.
