Lucene suggestions (behavior of the SUGGEST_MORE_POPULAR flag)

Question (votes: 0, answers: 1)

I want to use Lucene's suggestion mechanism to help end users notice when they have made a spelling mistake.

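Lucene's SpellChecker has a method, suggestSimilar, that takes a SuggestMode flag. With the flag SuggestMode.SUGGEST_MORE_POPULAR, I expect to be offered only words that occur more often in the current directory than the word I looked up.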

The following code does not seem to behave that way:

import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.search.spell.SuggestMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;
import java.util.LinkedList;
import java.util.List;

public class SuggestTest {

    static public void main(String args[]) throws IOException {

        final String NAME_FIELD = "NAME";

        Directory directory = new RAMDirectory();
        IndexWriter writer = new IndexWriter(directory,
                new IndexWriterConfig(new SimpleAnalyzer()));
        writer.deleteAll();
        writer.commit();

        List<String> list = new LinkedList<>();

        for (int i = 0; i < 1000; i++)
            list.add("wafa");

        list.add("waffa");

        for (String name : list) {
            Document doc = new Document();
            doc.add(new TextField(NAME_FIELD, name, Field.Store.YES));
            writer.addDocument(doc);
        }

        writer.close();
        DirectoryReader directoryReader = DirectoryReader.open(directory);


        LuceneDictionary nameDictionary = new LuceneDictionary(directoryReader, NAME_FIELD);

        IndexWriterConfig config = new IndexWriterConfig(new SimpleAnalyzer());

        SpellChecker spellChecker = new SpellChecker(directory);
        spellChecker.indexDictionary(nameDictionary, config, true);

        for (String s : new String[]{"wafa", "waffa", "wala"}) {
            String suggestions[] = spellChecker.suggestSimilar(s, 10, null, null, SuggestMode.SUGGEST_MORE_POPULAR);
            System.out.println("Suggestions for " + s);
            for (String suggestion : suggestions)
                System.out.println(" -" + suggestion);
        }
    }
}

When I look up wafa (which appears 1000 times in the directory!), I would not expect the code above to suggest waffa, which appears only once.

java lucene
1 answer
0 votes

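You need to adjust the call so that SUGGEST_MORE_POPULAR mode can actually take effect. Your code passes null for both the index reader and the field name, so the spell checker has no term frequencies from your index to compare and cannot filter suggestions by popularity. Pass the reader and the field instead: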

String suggestions[] = spellChecker.suggestSimilar(s, 10, directoryReader, NAME_FIELD, SuggestMode.SUGGEST_MORE_POPULAR);
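
To make the effect of the reader and field arguments concrete, here is a minimal sketch of the frequency comparison that SUGGEST_MORE_POPULAR is documented to perform: a candidate is kept only if it occurs in the field and is at least as frequent as the word that was looked up. This is a simplified model, not Lucene's implementation; the class PopularityFilterSketch, the method morePopular, and the docFreq map (standing in for the per-term document frequencies read from the index) are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified model only: this is NOT Lucene's SpellChecker code.
class PopularityFilterSketch {

    // Keeps candidates that exist in the index and are at least as frequent
    // as the queried word. docFreq is a hypothetical stand-in for the
    // document frequencies Lucene would read through the IndexReader.
    static List<String> morePopular(String word, List<String> candidates,
                                    Map<String, Integer> docFreq) {
        int goalFreq = docFreq.getOrDefault(word, 0);
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            int freq = docFreq.getOrDefault(candidate, 0);
            // Skip the word itself, unknown words, and less frequent words.
            if (candidate.equals(word) || freq < 1 || freq < goalFreq) {
                continue;
            }
            kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same data as in the question: "wafa" 1000 times, "waffa" once.
        Map<String, Integer> docFreq = Map.of("wafa", 1000, "waffa", 1);
        System.out.println(morePopular("wafa", List.of("waffa"), docFreq));         // prints []
        System.out.println(morePopular("waffa", List.of("wafa"), docFreq));         // prints [wafa]
        System.out.println(morePopular("wala", List.of("wafa", "waffa"), docFreq)); // prints [wafa, waffa]
    }
}
```

Under this model, wafa gets no suggestions because nothing similar is more frequent, waffa is corrected to wafa, and the unknown word wala is offered both. Without frequencies (the null reader and field in the question) no such comparison is possible, which is consistent with waffa being suggested for wafa.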