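I want to implement autocomplete with the Solr SpellChecker, similar to the way Google does it. For example, if I type "chocol", I want to get suggestions such as "chocolate", "chocolissimo", "chocolate cake", and so on. In other words, Solr should append several characters to the typed term.
This is my Solr configuration: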
{
  "searchComponent": {
    "name": "spellcheckXXX",
    "class": "solr.SpellCheckComponent",
    "queryAnalyzerFieldType": "text_general",
    "spellchecker": {
      "name": "default",
      "field": "multi_term_lowercase_suggestion",
      "classname": "solr.DirectSolrSpellChecker",
      "distanceMeasure": "internal",
      "maxEdits": 1,
      "minPrefix": 1,
      "minQueryLength": 3,
      "combineWords": "true",
      "comparatorClass": "freq"
    }
  }
}
{
  "requestHandler": {
    "name": "/spellcheckXXX",
    "class": "solr.SearchHandler",
    "startup": "lazy",
    "defaults": {
      "spellcheck": "true",
      "spellcheck.dictionary": "default",
      "spellcheck.extendedResults": "true",
      "spellcheck.count": "50",
      "spellcheck.alternativeTermCount": "2",
      "spellcheck.maxResultsForSuggest": "50",
      "spellcheck.collate": "true",
      "spellcheck.collateExtendedResults": "true",
      "spellcheck.maxCollationTries": "100",
      "spellcheck.maxCollations": "50",
      "spellcheck.onlyMorePopular": "true",
      "rows": 0,
      "df": "multi_term_lowercase_suggestion"
    },
    "last-components": ["spellcheckXXX"]
  }
}
<field name="multi_term_lowercase_suggestion" type="multi_term_lowercase_suggestion_text" indexed="true" stored="true"/>
<fieldType name="multi_term_lowercase_suggestion_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
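My problem is that with this configuration I only get suggestions that add a single character to the term, or that replace one character in the term with another. That is not really prediction; it is more like spelling correction.
So, if the term is "Schokol" (German: "Schokolade" means chocolate), the result is (.../spellcheckCLN?q=Schokol):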
{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "numFoundExact": true,
    "docs": []
  },
  "spellcheck": {
    "suggestions": [
      "schokol",
      {
        "numFound": 2,
        "startOffset": 0,
        "endOffset": 7,
        "origFreq": 0,
        "suggestion": [
          {
            "word": "school",
            "freq": 41
          },
          {
            "word": "schoko",
            "freq": 13
          }
        ]
      }
    ],
    "correctlySpelled": false,
    "collations": [
      "collation",
      {
        "collationQuery": "school",
        "hits": 21,
        "misspellingsAndCorrections": [
          "schokol",
          "school"
        ]
      },
      "collation",
      {
        "collationQuery": "schoko",
        "hits": 9,
        "misspellingsAndCorrections": [
          "schokol",
          "schoko"
        ]
      }
    ]
  }
}
If the term is "Schokolad" (.../spellcheckCLN?q=Schokolad):
{
  "responseHeader": {
    "status": 0,
    "QTime": 1
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "numFoundExact": true,
    "docs": []
  },
  "spellcheck": {
    "suggestions": [
      "schokolad",
      {
        "numFound": 1,
        "startOffset": 0,
        "endOffset": 9,
        "origFreq": 0,
        "suggestion": [
          {
            "word": "schokolade",
            "freq": 34
          }
        ]
      }
    ],
    "correctlySpelled": false,
    "collations": [
      "collation",
      {
        "collationQuery": "schokolade",
        "hits": 25,
        "misspellingsAndCorrections": [
          "schokolad",
          "schokolade"
        ]
      }
    ]
  }
}
So a result for "Schokolade" does exist, but it is not suggested when the typed term is more than one character shorter. What do I have to change?
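To make the behaviour concrete, here is a minimal Python sketch (not Solr code) of what I believe is happening. It assumes plain Levenshtein distance as a stand-in for the "internal" distance measure, ignores the minPrefix setting, and uses a toy dictionary built only from the terms and frequencies in the responses above. The names `levenshtein`, `fuzzy_matches` and `prefix_matches` are hypothetical helpers for illustration, not Solr APIs.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Toy dictionary (assumption): only the terms seen in the responses above.
TERMS = {"school": 41, "schoko": 13, "schokolade": 34}


def fuzzy_matches(query, max_edits):
    """Spellcheck-style lookup: terms within max_edits, most frequent first."""
    hits = [t for t in TERMS if 0 < levenshtein(query, t) <= max_edits]
    return sorted(hits, key=lambda t: -TERMS[t])


def prefix_matches(query):
    """Autocomplete-style lookup: terms starting with query, most frequent first."""
    hits = [t for t in TERMS if t.startswith(query) and t != query]
    return sorted(hits, key=lambda t: -TERMS[t])


print(fuzzy_matches("schokol", 1))           # ['school', 'schoko']
print(prefix_matches("schokol"))             # ['schokolade']
print(levenshtein("schokol", "schokolade"))  # 3
```

With an edit limit of 1, "schokolade" is three insertions away from "schokol" and so can never be returned, whereas a prefix lookup finds it immediately. That matches the two responses above, and it suggests that what I actually want is prefix-based completion rather than edit-distance correction.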
I have found the answer to my question, so I am closing it.