Count occurrences of a whole word in a string

Question  Votes: 1  Answers: 7

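I want to find the number of occurrences of a particular word in a string.

I have searched online and found many answers,

but none of them gives me an accurate result.

What I want is:

Input: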

I have asked the question in StackOverflow. Therefore i can expect answer here.

Output for the keyword "The":

The keyword count: 2

Note: the "The" inside "Therefore" should not be counted.

Basically, I want to match whole words only and get the count.

c# asp.net
7 Answers
4
votes

Try this:

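var searchText = "the";
var input = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
// Split on spaces and periods, dropping the empty entries produced by ". "
var arr = input.Split(new[] { ' ', '.' }, StringSplitOptions.RemoveEmptyEntries);
// Count the tokens equal to the search word, ignoring case
var count = Array.FindAll(arr, s => s.Equals(searchText, StringComparison.OrdinalIgnoreCase)).Length;
Console.WriteLine(count);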

DOTNETFIDDLE

Edit

For a search phrase of several words:

var sentence = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
var searchText = "have asked";
char[] split = new char[] { ',', ' ', '.' };
// Lower-case both strings and drop empty tokens so that ". " does not break a phrase
var splitSentence = sentence.ToLower().Split(split, StringSplitOptions.RemoveEmptyEntries);
var splitText = searchText.ToLower().Split(split, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine("Search Sentence {0}", splitSentence.Length);
Console.WriteLine("Search Text {0}", splitText.Length);
var count = 0;
// Stop while the whole phrase still fits, so the inner loop never reads past the end of the array
for (var i = 0; i + splitText.Length <= splitSentence.Length; i++)
{
    var found = splitText.Length > 0;
    for (var j = 0; j < splitText.Length; j++)
    {
        if (splitSentence[i + j] != splitText[j])
        {
            found = false;
            break;
        }
    }
    if (found)
    {
        count++;
        i += splitText.Length - 1; // skip the matched words so matches do not overlap
    }
}
Console.WriteLine("Total found {0} substring", count);

DOTNETFIDDLE


2
votes

A possible solution is to use Regex:

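// The verbatim string keeps \b as a regex word boundary; in a regular string "\b" is a backspace character
var count = Regex.Matches(input.ToLower(), String.Format(@"\b{0}\b", "the")).Count;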
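
If the search word comes from user input rather than a literal, it may contain characters that have a special meaning in a regular expression. A minimal sketch, assuming a hypothetical helper named `CountWholeWord` (the class and method names are illustrative, not part of the answer above), escapes the word with `Regex.Escape` before wrapping it in `\b` anchors:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordRegex
{
    // Hypothetical helper: counts case-insensitive whole-word matches of `word` in `input`.
    public static int CountWholeWord(string input, string word)
    {
        // Regex.Escape neutralises metacharacters such as '.' or '+' inside the word.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

Called with the sample sentence from the question and "the", this counts the standalone "the" and skips the one inside "Therefore". Because `\b` only matches where a word character meets a non-word character, the sketch assumes the search word begins and ends with a letter, digit, or underscore.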

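0
votes

You can use a while loop: search for the index of the first occurrence, then search again starting one position past the index you found, incrementing a counter at the end of each iteration. The loop runs until the index == -1.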
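
The loop described above finds every occurrence, including those inside longer words, so a whole-word count also needs a boundary check on each hit. A minimal sketch of that idea, assuming a hypothetical helper `CountWholeWord` and treating any character that is not a letter or digit as a word boundary (both are assumptions, not part of the answer):

```csharp
using System;

public static class WholeWordIndexOf
{
    // Hypothetical helper: counts occurrences of `word` that are not
    // directly preceded or followed by a letter or digit.
    public static int CountWholeWord(string input, string word)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(word))
        {
            return 0;
        }

        int count = 0;
        int index = input.IndexOf(word, StringComparison.OrdinalIgnoreCase);
        while (index != -1)
        {
            int end = index + word.Length;
            // A hit counts only if both neighbours are boundaries or string ends.
            bool startOk = index == 0 || !char.IsLetterOrDigit(input[index - 1]);
            bool endOk = end == input.Length || !char.IsLetterOrDigit(input[end]);
            if (startOk && endOk)
            {
                count++;
            }
            // Resume the search one character past the current hit.
            index = input.IndexOf(word, index + 1, StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }
}
```

With the sample sentence and "the", the hit inside "Therefore" is found by `IndexOf` but rejected because the following character is a letter.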


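0
votes

Well, the problem is not as simple as you might think; there are several things to take care of, such as punctuation, letter case, and how to identify word boundaries. However, using the N_Gram concept, I offer the following solution:

1- Determine how many words are in the key. Call it N.

2- Extract all sequences of N consecutive words (N_Grams) from the text.

3- Count the occurrences of the key among the N_Grams.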

    // Requires: using System.Collections.Generic; using System.Linq;
    string text = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
    string key = "the question";
    int gram = key.Split(' ').Length;
    var parts = text.Split(' ');
    List<string> n_grams = new List<string>();
    // Slide a window of `gram` words across the text
    for (int i = 0; i <= parts.Length - gram; i++)
    {
        // Join the next `gram` words with single spaces
        n_grams.Add(string.Join(" ", parts, i, gram));
    }

    // The result
    int count = n_grams.Count(p => p == key);

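For example, for key = the question, and treating a single space as the word boundary, the following bi-grams are extracted:

I have, have asked, asked the, the question, question in, in StackOverflow., StackOverflow. Therefore, Therefore i, i can, can expect, expect answer, answer here.

And the number of times the question appears in the text is: 1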


0
votes

This solution should work wherever the word appears in the string:

var str = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
var numMatches = Regex.Matches(str.ToUpper(), "THE")
    .Cast<Match>()
    .Count(match => 
        (match.Index == 0 || str[match.Index - 1] == ' ') && 
        (match.Index + match.Length == str.Length || 
            !Regex.IsMatch(
                str[match.Index + match.Length].ToString(),
                "[a-zA-Z]")));

.NET Fiddle


0
votes

string input = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
string pattern = @"\bthe\b";
var matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
Console.WriteLine(matches.Count);

See Regex Anchors: "\b".


0
votes

Try this:

string Text = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
Text = Text.ToLower();
Dictionary<string, int> frequencies = new Dictionary<string, int>();
string[] words = Regex.Split(Text, "\\W+");
foreach (string word in words)
{
    // Regex.Split yields an empty string after the final "."; skip it
    if (word.Length == 0)
    {
        continue;
    }
    if (frequencies.ContainsKey(word))
    {
        frequencies[word] += 1;
    }
    else
    {
        frequencies[word] = 1;
    }
}

foreach (KeyValuePair<string, int> entry in frequencies)
{
    Response.Write(entry.Key + "," + entry.Value + "<br/>");
}

To search for a specific word, try this:

string Text = "I have asked the question in StackOverflow. Therefore the i can expect answer here.";
Text = Text.ToLower();
string searchtext = "the";
searchtext = searchtext.ToLower();
int count = 0; // number of whole-word matches
string[] words = Regex.Split(Text, "\\W+");
foreach (string word in words)
{
    if (searchtext.Equals(word))
    {
        count = count + 1;
    }
}
Response.Write(count);