Search a text file and count occurrences of a specified string

Question · Votes: 0 · Answers: 3

I'm trying to use Python to search a text file and count how many times a user-defined word appears. However, when I run the code below, instead of the total number of times that word appears in the file, I get the number of lines that contain the word.

Example: the word "bob" appears 56 times in the text file, across 19 of the file's 63 lines. When I run the code, the console prints "19".

I'm guessing I need to do something different with my split method?

user_search_value = raw_input("Enter the value or string to search for: ")

count = 0    

with open('file.txt', 'r') as f:
    for word in f.readlines():
        words = word.lower().split()
        if user_search_value in words:
            count += 1
    print(count)
python string-matching
3 Answers
0 votes

One way to do this is to loop over the words after splitting each line, incrementing `count` for each matching word:

user_search_value = raw_input("Enter the value or string to search for: ")

count = 0    

with open('file.txt', 'r') as f:
    for line in f.readlines():
        words = line.lower().split()
        for word in words:
            if word == user_search_value:
                count += 1
print count
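The answer above targets Python 2 (`raw_input`, `print count`). A Python 3 sketch of the same idea, using `list.count` for the per-line tally (the function name and file path are my own, not from the question):

```python
# Count exact, case-insensitive occurrences of a word in a text file.
# list.count gives the number of matches on each split line.
def count_word(path, target):
    target = target.lower()
    total = 0
    with open(path) as f:
        for line in f:
            total += line.lower().split().count(target)
    return total
```

Because `split()` tokenizes on whitespace and `count` compares whole tokens, a search for "bob" would not match "bobby", which is usually what the asker wants here.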

0 votes

If the "specified string" is a phrase containing spaces, the following works:

#!/usr/bin/python

import sys

def count_words_in_file(filepath, words, action=None):
    with open(filepath) as f:
        data = f.read()
        for key,val in words.items():
            print "key is " + key + "\n"
            ct = data.count(key)
            words[key] = ct
        if action:
             action(filepath, words)


def print_summary(filepath, words):
    print(filepath)
    for key,val in sorted(words.items()):
        print('{0}:\t{1}'.format(
            key,
            val))


filepath = sys.argv[1]
keys = ["Hello how are you",
        "Another phrase with spaces",
        "A phrase with spaces and some punctuation."]
words = dict.fromkeys(keys, 0)

count_words_in_file(filepath, words, action=print_summary)
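One caveat with `str.count`: it also matches substrings, so counting "cat" would include hits inside "category". If that matters, a sketch using `re` word boundaries (the function name is illustrative, not from the answer):

```python
import re

# Count whole-phrase occurrences; \b anchors stop "cat" from
# matching inside "category" or "cats".
def count_phrase(text, phrase):
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text))
```

`re.escape` keeps punctuation in the phrase (like the trailing period in the third key above) from being treated as regex syntax.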

0 votes

I figured it out. My code is below.

# read the file
f = open(filename, "r")
lines = f.readlines()
f.close()

count = 0
# look for matches
for line in lines:
    line = line.strip().lower().split()
    for words in line:
        if words.find(user_search_value.lower()) != -1:
            count += 1
print("\nYour search value of '%s' appears %s times in this file" % (user_search_value, count))
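Note that `words.find(...)` matches substrings, so a search for "bob" would also count "bobby". If exact whole-word counts are wanted, a Python 3 sketch using `collections.Counter` (function name is my own):

```python
from collections import Counter

# Tally every whitespace-separated token, lowercased, then look up
# the exact target word.
def exact_count(lines, target):
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts[target.lower()]
```

`Counter` also makes it cheap to report counts for several search words from a single pass over the file.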