How many times does each name appear in the messages? (Python 2.x)

Question

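text.txt looks like this:

Friday Aug 8
name fred @ all hello, how are you?
read by george hannah irene
Friday Aug 8
name george @fred fine so far
read by fred hannah irene
Friday Aug 8
name hannah @fred a little tired
read by fred george irene
Saturday Aug 9
name irene @all what do you guys want to do on the weekend
read by fred george hannah
Saturday Aug 9
name fred @irene I want to go skating
read by george irene
Saturday Aug 9
name irene @fred lets go
read by fred george
Saturday Aug 9
name fred @ ....

etc. ...... there are more messages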

So far I have this part of the code:

Input

fhand = open('text.txt')

for line in fhand:
    line = line.rstrip()
    if not line.startswith('name '): continue
    words = line.split()
    output_name = words[1]
# which will give me just the BOLD names

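But how do I continue and finish the code so that I can collect these output names together, so that printing them gives a list?

Desired output

['fred', 'george', 'hannah', 'irene', 'fred', 'irene', 'etc..']

Duplicate names should be kept. append gives me a list without the duplicate names.

How do I get all the output names into one list? I am not sure how to define a list from the names I generate.

My end goal is to find all the names and count how many times each one occurs in text.txt. I was thinking of making a list of names and then counting them, but I am not sure how to create that list. I don't want to count the @name mentions, only the sender names shown in bold. In other words: how many times did each person post a message?

Desired final output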

fred: 3 # or actual number times of occurrence / count
george: 1 # or actual number times of occurrence / count
hannah: 1 # or actual number times of occurrence / count
irene: 2 # or actual number times of occurrence / count

Tried

print list(output_name) gave me

Undesired output

[ 'f', 'r', 'e', 'd']

....

which is not what I want.

Thanks in advance for your help! Please excuse my lack of proper lingo; I am still a beginner programmer in Python.

Answer 1

You can always use list.count, for example:

>>> ['fred', 'george', 'hannah', 'irene', 'fred', 'irene'].count('fred')
2

Or build a dictionary while iterating:

counter = {}
for line in fhand:
    line = line.rstrip()
    if not line.startswith('name '): continue
    words = line.split()
    output_name = words[1]
    try:
        counter[output_name] += 1
    except KeyError:
        counter[output_name] = 1

Or use Counter from the standard library's collections module:

>>> from collections import Counter
>>> Counter(['fred', 'george', 'hannah', 'irene', 'fred', 'irene'])
Counter({'fred': 2, 'irene': 2, 'george': 1, 'hannah': 1})

Finally, to print from the dict:

for name, count in counter.items():
    print("{}: {}".format(name, count))
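
Putting the pieces of this answer together, here is a minimal sketch of the whole task. The helper name count_senders and the inline sample lines are assumptions made for illustration; they are not from the question.

```python
def count_senders(lines):
    """Count how often each sender appears on lines that start with 'name '."""
    counter = {}
    for line in lines:
        line = line.rstrip()
        if not line.startswith('name '):
            continue
        sender = line.split()[1]  # second word is the sender's name
        counter[sender] = counter.get(sender, 0) + 1
    return counter


# Hypothetical sample lines in the same layout as text.txt.
sample = [
    'Friday Aug 8\n',
    'name fred @ all hello, how are you?\n',
    'read by george hannah irene\n',
    'Friday Aug 8\n',
    'name george @fred fine so far\n',
    'Saturday Aug 9\n',
    'name fred @irene I want to go skating\n',
]

counts = count_senders(sample)
for name in sorted(counts):
    print('{}: {}'.format(name, counts[name]))
```

With the real file, the same function can be given the open file object instead of the list, for example inside a with open('text.txt') as fhand: block.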

Answer 2

Use a dictionary to count the items in your list:

fhand = open('text.txt')
names = []
for line in fhand:
    line = line.rstrip()
    if not line.startswith('name '): continue
    words = line.split()
    output_name = words[1]  # just the BOLD (sender) name
    names.append(output_name)
fhand.close()

allcount = {}
for item in names:
    # dict.get returns 0 for a name that has not been seen yet
    allcount[item] = allcount.get(item, 0) + 1
print(allcount)

Answer 3

You can also do this with a regex:

import re
from collections import Counter

with open('text.txt', 'r') as f:
    data = f.read()

# raw string so that \w is passed to the regex engine unchanged
results = Counter(re.findall(r'(\w+) @', data))
for name, value in results.items():
    print('{}: {}'.format(name, value))

Output:

fred: 2
george: 1
hannah: 1
irene: 2
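
One caveat: the pattern (\w+) @ matches any word followed by " @", so a message body containing something like "meet me @ noon" would add a spurious "me". A hedged variant, assuming every sender line starts with "name " as in the question, anchors the pattern to the start of a line with re.MULTILINE. The sample string below is made up for illustration.

```python
import re
from collections import Counter

# Hypothetical data in the same layout as text.txt.
data = (
    "Friday Aug 8\n"
    "name fred @ all hello, meet me @ noon\n"
    "read by george hannah irene\n"
    "Friday Aug 8\n"
    "name george @fred fine so far\n"
    "Saturday Aug 9\n"
    "name fred @irene I want to go skating\n"
)

# With re.MULTILINE, ^ matches at the start of every line, not just the string.
senders = re.findall(r'^name\s+(\w+)', data, flags=re.MULTILINE)
results = Counter(senders)

# For comparison: the unanchored pattern also picks up 'me' from the message body.
loose = Counter(re.findall(r'(\w+) @', data))

for name, value in sorted(results.items()):
    print('{}: {}'.format(name, value))
```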

Answer 4

You need to append the names to a list:

fhand = open('text.txt')
names = [] # an empty list to hold the names
for line in fhand:
    line = line.rstrip()
    if not line.startswith('name '): continue
    words = line.split()
    names.append(words[1])
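
As for why list(output_name) printed ['f', 'r', 'e', 'd']: list() iterates over its argument, and iterating over a string yields its characters. append, by contrast, adds the whole string as a single element and does not drop duplicates. A small illustration (the variable names are arbitrary):

```python
output_name = 'fred'

as_chars = list(output_name)   # iterates the string, one element per character

names = []
names.append(output_name)      # adds the whole string as one element
names.append('irene')
names.append(output_name)      # duplicates are kept

print(as_chars)
print(names)
```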

Now the names list holds the names. To compute the frequencies, you can do the following:

import collections
freq = collections.Counter(names)

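Now freq is a Counter object, which behaves like a dictionary and holds the number of occurrences of each name. For example, freq['fred'] returns the number of times the name 'fred' occurs.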
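
A short sketch of what that looks like in practice, using a hard-coded names list in place of the one built from the file. Two Counter features are handy here: a missing key returns 0 instead of raising KeyError, and most_common() returns (name, count) pairs ordered from most to least frequent.

```python
import collections

# Hard-coded stand-in for the list built from text.txt.
names = ['fred', 'george', 'hannah', 'irene', 'fred', 'irene', 'fred']
freq = collections.Counter(names)

fred_count = freq['fred']        # number of times 'fred' appears
missing_count = freq['zelda']    # unseen names give 0, not a KeyError

for name, count in freq.most_common():
    print('{}: {}'.format(name, count))
```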

As a side note, I would suggest avoiding continue where possible; it makes the code less clear. Instead, put the work inside an if block:

fhand = open('text.txt')
names = [] # an empty list to hold the names
for line in fhand:
    line = line.rstrip()
    if line.startswith('name '):
        words = line.split()
        names.append(words[1])

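This way the code states your intent ("extract the names") more clearly.

If you now want to do something with the frequency results (such as printing them), you can iterate over the dictionary: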

for k, v in freq.items():
    print(k, v)

(Of course, you can format the result more nicely when printing.)

Answer 5

You should try creating a dictionary that stores each name together with its number of occurrences.

    from collections import defaultdict
    fhand = open('text.txt')

    name_count = defaultdict(int)
    for line in fhand:
        line = line.rstrip()
        if not line.startswith('name '): continue
        words = line.split()
        name_count[words[1]] += 1

    print(name_count)
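
defaultdict(int) works here because int() returns 0, so the first += 1 on an unseen name starts from zero without a try/except or dict.get. A minimal sketch with hard-coded lines (the sample lines are assumptions in the question's layout); dict(...) is only used to get a plain dictionary for printing.

```python
from collections import defaultdict

# Hypothetical lines in the same layout as text.txt.
lines = [
    'name irene @all what do you want to do this weekend',
    'read by fred george hannah',
    'name fred @irene I want to go skating',
    'name irene @fred lets go',
]

name_count = defaultdict(int)   # missing keys start at int() == 0
for line in lines:
    if line.startswith('name '):
        name_count[line.split()[1]] += 1

plain = dict(name_count)        # plain dict, prints without the defaultdict wrapper
print(plain)
```

Note that merely looking up a missing key on a defaultdict, such as name_count['nobody'], inserts that key with a value of 0.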
