I have multiple text files. Each file is a list of animals and their counts for one house. Like this:
houseA.txt
cats 3
dogs 1
birds 4
houseB.txt
cats 5
dogs 3
birds 1
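I have about 20 houses, and each house has about 16,000 species (so each file has about 16,000 lines). All houses have the same species; only the counts differ.
My current script goes through each file line by line and captures the house, the animal name, and its count.
I want to create a dictionary of houses, where each house is a dictionary of animals and their counts. So for the example above, the result would look like this:
dictOfDicts = {'houseA': {'cats': 3, 'dogs': 1, 'birds': 4}, 'houseB': {'cats': 5, 'dogs': 3, 'birds': 1}}
In case you are wondering, this will later be turned into a table: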
house:  A  B
animal
cats    3  5
dogs    1  3
birds   4  1
Here is my script:
#!/usr/bin/python3
import sys
houseL = []
dictList = []
with open(sys.argv[1], 'r') as files:
    for f in files:
        f = f.rstrip()
        with open(f, 'r') as aniCounts:
            house = str(aniCounts).split(sep='/')[2]  # this and the next line capture the house name from the file name.
            house = house.split('.')[0]
            houseL.append(house)
            for line in aniCounts:
                ani = line.split()[0]
                count = line.split()[1]
                #print(ani, ' ', count)
Edit: thanks to a helpful commenter, I changed the question to ask for a dict of dicts.
I would try something like this:
house_names = ['houseA', 'houseB']  # ... list the remaining houses here
houses_dict = {}
for house in house_names:
    houses_dict[house] = {}
    with open(house + '.txt') as f:
        for line in f:
            species, num = line.rsplit(maxsplit=1)  # split off rightmost word
            houses_dict[house][species] = int(num)
The result would be (for example):
houses_dict = {
    'houseA': {
        'cats': 3,
        'dogs': 1,
        'birds': 4
    },
    'houseB': {
        'cats': 5,
        'dogs': 3,
        'birds': 1
    }
    ...
}
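Once the nested dict exists, the table from the question can be derived from it. The following is only a sketch using plain Python: `to_table` is a hypothetical helper name, the tab-separated printing is just one possible layout, and filling a missing count with 0 is an assumption (the question says every house lists the same species, so it should not trigger).

```python
def to_table(houses_dict):
    """Return table rows: a header row, then one row per animal."""
    houses = sorted(houses_dict)
    # Collect animals in first-seen order; a dict keeps insertion order.
    animals = {}
    for house in houses:
        for animal in houses_dict[house]:
            animals.setdefault(animal, None)
    rows = [["animal"] + houses]
    for animal in animals:
        # Assumption: a count missing from a house is reported as 0.
        rows.append([animal] + [houses_dict[h].get(animal, 0) for h in houses])
    return rows


houses_dict = {
    "houseA": {"cats": 3, "dogs": 1, "birds": 4},
    "houseB": {"cats": 5, "dogs": 3, "birds": 1},
}
for row in to_table(houses_dict):
    print("\t".join(str(cell) for cell in row))
```

With about 20 houses and 16,000 species this stays a single pass over the data, so it should not need anything heavier than the dict of dicts itself.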
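Here is another version:
from path import Path  # third-party "path" package, not the standard-library pathlib
dir_path = '/TEMP'
files_ls = [x for x in Path(dir_path).files() if 'house' in str(x)]
def read_file(path):
    # Each row becomes an (animal, count) pair; the counts stay strings here.
    with path.open(encoding='utf-8') as f:
        lines = dict([row.strip().split(' ') for row in f])
    return lines
all_data = dict([(str(x.name), read_file(x)) for x in files_ls])
print(all_data)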
Output:
{'house1.txt': {u'birds': u'4', u'cats': u'3', u'dogs': u'1'}}
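The output above keeps the counts as strings and uses the full file name as the key. As a rough alternative, here is a sketch of the same idea with only the standard-library `pathlib`, converting the counts to `int` and using the file stem (`houseA` rather than `houseA.txt`) as the house name. The helper names `read_counts` and `read_houses` and the `house*.txt` pattern are assumptions for illustration, not part of the answer above.

```python
from pathlib import Path


def read_counts(path):
    """Parse one house file into {animal: count}."""
    counts = {}
    with path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            # Split off the rightmost word so animal names may contain spaces.
            animal, num = line.rsplit(maxsplit=1)
            counts[animal] = int(num)
    return counts


def read_houses(dir_path):
    """Map the stem of each house*.txt file in dir_path to its counts."""
    # Assumption: house files match the pattern house*.txt.
    files = sorted(Path(dir_path).glob("house*.txt"))
    return {p.stem: read_counts(p) for p in files}
```

Calling `read_houses('/TEMP')` would then return a dict of dicts keyed by house name.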
If you don't want to do the splitting yourself, use csv.DictReader, and make sure that animal names containing spaces are quoted in the file:
from csv import DictReader
d = {}
files = ["h1.csv","h2.csv"]
for f in files:
    with open(f,"r",encoding="utf8",newline="") as houseData:
        d[f] = {}  # dict per house
        for row in DictReader(houseData, fieldnames=["animal","count"], delimiter=' ' ):
            d[f][row["animal"]] = int(row["count"])  # access by given fieldnames
print(d)
Output:
{'h1.csv': {'cats': 3, 'dogs': 1, 'birds': 4},
'h2.csv': {'cats': 5, 'dogs': 3, 'birds': 1, 'insects': 2402, 'Blue Flutterwings': 2}}
File h1.csv
cats 3
dogs 1
birds 4
File h2.csv
cats 5
dogs 3
birds 1
insects 2402
"Blue Flutterwings" 2
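Caveat: if you keep Green Cantilopes or Blue Flutterwings in your house, you must quote them in the file. This is where this solution starts to shine, because it automatically handles quoted strings when ' ' is the delimiter.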
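To make the quoting behaviour concrete, here is a small sketch that parses two sample lines from an in-memory string instead of a file. The `sample` text is made up for illustration, and `csv.reader` is used in place of `DictReader` only to keep the example short.

```python
import csv
import io

# Two sample lines; the second animal name contains a space and is quoted.
sample = 'cats 5\n"Blue Flutterwings" 2\n'

counts = {}
for animal, count in csv.reader(io.StringIO(sample), delimiter=" "):
    counts[animal] = int(count)

print(counts)  # {'cats': 5, 'Blue Flutterwings': 2}

# For comparison, a plain str.split() breaks the quoted name apart.
naive = '"Blue Flutterwings" 2'.split()
print(naive)  # ['"Blue', 'Flutterwings"', '2']
```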