I have a text file copied directly from Excel, in the following format:
Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr,3-Apr,4-Apr,5-Apr...31-Apr
1,ID001,Jason Smith,IT,01-Sep-22,WO,WO,P,P,P....
2,ID002,Pearson Smith,IT,01-Sep-22,WO,P,P,WO,P....
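The symbols (WO, P) are the attendance records for each day.
I am trying to build a list of dictionaries from it. Every dictionary would share the same keys, taken from the first line of the text file, and the values on each subsequent line should be mapped to those keys in order.
It should look like this: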
line2_dictionary = {'Sr No':'1','Employee ID':'ID001','Name':'Jason Smith','Department':'IT','DOJ':'01-Sep-22','1-Apr':'WO','2-Apr':'P','3-Apr':'P'....}
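Each line of data would become a new dictionary in the format above, and the dictionaries would be collected into a list.
I tried a dictionary comprehension, but I am new to Python and could use some help. The attempt below writes the same value ('WO') to every key.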
input_data = []
with open(r'file_path','r') as input_file:
    for line in input_file:
        input_data.append(line.strip('\n'))
attendance_dict = {key: None for key in input_data[0].split(',')}
for line in input_data[1:-1]:
    attendance_dict = {k:v for k in attendance_dict for v in line.split(",")}
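The nested comprehension in the last loop walks through every value for every key, so each key ends up holding whichever value came last on the line. A minimal sketch of one way around that, assuming the lines are already in memory (the `lines` list and the `attendance` name are illustrative, and the sample is shortened to five days), pairs headers with values by position using `zip`:

```python
# Sample lines standing in for the file contents (shortened to five days).
lines = [
    "Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr,3-Apr,4-Apr,5-Apr",
    "1,ID001,Jason Smith,IT,01-Sep-22,WO,WO,P,P,P",
    "2,ID002,Pearson Smith,IT,01-Sep-22,WO,P,P,WO,P",
]

headers = lines[0].split(",")

# zip pairs the i-th header with the i-th value, so each key gets its own value.
attendance = [dict(zip(headers, line.split(","))) for line in lines[1:]]

print(attendance[0]["Name"])   # Jason Smith
print(attendance[1]["4-Apr"])  # WO
```

Note that the slice `[1:-1]` in the original loop also drops the last line of the file, whereas `[1:]` keeps every data row.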
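Fortunately, you don't need to assign the values to the dictionary field by field:
import csv

# newline="" lets the csv module handle line endings itself
with open("attendance.csv", newline="") as input_file:
    for row in csv.DictReader(input_file):
        print(row)
The result looks like this: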
{'Sr No': '1', 'Employee ID': 'ID001', 'Name': 'Jason Smith', 'Department': 'IT', 'DOJ': '01-Sep-22', '1-Apr': 'WO', '2-Apr': 'WO', '3-Apr': 'P', '4-Apr': 'P', '5-Apr': 'P'}
{'Sr No': '2', 'Employee ID': 'ID002', 'Name': 'Pearson Smith', 'Department': 'IT', 'DOJ': '01-Sep-22', '1-Apr': 'WO', '2-Apr': 'P', '3-Apr': 'P', '4-Apr': 'WO', '5-Apr': 'P'}
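Since the question asks for a list of dictionaries rather than printed rows, the reader can also be passed straight to `list()`. The following is a small sketch under the assumption that the data matches the sample above. An `io.StringIO` object stands in for the real file here, and the `records` name is illustrative:

```python
import csv
import io

# Stand-in for the real file; in practice, use open("attendance.csv", newline="").
sample = io.StringIO(
    "Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr\n"
    "1,ID001,Jason Smith,IT,01-Sep-22,WO,WO\n"
    "2,ID002,Pearson Smith,IT,01-Sep-22,WO,P\n"
)

# Each row from DictReader is keyed by the header line; list() collects them all.
records = list(csv.DictReader(sample))

print(len(records))         # 2
print(records[1]["2-Apr"])  # P
```

All values come back as strings, which matches the `'Sr No': '1'` form shown in the question.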
Try:
import pandas as pd

# to read the file into a dataframe:
df = pd.read_csv('your_data.csv')
# to put the dataframe into a list-of-dicts:
out = df.to_dict(orient="records")
print(out)
Prints:
[
    {
        "Sr No": 1,
        "Employee ID": "ID001",
        "Name": "Jason Smith",
        "Department": "IT",
        "DOJ": "01-Sep-22",
        "1-Apr": "WO",
        "2-Apr": "WO",
        "3-Apr": "P",
        "4-Apr": "P",
        "5-Apr": "P",
    },
    {
        "Sr No": 2,
        "Employee ID": "ID002",
        "Name": "Pearson Smith",
        "Department": "IT",
        "DOJ": "01-Sep-22",
        "1-Apr": "WO",
        "2-Apr": "P",
        "3-Apr": "P",
        "4-Apr": "WO",
        "5-Apr": "P",
    },
]
Initial df:
   Sr No Employee ID           Name Department        DOJ 1-Apr 2-Apr 3-Apr 4-Apr 5-Apr
0      1       ID001    Jason Smith         IT  01-Sep-22    WO    WO     P     P     P
1      2       ID002  Pearson Smith         IT  01-Sep-22    WO     P     P    WO     P
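One thing to be aware of: `read_csv` infers column types, so `Sr No` comes back as the integer `1` rather than the string `'1'` shown in the question. If string values are wanted throughout, passing `dtype=str` is one option. This is a rough sketch assuming the same sample data, with an `io.StringIO` object standing in for the real file:

```python
import io

import pandas as pd

# Stand-in for the real file; in practice, pass the path to your CSV instead.
sample = io.StringIO(
    "Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr\n"
    "1,ID001,Jason Smith,IT,01-Sep-22,WO,WO\n"
    "2,ID002,Pearson Smith,IT,01-Sep-22,WO,P\n"
)

# dtype=str keeps every column as text, so "Sr No" is not converted to an integer.
df = pd.read_csv(sample, dtype=str)
out = df.to_dict(orient="records")

print(type(out[0]["Sr No"]).__name__)  # str
```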