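How do I extract key-value pairs from a text file where the keys are on the first line and the values are on the following, vertically aligned lines?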


I have a text file copied directly from Excel, in the following format:

Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr,3-Apr,4-Apr,5-Apr...31-Apr
1,ID001,Jason Smith,IT,01-Sep-22,WO,WO,P,P,P....
2,ID002,Pearson Smith,IT,01-Sep-22,WO,P,P,WO,P.... 

The codes (WO, P) are the attendance record for each day.

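I am trying to build a list of dictionaries from it. The keys are the same for every dictionary and come from the first line of the text file; the values on each following line should be assigned to those keys in order.

It should look like this: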

line2_dictionary = {'Sr No':'1','Employee ID':'ID001','Name':'Jason Smith','Department':'IT','DOJ':'01-Sep-22','1-Apr':'WO','2-Apr':'WO','3-Apr':'P'....}

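Each data line becomes a new dictionary in the format above, and the dictionaries are collected into a list.

I tried a dictionary comprehension, but I am new to Python and need some help. The approach below writes the same value ('WO') to every key.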

input_data = []
with open(r'file_path','r') as input_file:
    for line in input_file:
        input_data.append(line.strip('\n'))
    
attendance_dict = {key: None for key in input_data[0].split(',')}
for line in input_data[1:-1]:
        attendance_dict = {k:v for k in attendance_dict for v in line.split(",")}
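
A note on why this fails: the comprehension has two nested `for` clauses, so for each key it walks through every value on the line and keeps overwriting that key, which leaves every key holding the last value on the line. Pairing each key with the value in the same position, for example with `zip`, avoids that. The following is only a minimal sketch under assumptions: the sample data is inlined in a hypothetical `sample_text` variable instead of being read from the file, and it covers only the first five dates.

```python
# Hypothetical inline sample standing in for the file contents.
sample_text = """Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr,3-Apr,4-Apr,5-Apr
1,ID001,Jason Smith,IT,01-Sep-22,WO,WO,P,P,P
2,ID002,Pearson Smith,IT,01-Sep-22,WO,P,P,WO,P"""

input_data = sample_text.splitlines()
keys = input_data[0].split(",")

# Nested "for" clauses: every key is overwritten by each value in turn,
# so all keys end up with the last value on the line.
broken = {k: v for k in keys for v in input_data[1].split(",")}

# zip pairs the i-th key with the i-th value; one dict per data line.
attendance = [dict(zip(keys, line.split(","))) for line in input_data[1:]]

print(broken["Name"])         # the last value on the line, not the name
print(attendance[0]["Name"])  # the value that sits under the "Name" header
```

Note also that `input_data[1:-1]` in the question leaves out the last line of the file, which drops the final employee unless the file ends with a blank line; `input_data[1:]` keeps it.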
2 Answers

0 votes

Fortunately, there is no need to assign the values to a dictionary field by field; csv.DictReader does it for you:

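import csv

# DictReader takes the keys from the header row and yields one dict per data row.
# The with block closes the file when the loop is done.
with open("attendance.csv", newline="") as input_file:
    for row in csv.DictReader(input_file):
        print(row)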

The result looks like this:

{'Sr No': '1', 'Employee ID': 'ID001', 'Name': 'Jason Smith', 'Department': 'IT', 'DOJ': '01-Sep-22', '1-Apr': 'WO', '2-Apr': 'WO', '3-Apr': 'P', '4-Apr': 'P', '5-Apr': 'P'}
{'Sr No': '2', 'Employee ID': 'ID002', 'Name': 'Pearson Smith', 'Department': 'IT', 'DOJ': '01-Sep-22', '1-Apr': 'WO', '2-Apr': 'P', '3-Apr': 'P', '4-Apr': 'WO', '5-Apr': 'P'}
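
Since the question asks for a list of dictionaries, the rows could also be collected with `list()` instead of being printed. A small sketch, assuming the same sample data: it uses `io.StringIO` as a stand-in for the real file so that it runs on its own, and the name `sample_text` is made up for illustration.

```python
import csv
import io

# Hypothetical inline sample standing in for attendance.csv.
sample_text = """Sr No,Employee ID,Name,Department,DOJ,1-Apr,2-Apr,3-Apr,4-Apr,5-Apr
1,ID001,Jason Smith,IT,01-Sep-22,WO,WO,P,P,P
2,ID002,Pearson Smith,IT,01-Sep-22,WO,P,P,WO,P
"""

# StringIO behaves like an open text file, so DictReader can read from it.
reader = csv.DictReader(io.StringIO(sample_text))

# Exhausting the reader with list() gives one dict per data row.
rows = list(reader)

print(reader.fieldnames[:3])  # header names taken from the first line
print(rows[1]["Name"])        # look up a value by its column name
```

Every value comes back as a string, including `Sr No`, which matches the format shown in the question.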

0 votes

Try pandas:

import pandas as pd

# read the file into a DataFrame (the header row becomes the column names)
df = pd.read_csv('your_data.csv')

# convert the DataFrame into a list of dicts, one per row
out = df.to_dict(orient="records")
print(out)

Prints:

[
    {
        "Sr No": 1,
        "Employee ID": "ID001",
        "Name": "Jason Smith",
        "Department": "IT",
        "DOJ": "01-Sep-22",
        "1-Apr": "WO",
        "2-Apr": "WO",
        "3-Apr": "P",
        "4-Apr": "P",
        "5-Apr": "P",
    },
    {
        "Sr No": 2,
        "Employee ID": "ID002",
        "Name": "Pearson Smith",
        "Department": "IT",
        "DOJ": "01-Sep-22",
        "1-Apr": "WO",
        "2-Apr": "P",
        "3-Apr": "P",
        "4-Apr": "WO",
        "5-Apr": "P",
    },
]

Initial df:

   Sr No Employee ID           Name Department        DOJ 1-Apr 2-Apr 3-Apr 4-Apr 5-Apr
0      1       ID001    Jason Smith         IT  01-Sep-22    WO    WO     P     P     P
1      2       ID002  Pearson Smith         IT  01-Sep-22    WO     P     P    WO     P