Create a file from another file, but with fewer columns

Question

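I have a file from which I need to create a new file that contains only two columns, for mapping purposes. I can't use pandas because it isn't allowed where I work, and I'm struggling to work out how to get just these two fields into the new file. My code so far is below. I know the loop is incomplete, and I'm not sure how to limit the buffer to these two fields. I'd really appreciate any advice anyone can offer. Many thanks: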

# Read CSV from S3
response = s3_client.get_object(Bucket=bucket_name, Key=input_key)
input_data = response['Body'].read().decode('utf-8')

# Process the CSV
reader = csv.DictReader(StringIO(input_data))
rows = list(reader)  # Read all rows into memory
fieldnames = reader.fieldnames  # Get the existing column names

# just get the two column values for the new file
for row in rows:
    row["file_token"]
    row["Payment reference"]

# Write the new CSV data to a string buffer
output_buffer = StringIO()
writer = csv.DictWriter(output_buffer, fieldnames=fieldnames)
writer.writeheader()  # Write the updated header
writer.writerows(rows)  # Write the updated rows
output_buffer.seek(0)  # Reset buffer pointer to the beginning


s3_client.put_object(Bucket=bucket_name, Key=output_key, Body=output_buffer.getvalue())
print(f"Updated file saved to s3://{bucket_name}/{output_key}")

Answer 1

Since you have already read the CSV and converted it into a list of rows, the following functions should do what you need.

def retrieve_columns_from_row_dict(row, column_names):
    # Keep only the requested keys; names missing from the row are skipped
    return {name: row[name] for name in column_names if name in row}

def retrieve_columns_from_csv_dict(csv_rows, column_names):
    # Apply the column filter to every row
    return [retrieve_columns_from_row_dict(row, column_names) for row in csv_rows]

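So all you need to do is call the function with the two column names you want to keep. Passing fieldnames here would keep every column, because it holds the full header of the input file:

columns = ["file_token", "Payment reference"]
filtered_rows = retrieve_columns_from_csv_dict(rows, columns)

Then pass columns as the fieldnames argument of csv.DictWriter and write filtered_rows instead of rows, so the header and the rows both contain only those two fields.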
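
As an alternative, here is a minimal sketch that lets csv.DictWriter do the filtering itself through its extrasaction="ignore" option, which drops any key not listed in fieldnames. The function name keep_columns and the sample data are made up for illustration; only the two column names come from the question. It works on plain strings, so the S3 calls from the question are left out.

```python
import csv
from io import StringIO


def keep_columns(csv_text, column_names):
    """Return CSV text that contains only the requested columns."""
    reader = csv.DictReader(StringIO(csv_text))

    # Fail early if a requested column is not in the input header.
    header = reader.fieldnames or []
    missing = [name for name in column_names if name not in header]
    if missing:
        raise KeyError(f"Columns not found in input: {missing}")

    output_buffer = StringIO()
    # extrasaction="ignore" silently drops keys that are not in fieldnames.
    writer = csv.DictWriter(
        output_buffer, fieldnames=column_names, extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(reader)  # rows are streamed, not collected in a list
    return output_buffer.getvalue()


# Hypothetical sample input with one extra column.
sample = (
    "file_token,Payment reference,amount\n"
    "tok_1,REF-001,10.00\n"
    "tok_2,REF-002,25.50\n"
)
mapping_csv = keep_columns(sample, ["file_token", "Payment reference"])
print(mapping_csv)
```

In the question's code, input_data would take the place of sample, and the returned string could be passed as Body to s3_client.put_object.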
