Merge two CSV files based on the data in the first column

Question

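I have two CSV files like the ones shown below, and I would like to merge them, using the first column, ID_, as the unique identifier and appending the AMT column as a new column in the final file.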

CSV1

ID_ CUSTOMER_ID_    EMAIL_ADDRESS_
1090    1   [email protected]
1106    2   [email protected]
1145    3   [email protected]
1206    4   [email protected]
1247    5   [email protected]
1254    6   [email protected]
1260    7   [email protected]
1361    8   [email protected]
1376    9   [email protected]

CSV2

This is what I am looking for in the final file:

ID_ CUSTOMER_ID_    EMAIL_ADDRESS_  AMT
1090    1   [email protected]    5
1106    2   [email protected]    5
1145    3   [email protected]    5
1206    4   [email protected]    5
1247    5   [email protected]    5
1254    6   [email protected]    65
1260    7   [email protected]    5
1361    8   [email protected]    10
1376    9   [email protected]    5

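I tried adapting the command below as much as I could, but I could not get the result I am looking for. I am really stuck on this and do not know what else to try. Any and all help is greatly appreciated!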

join -t, File1.csv File2.csv

The data shown in this example contains tabs, but my actual files are CSVs as described above and use commas as the delimiter.

Answer 1

This is easy to do with the pandas library. Here is the code I use for it:

'''
This program reads two csv files and merges them based on a common key column.
'''
# import the pandas library
# you can install using the following command: pip install pandas

import pandas as pd

# Read the files into two dataframes.
df1 = pd.read_csv('CSV1.csv')
df2 = pd.read_csv('CSV2.csv')

# Merge the two dataframes, using the ID_ column as key
df3 = pd.merge(df1, df2, on = 'ID_')
df3.set_index('ID_', inplace = True)

# Write it to a new CSV file
df3.to_csv('CSV3.csv')
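
One thing to be aware of: pd.merge performs an inner join by default, so any ID_ that appears in only one of the files is dropped from the result. If every row of the first file should be kept, a left join may be closer to what the question asks for. The following is a minimal sketch under assumptions: the contents of CSV2 are not shown in the question, so the two-column layout (ID_, AMT) and the in-memory sample data are hypothetical stand-ins.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for CSV1.csv and CSV2.csv.
# The layout of the second file (ID_, AMT) is an assumption.
csv1 = io.StringIO(
    "ID_,CUSTOMER_ID_\n"
    "1090,1\n"
    "1106,2\n"
    "1145,3\n"
)
csv2 = io.StringIO(
    "ID_,AMT\n"
    "1090,5\n"
    "1145,5\n"
)

df1 = pd.read_csv(csv1)
df2 = pd.read_csv(csv2)

# how="left" keeps every row of df1; an ID_ missing from df2 gets NaN in AMT.
merged = df1.merge(df2, on="ID_", how="left")
print(merged)
```

With the default inner join, the row for ID_ 1106 would not appear in the output at all; with the left join it is kept with an empty AMT.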

You can find a short tutorial on pandas here: https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
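
If installing pandas is not an option, the same idea can be sketched with Python's standard csv module. This is only an illustration, not part of the original answer: the helper name merge_on_first_column and the sample data are made up for the example, and it assumes both files have a header row, use the first column as the key, and that keys are unique in the second file.

```python
import csv
import io


def merge_on_first_column(left_rows, right_rows):
    """Append the right-hand columns (minus the key) to each left-hand row.

    Both inputs are iterables of rows (lists of strings) whose first row is
    a header and whose first column is the key. (Hypothetical helper.)
    """
    left_header, *left_body = list(left_rows)
    right_header, *right_body = list(right_rows)

    # Map each key in the right-hand file to its remaining columns.
    lookup = {row[0]: row[1:] for row in right_body if row}
    # Placeholder cells for keys that have no match on the right.
    blanks = [""] * (len(right_header) - 1)

    merged = [left_header + right_header[1:]]
    for row in left_body:
        if row:
            merged.append(row + lookup.get(row[0], blanks))
    return merged


# Hypothetical sample data standing in for the two files.
left = csv.reader(io.StringIO("ID_,CUSTOMER_ID_\n1090,1\n1106,2\n1254,6\n"))
right = csv.reader(io.StringIO("ID_,AMT\n1090,5\n1254,65\n"))
merged = merge_on_first_column(left, right)

# Write the merged rows back out as comma-separated text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(merged)
print(out.getvalue())
```

To use it on real files, the io.StringIO objects would be replaced with files opened via open(..., newline=""), as the csv module documentation recommends.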
