Get file creation date - add to DataFrame column on read_csv

Question

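I need to load many (hundreds of) CSV files into a pandas DataFrame. As each CSV file is read in, I need to add the file's creation date as a column. I can get the creation date of a CSV file with this call: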

time.strftime('%m/%d/%Y', time.gmtime(os.path.getmtime('/path/file.csv')))

For reference, this is the command I use to read the CSVs:

path1 = r'/path/'
all_files_standings = glob.glob(path1 + '/*.csv')
standings = pd.concat((pd.read_csv(f, low_memory=False, usecols=[7, 8, 9]) for f in all_files_standings))

I tried running this call (it works):

dt_gm = [time.strftime('%m/%d/%Y', time.gmtime(os.path.getmtime('/path/file.csv')))]

Then I tried to extend it:

dt_gm = [time.strftime('%m/%d/%Y', time.gmtime(os.path.getmtime(f) for f in all_files_standings))]

I get this error:

TypeError: an integer is required (got type generator)

How can I fix this?

Answer 1

If the files all have the same columns and you want to append them row-wise:

import pandas as pd
import time
import os

# list of files you want to read
files = ['one.csv', 'two.csv']

column_names = ['c_1', 'c_2', 'c_3']

all_dataframes = []
for file_name in files:
    df_temp = pd.read_csv(file_name, delimiter=',', header=None)
    df_temp.columns = column_names
    df_temp['creation_time'] = time.strftime('%m/%d/%Y', time.gmtime(os.path.getmtime(file_name)))
    df_temp['file_name'] = file_name
    all_dataframes.append(df_temp)

df = pd.concat(all_dataframes, axis=0, ignore_index=True)

df

If you want to append the files column-wise instead:

all_dataframes = []
for idx, file_name in enumerate(files):
    df_temp = pd.read_csv(file_name, delimiter=',', header=None)
    column_prefix = 'f_' + str(idx) + '_'
    df_temp.columns = [column_prefix + c for c in column_names]
    df_temp[column_prefix + 'creation_time'] = time.strftime('%m/%d/%Y', time.gmtime(os.path.getmtime(file_name)))
    all_dataframes.append(df_temp)

pd.concat(all_dataframes, axis=1)
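
On the error itself: in the question's last attempt the "for f in ..." clause sits inside the parentheses of time.gmtime(...), so gmtime receives a generator object instead of a number. Moving the "for" clause out to the list level fixes it. The sketch below shows that, plus a compact variant of the row-wise approach that keeps the question's pd.concat pattern. The helper names format_mtime and read_with_file_date and the file_date / file_name column names are made up for illustration. Note that os.path.getmtime returns the last-modification time, which is what the question's own call uses.

```python
import os
import time

import pandas as pd


def format_mtime(path, fmt='%m/%d/%Y'):
    """Format a file's last-modification time (UTC) as a date string."""
    return time.strftime(fmt, time.gmtime(os.path.getmtime(path)))


def read_with_file_date(paths, **read_csv_kwargs):
    """Read each CSV and tag every row with its file's date and name."""
    frames = [
        pd.read_csv(p, **read_csv_kwargs).assign(
            file_date=format_mtime(p),          # same value for every row of this file
            file_name=os.path.basename(p),
        )
        for p in paths
    ]
    return pd.concat(frames, ignore_index=True)


# Usage with the question's variables (the path is the question's placeholder):
#   import glob
#   all_files_standings = glob.glob(os.path.join('/path', '*.csv'))
#   dt_gm = [format_mtime(f) for f in all_files_standings]   # "for" is outside gmtime(...)
#   standings = read_with_file_date(all_files_standings, low_memory=False, usecols=[7, 8, 9])
```

DataFrame.assign broadcasts the scalar date to every row of that file's frame, so the column is added before the frames are concatenated.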

Answer 2

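I tried the following myself. os.path.getctime returns seconds since the epoch, so unit='s' is passed to pd.to_datetime (without it the number is read as nanoseconds and the date comes out as 1970):

df['Creation_Time'] = pd.to_datetime(os.path.getctime(path + filename), unit='s').strftime('%m/%d/%Y')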
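
One caveat on "creation date": on Windows os.path.getctime reports the creation time, but on Unix-like systems it reports the last metadata change. Some platforms expose a true creation time as st_birthtime on the os.stat result (macOS and the BSDs, and Windows from Python 3.12). The following is a minimal sketch that uses st_birthtime when it is present and otherwise falls back; the helper name file_creation_date and the choice of modification time as the Unix fallback are assumptions, not part of either answer.

```python
import os
from datetime import datetime, timezone


def file_creation_date(path, fmt='%m/%d/%Y'):
    """Best-effort creation date of a file, formatted in UTC."""
    st = os.stat(path)
    # st_birthtime is only present on some platforms, so probe for it.
    ts = getattr(st, 'st_birthtime', None)
    if ts is None:
        if os.name == 'nt':
            ts = st.st_ctime   # creation time on Windows
        else:
            ts = st.st_mtime   # assumed fallback: last modification time
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime(fmt)
```

The returned string can be assigned to a column the same way as in the answers above, for example df['Creation_Time'] = file_creation_date(path + filename).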
