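I'm reading a CSV file that starts with about 7 to 8 lines describing the file. I use the following code to find the header row by checking the first column: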
import csv
import glob
import os

list_of_files = glob.glob('C:/payment_reports/*csv')  # * matches everything; for a specific format use *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
line_count = None
for row in csv.reader(open(latest_file)):
    if row[0] == 'date/time':
        print(row)
        break
else:
    # for/else: reached only if the loop finished without hitting break
    print("{} not found".format('date/time'))
This gets me to the correct row, since the printed row is:
['date/time', 'settlement id', 'type', 'order id', 'sku', 'description', 'quantity', 'marketplace', 'fulfillment', 'order city', 'order state', 'order postal', 'product sales', 'shipping credits', 'gift wrap credits', 'promotional rebates', 'sales tax collected', 'Marketplace Facilitator Tax', 'selling fees', 'fba fees', 'other transaction fees', 'other', 'total']
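Now, how do I save the header columns plus all the rows after them as a new CSV? I have a line_count, but before I start using another variable: I'm fairly sure csv has functions that use the row index, which I could use to make this simpler. What would you suggest as the best approach?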
Solution (thanks to @bruno desthuilliers):
import csv
import glob
import os
import sys

list_of_files = glob.glob('C:/payment_reports/*csv')  # * matches everything; for a specific format use *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
with open(latest_file, "r", newline="") as infile:
    reader = csv.reader(infile)
    for row in reader:
        if row[0] == 'date/time':
            print(row)
            break
    else:
        # for/else: reached only if no header row was found
        sys.exit("{} not found".format('date/time'))
    # infile must still be open here, because reader keeps reading from it
    with open("C:/test.csv", "w", newline="") as outfile:  # newline="" avoids blank rows on Windows
        writer = csv.writer(outfile)
        writer.writerow(row)  # headers
        writer.writerows(reader)  # remaining rows
Once you have found the header row, you can write it, followed by the remaining rows, to the outfile:
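# Assumes csv is imported and that this runs inside a function (hence the return).
# "rb"/"wb" are the Python 2 modes; on Python 3 use "r"/"w" with newline="".
with open(latest_file, "rb") as infile:
    reader = csv.reader(infile)
    for row in reader:
        if row[0] == 'date/time':
            break
    else:
        print("{} not found".format('date/time'))
        return
    with open("path/to/new.csv", "wb") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(row)  # headers
        writer.writerows(reader)  # remaining rows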
csv.reader is an iterator. Each time .next is called on it (next(reader) in Python 3), it reads one row from the CSV.
Here is the documentation: http://docs.python.org/2/library/csv.html.
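To make the one-row-per-call behaviour concrete, here is a small sketch in Python 3, where the call is spelled next(reader). The sample text is made up for illustration and is read from an in-memory io.StringIO rather than a real report file.

```python
import csv
import io

# Made-up sample: one description line, then the header, then one data row.
sample = io.StringIO(
    "report for March\n"
    "date/time,type,total\n"
    "2020-03-01,Order,10.00\n"
)

reader = csv.reader(sample)
first = next(reader)   # consumes the description line
second = next(reader)  # consumes the header row
rest = list(reader)    # only the rows that have not been consumed yet

print(first)
print(second)
print(rest)
```

Each call advances the reader, so nothing is read twice and nothing has to be held in memory beyond the current row.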
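An iterator object can return values from a source that is too large to read all at once. Using a for loop with an iterator effectively calls .next on each pass through the loop, so after a break the reader is left positioned just past the row where the loop stopped. Hope this helps?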
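If you want to try the loop-then-continue idea without touching real files, something like the following might do. It is only a sketch for Python 3: copy_from_header is a hypothetical helper name, the sample rows are invented, and the check for empty rows is an extra assumption (blank lines in the description block would otherwise raise an IndexError on row[0]).

```python
import csv
import io


def copy_from_header(infile, outfile, marker="date/time"):
    """Copy the header row and everything after it; return False if no header is found."""
    reader = csv.reader(infile)
    for row in reader:
        # Blank lines come back as empty lists, so check the row before indexing it.
        if row and row[0] == marker:
            break
    else:
        return False
    writer = csv.writer(outfile)
    writer.writerow(row)      # the header row the loop stopped on
    writer.writerows(reader)  # the reader resumes right after the header
    return True


# Invented sample: two description lines (one blank), header, two data rows.
source = io.StringIO(
    "Payment report\n"
    "\n"
    "date/time,type,total\n"
    "2020-03-01,Order,10.00\n"
    "2020-03-02,Refund,-5.00\n"
)
target = io.StringIO()
found = copy_from_header(source, target)
copied_lines = target.getvalue().splitlines()
missing = copy_from_header(io.StringIO("a,b\n"), io.StringIO())
print(found, copied_lines, missing)
```

With real files you would pass handles opened with newline="" instead of the StringIO objects.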