How do I remove the extra commas at the start and end of each CSV row?

Question  Votes: 1  Answers: 4

I have a .csv file in which every row looks like this:

,11:00:14,4,5.,93.7,0.01,0.0,7,20,0.001,10,49.3,0.01,
,11:00:15,4,5.,94.7,0.04,0.5,7,20,0.005,10,49.5,0.04,

It should look like this:

11:00:14,4,5.,93.7,0.01,0.0,7,20,0.001,10,49.3,0.01
11:00:15,4,5.,94.7,0.04,0.5,7,20,0.005,10,49.5,0.04

I think this is why pandas is not building the DataFrame correctly. How can I remove these commas?

The code that generated the original csv file is:

import csv

def tsv2csv():
    # file_location, tsv_file and new_csv_file are defined elsewhere in the script

    # read tab-delimited file
    with open(file_location + tsv_file, 'r') as fin:
        cr = csv.reader(fin, delimiter='\t')
        filecontents = [line for line in cr]

    # write comma-delimited file (comma is the default delimiter)
    # give the exact location of the file
    # "newline=''" at the end of the line stops there being blank lines between rows
    with open(new_csv_file, 'w', newline='') as fou:
        cw = csv.writer(fou, quotechar='', quoting=csv.QUOTE_NONE)
        cw.writerows(filecontents)
python pandas csv dataframe comma
4 Answers
2
votes

You can use usecols to specify which columns to import, like this:

import pandas as pd

csv_df = pd.read_csv('temp.csv', header=None, usecols=range(1,13))

This skips the empty first and last columns.
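To see why range(1, 13) is the right selection, here is a minimal sketch that reads the two sample rows from the question out of an in-memory string instead of 'temp.csv'. The variable name sample and the use of io.StringIO are only for illustration.

```python
import io
import pandas as pd

# Two sample rows from the question, each with a leading and a trailing comma.
sample = (
    ",11:00:14,4,5.,93.7,0.01,0.0,7,20,0.001,10,49.3,0.01,\n"
    ",11:00:15,4,5.,94.7,0.04,0.5,7,20,0.005,10,49.5,0.04,\n"
)

# Each row splits into 14 fields: 12 real ones plus an empty one at each end.
# usecols=range(1, 13) keeps positions 1..12 and skips positions 0 and 13.
csv_df = pd.read_csv(io.StringIO(sample), header=None, usecols=range(1, 13))
print(csv_df.shape)
```

If the real file has a different number of fields per row, the range has to be adjusted to match.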


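2
votes

The leading and trailing commas correspond to missing data. When the file is loaded into a DataFrame, those empty fields become columns of NaN, so all you need to do is get rid of them, either with dropna or by slicing them off: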

df = pd.read_csv('file.csv', header=None).dropna(how='all', axis=1)

Or,

df = pd.read_csv('file.csv', header=None).iloc[:, 1:-1]

df

         1   2    3     4     5    6   7   8      9   10    11    12
0  11:00:14   4  5.0  93.7  0.01  0.0   7  20  0.001  10  49.3  0.01
1  11:00:15   4  5.0  94.7  0.04  0.5   7  20  0.005  10  49.5  0.04
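The two variants are not quite interchangeable, so here is a small sketch comparing them on the sample rows from the question. It reads from an in-memory string, and the names sample, by_dropna and by_slice are made up for the example.

```python
import io
import pandas as pd

sample = (
    ",11:00:14,4,5.,93.7,0.01,0.0,7,20,0.001,10,49.3,0.01,\n"
    ",11:00:15,4,5.,94.7,0.04,0.5,7,20,0.005,10,49.5,0.04,\n"
)

# Without any filtering there are 14 columns; columns 0 and 13 hold only NaN.
df = pd.read_csv(io.StringIO(sample), header=None)

# Variant 1: drop every column whose values are all missing.
by_dropna = df.dropna(how="all", axis=1)

# Variant 2: drop the first and last column by position, whatever they contain.
by_slice = df.iloc[:, 1:-1]

print(list(by_dropna.columns) == list(by_slice.columns))
```

On this data both keep the same 12 columns. They differ when a real column happens to be entirely empty: dropna(how='all', axis=1) would remove it as well, while iloc[:, 1:-1] only ever removes the two edge columns.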

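-1
votes

You can use strip to remove characters from the start and end of a string, passing as the argument a string containing the characters you want stripped.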

x = ',11:00:14,4,5.,93.7,0.01,0.0,7,20,0.001,10,49.3,0.01,'
print(x.strip(','))
>11:00:14,4,5.,93.7,0.01,0.0,7,20,0.001,10,49.3,0.01
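To apply the same idea to the whole file rather than a single string, one possible sketch is to rewrite the file line by line. The function name strip_edge_commas and its two path parameters are hypothetical, not something from the question.

```python
def strip_edge_commas(src_path, dst_path):
    """Copy src_path to dst_path, removing commas at the start and end of each line."""
    with open(src_path, "r", newline="") as fin, \
         open(dst_path, "w", newline="") as fou:
        for line in fin:
            # Remove the line ending first; otherwise the trailing comma
            # is not the last character and strip(',') would leave it in place.
            fou.write(line.rstrip("\r\n").strip(",") + "\n")
```

Note that strip(',') removes every consecutive comma at either edge, so a row whose first or last real field is empty would lose that field. For data like that, selecting columns in pandas is the safer route.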

-1
votes

I'm not sure whether this will work in your case, but have you tried importing with:

    df = pd.read_csv('filename', sep=';')