My CSV has the following structure:
h1, h2, h3, h4
"v1","v2","v3","v4,v5,v6"
"a1","a2","a3","a4,a5,a6"
"s1","s2","s3","s4,s5,s6"
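The header names are separated by a comma followed by a space. Some columns contain commas inside their values, which is why every field is wrapped in double quotes.
I run: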
df= pd.read_csv('path/test.csv', delimiter=',', quotechar='"')
But it returns:
h1 h2 h3 h4
0 "v1","v2","v3","v4,v5,v6" NaN NaN NaN
1 "a1","a2","a3","a4,a5,a6" NaN NaN NaN
2 "s1","s2","s3","s4,s5,s6" NaN NaN NaN
instead of:
h1 h2 h3 h4
0 v1 v2 v3 v4,v5,v6
1 a1 a2 a3 a4,a5,a6
2 s1 s2 s3 s4,s5,s6
What am I doing wrong?
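Let me help you read this CSV correctly. The problem is that pandas has to handle both the comma delimiter and the quoted fields correctly. One thing to try is skipinitialspace=True, which makes the parser ignore the space that follows each comma in the header row: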
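import csv
import pandas as pd

def read_csv_method(file_path):
    try:
        df = pd.read_csv(file_path,
                         skipinitialspace=True,      # ignore the space after each delimiter
                         quotechar='"',              # fields are wrapped in double quotes
                         quoting=csv.QUOTE_MINIMAL)  # pandas' default quoting mode
        return df
    except Exception as e:
        # Report the failure and return None so the caller can check for it
        print(f"Error reading {file_path}: {e}")
        return None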
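
To check the approach without touching the real file, here is a minimal sketch that parses the sample rows from the question out of an in-memory string. The names SAMPLE and read_sample are made up for illustration, and the sketch assumes the file on disk looks exactly like the snippet posted above.

```python
import io

import pandas as pd

# Sample rows copied from the question; the real file may differ.
SAMPLE = '''h1, h2, h3, h4
"v1","v2","v3","v4,v5,v6"
"a1","a2","a3","a4,a5,a6"
"s1","s2","s3","s4,s5,s6"
'''


def read_sample(text=SAMPLE):
    """Parse CSV text, ignoring the space after each comma in the header."""
    return pd.read_csv(io.StringIO(text), skipinitialspace=True, quotechar='"')


df = read_sample()
print(list(df.columns))  # header names without the leading spaces
print(df.loc[0, "h4"])   # the quoted commas stay inside a single field
```

If the real file still ends up in a single column, it may be worth printing its first few raw lines with repr() to see whether each row is itself wrapped in an extra pair of quotes, since that would call for a different fix.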