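I have a .dat file, and I know nothing about how it was created, which delimiter it uses, or any other detail about it. All I have are its corresponding .mdf and .csv files. That's it. Is there any way to read this .dat file in Python?
What I have tried so far: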
file = "736_2_PerformanceCurve_(23_0C)_(13_5V).dat"
datContent = [i.strip().split() for i in open(file, encoding='latin1').readlines()]
datContent
This gives the output:
[['|CF,2,1,1;|CK,1,3,1,1;'],
['|NO,1,7,1,0,,0,;'],
['|NL,1,10,1252,0x407;'],
['|CT,1,41,0,6,Bench#,24,Korrosionstest', '15A046-01,0,;'],
['|CT,1,30,0,11,StartOfTest,8,06/30/17,0,;'],
['|CT,1,58,0,10,ResultPath,36,c:\\korrosionstest\\daten\\#170161-OR02,0,;'],
['|CT,1,59,0,11,GraphicPath,36,c:\\korrosionstest\\daten\\#170161-OR02,0,;'],
['|CT,1,31,0,15,GraphicBaseName,5,736_2,0,;'],
['|CT,1,26,0,10,PartNumber,5,736_2,0,;'],
['|CT,1,31,0,9,VA-Nr.', 'GS,11,170161-OR02,0,;'],
['|CT,1,62,0,9,VA-Nr.',
'CC,42,TO_ENV_2017_G2_C1_Platform_CC-122164-03-08,0,;'],
['|CT,1,24,0,6,Tester,8,Behrendt,0,;'],
['|CT,1,32,0,15,Test', 'Department,6,GS/ETR,0,;'],
['|CG,1,5,1,1,1;'],
['|CD,1,16,1E-2,1,1,s,0,0,0;'],
['|NT,1,27,30,', '6,2017,14,25,15.8050001;'],
['|CC,1,3,1,1;'],
['|CP,1,16,1,2,4,16,0,0,1,0;'],
['|Cb,1,33,1,0,1,1,0,11718,0,11718,1,5E-3,0,;'],
['|CR,1,30,1,6.103888176768602E-3,0,1,1,A;'],
['|CN,1,28,0,0,0,16,ai_iB1_Strom_ECU,0,;'],
['|CG,1,5,1,1,1;'],
['|CD,1,16,1E-2,1,1,s,0,0,0;'],
['|NT,1,27,30,', '6,2017,14,25,15.8050001;'],
['|CC,1,3,1,1;'],
['|CP,1,16,2,2,4,16,0,0,1,0;'],
['|Cb,1,37,1,0,2,1,11718,11718,0,11718,1,5E-3,0,;'],
['|CR,1,30,1,3.662332906061161E-3,0,1,1,V;'],
['|CN,1,31,0,0,0,19,ai_iB1_Spannung_UBB,0,;'],
I also have the corresponding CSV file for the same data.
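Looking at the file, it appears to follow a specific format. A data block starts with | and ends with ;. Within a block, the fields are separated by ,. It is basically like CSV, except that the record terminator is ; rather than a newline.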
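As a minimal sketch of that description, the snippet below splits an in-memory string into blocks using only string methods. The helper name `split_blocks` and the `sample` string are illustrative assumptions; the sample is copied from the first lines of the output in the question.

```python
def split_blocks(text):
    """Split text into blocks that start with '|' and end with ';'."""
    blocks = []
    for chunk in text.split(";"):
        # Drop whitespace and newlines that sit between blocks.
        chunk = chunk.strip()
        if not chunk.startswith("|"):
            continue  # skip the empty tail or stray text outside a block
        # Remove the leading '|' and split the fields on commas.
        blocks.append(chunk[1:].split(","))
    return blocks


sample = "|CF,2,1,1;|CK,1,3,1,1;\n|NO,1,7,1,0,,0,;\n"
for block in split_blocks(sample):
    print(block)
```

This only works while no field contains a `;` or `,` of its own, which is an assumption about the data rather than something the file guarantees.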
With the help of a regular expression, you can read this data like so:
import re

# latin1 matches the encoding used in the question; adjust if your file differs.
with open("resources/input.dat", encoding="latin1") as f:
    text = f.read()

# A block starts with "|" and ends with ";"; DOTALL lets "." span newlines.
regex = r"\|(.*?);"
matches = re.finditer(regex, text, re.DOTALL)

data = []
for match in matches:
    # Group 1 is the block content without the "|" and ";" delimiters.
    data.append(match.group(1).split(","))

for d in data:
    print(d)
Input:
|CF,2,1,1;|CK,1,3,1,1;
|NO,1,7,1,0,,0,;
|CT,1,41,0,6,Bench,24,Korrosionstest', '15A046-01,0,
otherline_data;
Output:
['CF', '2', '1', '1']
['CK', '1', '3', '1', '1']
['NO', '1', '7', '1', '0', '', '0', '']
['CT', '1', '41', '0', '6', 'Bench', '24', "Korrosionstest'", " '15A046-01", '0', '\notherline_data']
As you can see, even when a data block does not end on a new line, you still get the data up to the defined end marker ;.
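One caveat: splitting on ; breaks as soon as a block's content itself contains a ;. In the sample lines from the question, the third field of each block matches the number of characters between the comma that follows it and the closing ; (for example, `|NL,1,10,1252,0x407;` carries the 10-character content `1252,0x407`). Assuming that holds for the whole file, which the sample alone cannot confirm, a length-aware reader might look like the sketch below. The function name `parse_length_prefixed` and the `XX` block are made up for illustration.

```python
def parse_length_prefixed(text):
    """Parse '|KEY,version,length,content;' blocks using the length field.

    Assumption: 'length' counts the characters of 'content', as it does
    in the sample lines shown in the question.
    """
    blocks = []
    pos = 0
    while True:
        start = text.find("|", pos)
        if start == -1:
            break
        # Locate the three commas that close key, version and length.
        c1 = text.index(",", start)
        c2 = text.index(",", c1 + 1)
        c3 = text.index(",", c2 + 1)
        key = text[start + 1:c1]
        version = text[c1 + 1:c2]
        length = int(text[c2 + 1:c3])
        # Take exactly 'length' characters as content, whatever they contain.
        content = text[c3 + 1:c3 + 1 + length]
        blocks.append((key, version, content))
        # Resume after the content and its closing ';'.
        pos = c3 + 1 + length + 1
    return blocks


# 'XX' is a hypothetical block whose content contains a ';'.
demo = "|XX,1,3,a;b;|NL,1,10,1252,0x407;"
for block in parse_length_prefixed(demo):
    print(block)
```

If the length turns out to be a byte count, decoding the file as latin1 keeps one character per byte, so the slicing still lines up. Each content string can then be split on , in a second step where that is safe for the block type.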