Reading a text file that mixes hh:mm:ss and float values

Question

I have a txt file like this:

index   timestamp   polarisation current (A)    signal (V)  head temperature (°C)   head relat.humidity (%RH)   MUGS temperature (°C)   laser voltage (V)   laser current (A)   driver temperature (°C)
0   16:11:24    0.4 0.0006019   26.51   43.5    32.0    11.37   0.3922  26.5
1   16:11:29    0.402   0.0006286   26.51   43.5    32.5    11.41   0.3972  31.5
2   16:11:34    0.404   0.0005828   26.51   43.5    32.5    11.42   0.4048  32.5
3   16:11:38    0.406   0.0006139   26.51   43.5    32.5    11.39   0.3984  32.5

The full file is here ( https://www.dropbox.com/scl/fi/2izelcpjpqr8yckowtgat/35deg_400mA_640mA_120pts_-0.015mA_mod1_LAS5000Hz_LI10000Hz_meth2_0.15us_-14h02m33s_pulse20.txt?rlkey= 3xwxthp6v48deyeob0gj6uuo7&st=xta4m25q&dl=0 )

I read the file like this:

with open(universal_path,'rt'): 
        values = np.genfromtxt( universal_path, delimiter="", skip_header = 1, encoding='unicode_escape')

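The problem is that the second column comes back full of NaN. I tried setting dtype = None, but then it only reads the first column of the txt file.

I also tried

dtype = [int, str, float, float, float, float, float, float, float, float]

but that also reads only the first column.

How can I read the second column as strings?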

Answer 1

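Here is a basic conversion using duckdb (I renamed the columns to what seem to me "sensible" names). The result is a dictionary of numpy arrays, one per column, so you can work on it with the usual numpy functions.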

import duckdb as ddb

conn = ddb.connect()  # in-memory db, disappears when closed

conn.execute("""create table camile as SELECT *
    FROM read_csv('test.csv',
    delim = '\t',
    header = true,
    columns = {
    'index': 'integer',
    'timestamp': 'time',
    'polarisation-current': 'double',
    'signal': 'double',
    'head-temp': 'double',
    'head-relat-humidity': 'double',
    'MUGS-temp': 'double',
    'laser-voltage': 'double',
    'laser-current': 'double',
    'driver-temp': 'double'
    } ); """
)

myData = conn.sql("SELECT * from camile").fetchnumpy()

conn.close()

# print first 5 items ... 
for key, values in myData.items():
    print(f"{key}: {values[:5]}")

#
#  read https://duckdb.org/docs/guides/python/export_numpy.html
#
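
If you would rather stay with numpy only, here is a minimal sketch of the same idea using np.genfromtxt. It assumes the file is tab-separated (as the duckdb example above also assumes), and the short column names and the to_seconds helper are made up for illustration. With dtype=None and an explicit encoding, genfromtxt infers a type per column and returns a 1-D structured array whose fields are the columns. That is probably why it looked as if only the first column had been read: the array has shape (N,), and each column has to be accessed by its field name.

```python
import io
import numpy as np

# A few rows from the question, trimmed to four columns.
# Tab-separated input is an assumption; replace the StringIO with the file path.
sample = (
    "index\ttimestamp\tpolarisation current (A)\tsignal (V)\n"
    "0\t16:11:24\t0.4\t0.0006019\n"
    "1\t16:11:29\t0.402\t0.0006286\n"
    "2\t16:11:34\t0.404\t0.0005828\n"
)

data = np.genfromtxt(
    io.StringIO(sample),
    delimiter="\t",
    skip_header=1,                 # skip the original header line
    names=["index", "timestamp", "current", "signal"],  # illustrative names
    dtype=None,                    # infer one type per column
    encoding="utf-8",              # needed so text columns come back as str
)

# Each column is a field of the structured array.
timestamps = data["timestamp"]     # array of strings such as '16:11:24'
signal = data["signal"]            # array of floats


def to_seconds(stamp):
    """Convert an 'hh:mm:ss' string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return 3600 * hours + 60 * minutes + seconds


# Seconds elapsed since the first sample, handy as an x-axis for plotting.
seconds = np.array([to_seconds(t) for t in timestamps])
elapsed = seconds - seconds[0]
```

For the real file you would pass the path instead of the StringIO object and list all ten column names (or drop names= and use the default field names f0, f1, ...).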
