How do I open a file in Python, read the comment lines ("#"), find a word after the comment marker, and select the word that follows it?

Question

I have a function that loops over a file that looks like this:

"#" XDI/1.0 XDAC/1.4 Athena/0.9.25

"#" Column.4:                      pre_edge

Content

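That is, whatever follows "#" is a comment. My function is meant to read each line and, if the line starts with a specific word, select what comes after the ":".

For example, given these two lines, I want to read through them, and if a line starts with "#" and contains the word "Column.4", the word "pre_edge" should be stored.

An example of my current approach follows: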

with open(file, "r") as f:
        for line in f:
            if line.startswith ('#'):
                word = line.split(" Column.4:")[1]
            else:
                print("n")

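I think my trouble is specifically what to do after finding a line that starts with "#": how do I parse or search it, and save its content if it contains the desired word?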

Answer 1

If the comment contains the string Column.4: as described above, you can parse it like this.

with open(filepath) as f:
    for line in f:
        if line.startswith('#'):
            # Comment lines are processed here
            if 'Column.4:' in line:
                # Split once; remainder holds everything after 'Column.4:'
                first, remainder = line.split('Column.4:', 1)
                # The first whitespace-separated word of the remainder
                word = remainder.split()[0]
        else:
            # Lines that are not comments can be processed here
            pass

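Note

Iterating with for line in f: instead of calling f.readlines() (as other answers do) is also good practice, because the lines are then read one at a time rather than all being loaded into memory at once.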
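To illustrate the point about reading line by line, here is a minimal sketch that wraps the same split-based logic in a helper. The function name first_word_after and the sample lines are made up for illustration; they are not part of the original answer. Because the helper accepts any iterable of strings, an open file object can be passed in directly, and reading stops at the first match.

```python
def first_word_after(lines, marker="Column.4:"):
    """Return the first word following `marker` on a '#' comment line.

    `lines` may be any iterable of strings, including an open file object,
    so lines are consumed one at a time and iteration stops at the first match.
    Returns None if no matching comment line is found.
    """
    for line in lines:
        if line.startswith("#") and marker in line:
            # Split once and keep everything after the marker
            remainder = line.split(marker, 1)[1]
            words = remainder.split()
            if words:
                return words[0]
    return None


# Hypothetical sample data standing in for the file contents
sample = [
    "# XDI/1.0 XDAC/1.4 Athena/0.9.25\n",
    "# Column.4:                      pre_edge\n",
    "8185.8 0.1\n",
]
result = first_word_after(sample)
print(result)  # pre_edge
```

With a real file, the call would look like first_word_after(f) inside a with open(...) as f: block.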

Answer 2

You should first read the file into a list and then work on it:

file = 'test.txt' #<- call file whatever you want
with open(file, "r") as f:
    txt = f.readlines()
    for line in txt:
        if line.startswith ('"#"'):
            word = line.split(" Column.4: ")
            try:
                print(word[1])
            except IndexError:
                print(word)
        else:
            print("n")

Output:

>>> ['"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n']
>>> pre_edge

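The try/except is there because the first line also starts with "#", and with your current logic the split on that line does not yield a second element.

Also, as a side note: in the question, the lines in your file start with "#" including the quotation marks, so the argument to startswith() was changed to match.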
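As a possible alternative to the try/except, str.partition reports whether the separator was found, so the missing-marker case can be handled without catching IndexError. This is only a sketch: the helper name value_after is hypothetical, and accepting both the bare # and the quoted "#" prefix is an assumption based on how the lines appear in the question.

```python
def value_after(line, marker="Column.4:"):
    """Return the text after `marker` on a comment line, or None."""
    # Accept both '#' and the quoted '"#"' prefix shown in the question
    if not line.startswith(('#', '"#"')):
        return None
    _, found, after = line.partition(marker)
    # partition returns an empty separator string when the marker is absent
    return after.strip() if found else None


# The two lines from the question, with the quoted prefix
lines = [
    '"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n',
    '"#" Column.4:                      pre_edge\n',
]
values = [value_after(line) for line in lines]
print(values)  # [None, 'pre_edge']
```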

Answer 3

with open('stuff.txt', 'r') as f:
    data = f.readlines()

for line in data:
    words = line.split()
    if words and ('#' in words[0]) and ("Column.4:" in words):
        print(words[-1])

# pre_edge
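If more than one header field is needed, the same idea can be extended to collect every comment line of the form "# Key: value" into a dictionary. The sketch below is an illustration, not part of the original answer: the regular expression, the name parse_headers, and the sample lines are all assumptions about what the header lines look like.

```python
import re

# Assumed header shape: optional quoted '#', a key without spaces, a colon, a value
HEADER = re.compile(r'^"?#"?\s*([^:\s]+):\s*(.*)$')


def parse_headers(lines):
    """Collect 'Key: value' pairs from comment lines into a dict."""
    headers = {}
    for line in lines:
        match = HEADER.match(line)
        if match:
            key, value = match.groups()
            headers[key] = value.strip()
    return headers


# Hypothetical sample data standing in for the file contents
sample = [
    "# XDI/1.0 XDAC/1.4 Athena/0.9.25\n",
    "# Column.4:                      pre_edge\n",
    "8185.8 0.1\n",
]
headers = parse_headers(sample)
print(headers.get("Column.4"))  # pre_edge
```

The first sample line has no "key:" token directly after the "#", so it is skipped rather than raising an error.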

Answer 4

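A general answer has already been accepted, but in this particular case it looks like you are trying to parse an XAS file in XDI format. If so, you can read it with the larch library, which directly produces an XAS spectrum group object with all the necessary fields set and analysis functions available. One caveat about this library: its optional graphical features depend on wxWidgets, so you may want to avoid those.

There are two reader functions, read_xdi and read_ascii, depending on whether your input really conforms to the XDI format.

Here is a minimal code example: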

from larch.io import read_xdi
group = read_xdi('test_Ni_K_0001.xdi')
dir(group)
> ['angle', 'array_addrs', 'array_labels', 'array_units', ...]
group.energy[:10]
> array([8185.80863445, 8185.97282303, 8186.60880985, 8187.50505437,
       8188.43041501, 8189.35890581, 8190.28537129, 8191.23493112,
       8192.16946217, 8193.11790166])
