Why does it write tags_line to the file twice?


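Hi friends,

I'm building a very simple decoder, but I'm quite new to Python, and I can't work out why it writes the value of "tags_line" (a list) to the file "filen_e" twice. The list "tags_line" is meant to capture words the decoder doesn't know and store them in a separate file.

Here is the whole encoder function: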

# the encoder function ;; filen_e is the file that stores words that aren't in the dictionary
def encoder(lst, filen=dictionary, filen_e=filen_e, verbose=False):  # wordlist / dictionary ######## >>>> NEEDS FIXING !!!!!!
    encoded = []
    counter = 1
    counter2 = 0
    tags = []
    tags_nrs = []
    tags_line = []

    score = False
    check = False
    with open(filen, "r", encoding="utf-8") as f:
        comprehension = f.read().splitlines()
        if verbose == True:
            print(f"Type comprehension: {type(comprehension)}{comprehension}\n")
        for i in range(len(lst)):
            for j in range(len(comprehension)):
                if lst[i] == comprehension[j]:
                    # match
                    encoded.append(counter)
                    counter = 1
                    score = True
                    break
                if counter >= len(comprehension):
                    counter = 1
                    print(f"- appending: 0")
                    encoded.append(0) # in case there is no match
                    print(f"- appending: {i},{lst[i]}")
                    tags_line.append([i, lst[i]])  # for storing in file "filen_e"
                    break
                counter += 1
            counter2 += 1
        f.close()
        if counter2 > 1:
            s = "s."
        else:
            s = "."
        if verbose == True:
            # output for the function
            print(f"+(Encoder)")
            print(f"Processed: ({counter2}) token{s}")
            print(f"String: {lst}")
            print(f"Encoded string: {encoded}\n")
        logmsg = f"encoder({counter2}) :: 'Processed: ({counter2}) token{s}'"
        log(logmsg)
        if check==True:
            logmsg = f"encoder({counter2}) :: 'Can not process: ({tags}) token{s}'"
            log(logmsg)
    with open(filen_e, "a", encoding="utf-8") as f:
        # tags_line = f"tags: {tags_nrs}:{tags},"
        print(f"- unknown tags: {tags_line}")
        f.write(str(tags_line))
        tags_line = []
        f.close()
    return encoded

What it logs shows up in the program's output as:

+(beta_encoder):
- appending: 0
- appending: 2,homia
- appending: 0
- appending: 4,ducken
- appending: 0
- appending: 6,pls
- unknown tags: [[2, 'homia'], [4, 'ducken'], [6, 'pls']]
    +(encode).reverse encoding

[6195, 5879, 0, 6085, 0, 3700, 0]

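The last line is the encoded string, in which the number 0 appears three times. Each 0 means the encoder hit a word it doesn't know, as the "- unknown tags:" line shows ([[2, 'homia'], etc.).

The file "filen_e", where these unknown tags are supposed to be written, then contains the following: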

[[2, 'homia'], [4, 'ducken'], [6, 'pls']][[2, 'homia'], [4, 'ducken'], [6, 'pls']]

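ChatGPT told me I should clear the value of tags_line at the start of the function, but that is already the case, since I initialize it to an empty list at the start of the function.

This is really baffling.

Note: the file is opened in "a" (append) mode, but even when I delete the file first, it still gets written twice. I even reinitialize the list directly after it is written, so it can't be a second call to the encoder function.

Also, the "encoder" function is called only once, in my other "bootstrap" function.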

### bootstrap function ### next encode some text ###
encoded_seq = encode(lst=inputs.split(' '), fwd=False, verbose=True)
print("\n", encoded_seq, "\n")

### FUNCTIONS ###
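
One way to test the "called only once" assumption is to record every call site of the function. The following is only a minimal sketch: trace_calls is a hypothetical helper that is not part of the original code, and the bodies of encoder, encode, and store are stand-ins, since the real ones are not shown in full.

```python
import functools
import traceback


def trace_calls(func):
    """Hypothetical helper: remember the name of every function that calls func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # The last stack entry is wrapper itself; the one before it is the call site.
        caller = traceback.extract_stack()[-2]
        wrapper.callers.append(caller.name)
        return func(*args, **kwargs)

    wrapper.callers = []
    return wrapper


@trace_calls
def encoder(lst):
    # Stand-in for the real encoder; only the call tracking matters here.
    return [0 for _ in lst]


def encode(lst):
    # Stand-in for the bootstrap-side wrapper around encoder.
    return encoder(lst)


def store(txt):
    # Stand-in for any other function that happens to reuse encoder.
    return encoder(txt.split(" "))


encode(["homia", "ducken"])
store("homia ducken")
print(len(encoder.callers), encoder.callers)  # 2 ['encode', 'store']
```

If the list of callers has more than one entry, the duplicate write comes from a second call rather than from anything inside the function body.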
Tags: python, file, encoding

1 answer

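Never mind, I feel really stupid: my other function calls the same encoder function whenever "encoders" is set to True. So the encoder ran a second time and appended tags_line to the file again.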

def store(txt, encoders=True, filen=filename, endl=storeUsingNewline):
    ### as such... duh moment
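
The effect can be reproduced in isolation. This is a simplified sketch, not the original code: the encoder here is a cut-down stand-in that takes the dictionary as a list named known, the body of store is assumed (only its signature is shown above), and the file lives in a temporary directory.

```python
import os
import tempfile


def encoder(lst, known, filen_e):
    """Simplified stand-in for the encoder: 0 marks an unknown word."""
    encoded, tags_line = [], []
    for i, word in enumerate(lst):
        if word in known:
            encoded.append(known.index(word) + 1)
        else:
            encoded.append(0)
            tags_line.append([i, word])
    # Append mode: every call adds one more copy to the file.
    with open(filen_e, "a", encoding="utf-8") as f:
        f.write(str(tags_line))
    return encoded


def store(txt, known, filen_e, encoders=True):
    # Assumed body: with encoders=True, store() runs the same encoder again.
    if encoders:
        return encoder(txt.split(" "), known, filen_e)
    return txt


known = ["please", "help"]
with tempfile.TemporaryDirectory() as tmp:
    filen_e = os.path.join(tmp, "unknown.txt")
    encoder("please homia help".split(" "), known, filen_e)  # the visible call
    store("please homia help", known, filen_e)               # the hidden second call
    with open(filen_e, encoding="utf-8") as f:
        content = f.read()

print(content)  # [[1, 'homia']][[1, 'homia']]
```

Resetting tags_line inside the function makes no difference here, because it is a fresh local list on every call. The file gets two copies simply because the function runs twice.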