How to compare the contents of two text files in Python

Question (votes: 0, answers: 2)

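Hello, I need to write a script that does what the title says. Here is an example of what I want:

Contents of file1.txt: New York Los Angeles Miami

Contents of file2.txt: New York Orlando Miami Dc

I just want to compare the two texts and print the elements that were added or are missing.

In case that is unclear, here is the code I have so far: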

from difflib import Differ

myfile1 = input("Enter First File's name for compare : ")
myfile2 = input("Enter Second File's name for compare : ")

# Check the extension with endswith(): split(".")[1] raises IndexError for
# names without a dot and picks the wrong part for names with several dots
if myfile1.endswith(".txt") and myfile2.endswith(".txt"):
    with open(myfile1) as file_1, open(myfile2) as file_2:
        differ = Differ()

        for line in differ.compare(file_1.readlines(), file_2.readlines()):
            print(line)

else:
    print("File format error!")

Tags: python, compare

2 Answers

0 votes

If you want to compare individual characters, you can iterate over them:

with open("file1.txt", 'r') as file:
    content1 = file.read()
with open("file2.txt", 'r') as file:  # same thing for file2
    content2 = file.read()

Like this:

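min_len = min(len(content1), len(content2))
for i in range(min_len):  # only compare up to the shorter length
    if content1[i] != content2[i]:
        # These two characters differ; handle the difference here
        print(f"Position {i}: {content1[i]!r} != {content2[i]!r}")
# Whatever follows min_len in the longer text has no counterpart
extra = content1[min_len:] or content2[min_len:]
if extra:
    print(f"Extra trailing text: {extra!r}")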

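If you want to compare words instead of single characters, call split on the input first:

content1 = file.read().split()  # split on any whitespace, including newlines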
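
A minimal sketch of the word-by-word variant, assuming the words are separated by whitespace. The helper name compare_words is hypothetical, and itertools.zip_longest is swapped in for the index loop so that extra trailing words in either text are reported as well:

```python
from itertools import zip_longest


def compare_words(text1, text2):
    """Return (position, word1, word2) for every position where the words differ.

    When one text has fewer words, the missing word is reported as None.
    """
    words1 = text1.split()
    words2 = text2.split()
    differences = []
    for i, (w1, w2) in enumerate(zip_longest(words1, words2)):
        if w1 != w2:
            differences.append((i, w1, w2))
    return differences


if __name__ == "__main__":
    for pos, w1, w2 in compare_words("New York Los Angeles Miami",
                                     "New York Orlando Miami Dc"):
        print(f"word {pos}: {w1!r} != {w2!r}")
```

Keep in mind that a purely positional comparison reports every word after an insertion or deletion as different, so it is mainly useful when both texts are expected to have the same layout.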

0 votes

First, read all the lines of both files:

with open('file1.txt') as f1:
    a = f1.readlines()
with open('file2.txt') as f2:
    b = f2.readlines()

In Python 3.10 or later, both files can be opened in a single parenthesized with statement:

with (
    open('file1.txt') as f1,
    open('file2.txt') as f2,
):
    a = f1.readlines()
    b = f2.readlines()

Now print the differences between a and b:

import difflib
a_sample = a[0] # 'New York Los Angeles Miami'
b_sample = b[0] # 'New York Orlando Miami Dc'
# Put each word on its own line so that ndiff compares word by word
diff = difflib.ndiff(
    a_sample.replace(' ', '\n').splitlines(keepends=True),
    b_sample.replace(' ', '\n').splitlines(keepends=True),
)
print(''.join(diff), end="")
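Output:

  New
  York
+ Orlando
- Los
- Angeles
- Miami+ Miami
?      +
+ Dc

The "- Miami+ Miami" row is two diff entries printed on one row: in a_sample, Miami is the last word and has no trailing newline, so it differs from the "Miami\n" entry produced from b_sample.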

To do the same for every line of the files:

for file1_line, file2_line in zip(a, b):
    # Diff the current pair of lines, not the whole lists a and b
    diff = difflib.ndiff(
        file1_line.replace(' ', '\n').splitlines(keepends=True),
        file2_line.replace(' ', '\n').splitlines(keepends=True),
    )
    print(''.join(diff), end="")
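
Note that zip() stops at the shorter input, so the loop above skips any leftover lines when the files differ in length. A possible sketch that covers them too, assuming a missing line should be treated as empty (diff_lines_by_word is a hypothetical helper name, and split() is used instead of replace/splitlines so trailing newlines do not affect the comparison):

```python
import difflib
from itertools import zip_longest


def diff_lines_by_word(lines1, lines2):
    """Yield ndiff entries for each pair of lines, compared word by word."""
    for line1, line2 in zip_longest(lines1, lines2, fillvalue=""):
        # split() drops the trailing newline, so the last word compares cleanly
        yield from difflib.ndiff(line1.split(), line2.split())


if __name__ == "__main__":
    first = ["New York Los Angeles Miami\n"]
    second = ["New York Orlando Miami Dc\n", "Boston\n"]
    for entry in diff_lines_by_word(first, second):
        # '? ' hint entries already end with a newline, so strip before printing
        print(entry.rstrip())
```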

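What the difflib codes mean:

Code    Meaning
'- '    line unique to sequence 1
'+ '    line unique to sequence 2
'  '    line common to both sequences
'? '    line not present in either input sequence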
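
To see where the '? ' lines come from: ndiff emits them as hints when two lines are similar but not identical, marking the columns that changed. A small sketch with made-up input (the misspelling Miamy is only for illustration):

```python
import difflib

old = ["Miami\n"]
new = ["Miamy\n"]

result = list(difflib.ndiff(old, new))
# '- ' and '+ ' carry the two versions of the line; the '? ' lines point at
# the characters that changed and never correspond to a line in either input.
print("".join(result), end="")
```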

Note: you can iterate over the diff output and print only the + and - words.
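
That filtering could look roughly like this. A minimal sketch, assuming whitespace-separated words; added_and_removed is a hypothetical helper name:

```python
import difflib


def added_and_removed(text1, text2):
    """Return (added, removed) word lists based on the ndiff prefixes."""
    added, removed = [], []
    for entry in difflib.ndiff(text1.split(), text2.split()):
        if entry.startswith("+ "):
            added.append(entry[2:])
        elif entry.startswith("- "):
            removed.append(entry[2:])
        # '  ' (common) and '? ' (hint) entries are ignored
    return added, removed


added, removed = added_and_removed("New York Los Angeles Miami",
                                   "New York Orlando Miami Dc")
print("added:", added)
print("removed:", removed)
```

For the sample texts from the question this reports Orlando and Dc as added, and Los and Angeles as removed.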

Python documentation: https://docs.python.org/3/library/difflib.html
