Converting a text file to a string in Python

Question

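I'm new to Python and am trying to find the longest word in alice_in_wonderland.txt. I think I have a good system set up (see below), but my output returns a "word" made of several words joined by dashes. Is there a way to remove the dashes from the file input? If you need the text file, it is available here.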

Sample of the text file:

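'That's very important,' the King said, turning to the jury. They were just beginning to write this down on their slates, when the White Rabbit interrupted: 'Unimportant, your Majesty means, of course,' he said in a very respectful tone, but frowning and making faces at him as he spoke. 'Unimportant, of course, I meant,' the King hastily said, and went on to himself in an undertone, 'important--unimportant--unimportant--important--' as if he were trying which word sounded best.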

Code:

    #String input
    with open("alice_in_wonderland.txt", "r") as myfile:
        string=myfile.read().replace('\n','')
    #initialize list
    my_list = []
    #Split words into list
    for word in string.split(' '):
        my_list.append(word)
    #initialize list
    uniqueWords = []
    #Fill in new list with unique words to shorten final printout
    for i in my_list:
        if not i in uniqueWords:
            uniqueWords.append(i)
    #Length of longest word
    count = 0
    #Longest word placeholder
    longest = ''
    for word in uniqueWords:
        if len(word)>count:
            longest = word
            count = len(longest)
    print(longest)

Answer 1

>>> import nltk # pip install nltk
>>> nltk.download('gutenberg')
>>> words = nltk.corpus.gutenberg.words('carroll-alice.txt')
>>> max(words, key=len) # find the longest word
'disappointment'

Answer 2

Here is one approach using re and mmap:

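import re
import mmap

# Open in binary mode: in Python 3 an mmap is bytes-like, so the pattern must be bytes too
with open('your alice in wonderland file', 'rb') as fin:
    mf = mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ)
    words = re.finditer(rb'\w+', mf)
    print(max((word.group() for word in words), key=len).decode())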

# disappointment

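This is more efficient than loading the whole file into physical memory, because the memory map lets the operating system page the file in on demand.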
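To see why the \w+ pattern sidesteps the dash problem, here is a small sketch that applies the same pattern to an in-memory string instead of a memory-mapped file. The sample string is a short excerpt standing in for the real file, and the variable names are chosen for illustration only.

```python
import re

# Short excerpt used in place of the real file (an assumption for this sketch)
sample = "'important--unimportant--unimportant--important--' as if he were trying"

# \w+ matches runs of letters, digits and underscores,
# so dashes, quotes and spaces all act as separators
words = re.findall(r'\w+', sample)

# max with key=len returns the first word of maximal length
longest = max(words, key=len)
print(longest)
```

Because the dashes never match \w, each part of the dash-joined run comes out as its own word, and no separate clean-up step is needed.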

Answer 3

Use str.replace to replace the dashes with spaces (or whatever you like). To do this, just add another call to replace after the first call on line 3:

string=myfile.read().replace('\n','').replace('-', ' ')
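As a rough end-to-end sketch of this fix, the snippet below runs the same replace on a short excerpt rather than the file. The function name longest_word and the inline sample are assumptions for illustration, not part of the question's code. It also replaces newlines with spaces and strips surrounding punctuation, which goes slightly beyond the one-line change above.

```python
import string

def longest_word(text):
    """Return the longest word after turning dashes and newlines into spaces."""
    # Replace newlines and dashes with spaces so joined words are split apart
    cleaned = text.replace('\n', ' ').replace('-', ' ')
    # Strip leading/trailing punctuation from each token and drop empty tokens
    words = [w.strip(string.punctuation) for w in cleaned.split()]
    words = [w for w in words if w]
    # max with key=len returns the first word of maximal length
    return max(words, key=len) if words else ''

# Short excerpt used in place of the real file (an assumption for this sketch)
sample = "went on to himself in an undertone,\n'important--unimportant--unimportant--important--' as if"
print(longest_word(sample))
```

Using split() with no argument collapses the double spaces left behind by the replaced dashes, so no empty strings end up in the word list.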

Answer 4

    from datetime import datetime
    from tabulate import tabulate

    def calculate_overall_score(coursework1, coursework2, coursework3, final_exam):
        # Weighted sum: 10% + 20% + 30% coursework, 40% final exam
        return (coursework1 * 0.10) + (coursework2 * 0.20) + (coursework3 * 0.30) + (final_exam * 0.40)

    def determine_category(overall_score):
        if overall_score == 100:
            return "Aurum Standard"
        elif 82 <= overall_score <= 92:
            return "Upper First"
        elif 72 <= overall_score < 82:
            return "First"
        elif 62 <= overall_score < 72:
            return "Upper Second (2:1)"
        elif 52 <= overall_score < 62:
            return "Lower Second (2:2)"
        elif 42 <= overall_score < 52:
            return "Third"
        elif 32 <= overall_score < 42:
            return "Condonable Fail"
        elif 5 <= overall_score < 32:
            return "Fail"
        elif overall_score == 0:
            return "Defecit Opus"
        else:
            return "Invalid Score"

    def get_student_data():
        students = []
        for _ in range(3):
            student_id = input("Enter Student ID (2 digits or 'end' to stop): ")
            if student_id.lower() == 'end':
                break
            name = input("Enter student name: ")
            dob = input("Enter date of birth (YYYY-MM-DD): ")

            # Validate date of birth
            try:
                dob_date = datetime.strptime(dob, "%Y-%m-%d")
            except ValueError:
                print("Invalid date format. Please use YYYY-MM-DD.")
                continue

            coursework1 = float(input("Enter score for Coursework 1: "))
            coursework2 = float(input("Enter score for Coursework 2: "))
            coursework3 = float(input("Enter score for Coursework 3: "))
            final_exam = float(input("Enter score for Final Exam: "))

            overall_score = calculate_overall_score(coursework1, coursework2, coursework3, final_exam)
            category = determine_category(overall_score)

            # Calculate age
            age = (datetime.now() - dob_date).days // 365

            students.append([student_id, name, dob, age, overall_score, category])

        return students

    def main():
        students = get_student_data()

        # Display results in a table
        print(tabulate(students, headers=["ID", "Name", "D.o.B", "Age", "Overall Score", "Category"], tablefmt="grid"))

        # Save to file
        with open("students.txt", "w") as f:
            f.write(tabulate(students, headers=["ID", "Name", "D.o.B", "Age", "Overall Score", "Category"], tablefmt="grid"))

    if __name__ == "__main__":
        main()
