Converting a text file to a string in Python

Question (votes: 0, answers: 4)

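I'm new to Python and am trying to find the longest word in alice_in_wonderland.txt. I think I have a good system set up (see below), but my output is a "word" made of several words joined by dashes. Is there a way to remove the dashes from the file input? The text file can be found here.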

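Sample from the text file:

"That's very important," the King said, turning to the jury. They were just beginning to write this down on their slates, when the White Rabbit interrupted: "Unimportant, your Majesty means, of course," he said in a very respectful tone, but frowning and making faces at him as he spoke. "Unimportant, of course, I meant," the King hastily said, and went on to himself in an undertone, "important--unimportant--unimportant--important--" as if he were trying which word sounded best.

Code: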


    # Read the whole file into one string
    with open("alice_in_wonderland.txt", "r") as myfile:
        string = myfile.read().replace('\n', '')
    # Split the string into a list of words
    my_list = []
    for word in string.split(' '):
        my_list.append(word)
    # Keep only unique words to shorten the final printout
    uniqueWords = []
    for i in my_list:
        if i not in uniqueWords:
            uniqueWords.append(i)
    # Length of the longest word found so far
    count = 0
    # Placeholder for the longest word
    longest = ''
    for word in uniqueWords:
        if len(word) > count:
            longest = word
            count = len(longest)
    print longest
python string python-2.7 input
4 Answers
3 votes
>>> import nltk # pip install nltk
>>> nltk.download('gutenberg')
>>> words = nltk.corpus.gutenberg.words('carroll-alice.txt')
>>> max(words, key=len) # find the longest word
'disappointment'

2 votes

Here is one way to do it using re and mmap:

import re
import mmap

with open('your alice in wonderland file') as fin:
    mf = mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ)
    words = re.finditer(r'\w+', mf)  # raw string so \w is not treated as a string escape
    print max((word.group() for word in words), key=len)

# disappointment

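This is more efficient than loading the whole file into physical memory, because mmap lets the operating system page the file in on demand instead of reading it all into a Python string first.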
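The snippet above targets Python 2.7. For readers on Python 3, here is a minimal sketch of the same idea, assuming the file is ASCII or UTF-8 text. The function name longest_word_mmap is made up for illustration. In Python 3 the mapped file exposes bytes, so the file is opened in binary mode and the pattern is a bytes pattern.

```python
import mmap
import re


def longest_word_mmap(path):
    """Return the longest run of word characters in the file at `path`.

    Hypothetical helper for illustration; assumes a non-empty text file
    (mapping an empty file with length 0 raises ValueError).
    """
    with open(path, "rb") as fin:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ) as mf:
            # finditer scans the mapping lazily, one match at a time.
            longest = max(
                (m.group() for m in re.finditer(rb"\w+", mf)),
                key=len,
                default=b"",
            )
    # Matches are bytes; decode for display.
    return longest.decode("utf-8", errors="replace")
```

Note that \w also matches underscores and digits, so text that marks emphasis with underscores may need a narrower pattern such as rb"[A-Za-z]+".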


0 votes

Use str.replace to replace the dashes with spaces (or with whatever you like). To do that, just chain another call to replace after the first one on line 3:

string=myfile.read().replace('\n','').replace('-', ' ')
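To show the effect end to end, here is a rough Python 3 sketch built on the same str.replace idea. The function name longest_word and the sample string are made up for illustration. It departs from the line above in one respect: newlines become spaces rather than empty strings, so the last word of one line is not glued to the first word of the next.

```python
import string


def longest_word(text):
    """Return the longest word in `text`, treating dashes as separators.

    Hypothetical helper for illustration only.
    """
    # Turn line breaks and dashes into spaces so "important--unimportant" splits.
    cleaned = text.replace("\n", " ").replace("-", " ")
    # split() with no argument collapses runs of whitespace and drops empties;
    # strip() removes leading and trailing punctuation such as quotes and commas.
    words = (w.strip(string.punctuation) for w in cleaned.split())
    return max(words, key=len, default="")


sample = (
    "went on to himself in an undertone,\n"
    "'important--unimportant--\n"
    "unimportant--important--' as if"
)
print(longest_word(sample))  # prints: unimportant
```

One trade-off: replacing every dash also splits genuinely hyphenated words, which may or may not be what you want when looking for the longest word.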

0 votes

from datetime import datetime
from tabulate import tabulate  # pip install tabulate

def calculate_overall_score(coursework1, coursework2, coursework3, final_exam):
    # Weighted sum of the three coursework scores and the final exam
    return (coursework1 * 0.10) + (coursework2 * 0.20) + (coursework3 * 0.30) + (final_exam * 0.40)

def determine_category(overall_score):
    if overall_score == 100:
        return "Aurum Standard"
    elif 82 <= overall_score <= 92:
        return "Upper First"
    elif 72 <= overall_score < 82:
        return "First"
    elif 62 <= overall_score < 72:
        return "Upper Second (2:1)"
    elif 52 <= overall_score < 62:
        return "Lower Second (2:2)"
    elif 42 <= overall_score < 52:
        return "Third"
    elif 32 <= overall_score < 42:
        return "Condonable Fail"
    elif 5 <= overall_score < 32:
        return "Fail"
    elif overall_score == 0:
        return "Defecit Opus"
    else:
        return "Invalid Score"

def get_student_data():
    students = []
    for _ in range(3):
        student_id = input("Enter student ID (2 digits or 'end' to stop): ")
        if student_id.lower() == 'end':
            break
        name = input("Enter student name: ")
        dob = input("Enter date of birth (YYYY-MM-DD): ")

        # Validate date of birth
        try:
            dob_date = datetime.strptime(dob, "%Y-%m-%d")
        except ValueError:
            print("Invalid date format. Please use YYYY-MM-DD.")
            continue

        coursework1 = float(input("Enter score for Coursework 1: "))
        coursework2 = float(input("Enter score for Coursework 2: "))
        coursework3 = float(input("Enter score for Coursework 3: "))
        final_exam = float(input("Enter score for Final Exam: "))

        overall_score = calculate_overall_score(coursework1, coursework2, coursework3, final_exam)
        category = determine_category(overall_score)

        # Calculate age in whole years (approximate: ignores leap days)
        age = (datetime.now() - dob_date).days // 365

        students.append([student_id, name, dob, age, overall_score, category])

    return students

def main():
    students = get_student_data()

    # Display results in a table
    print(tabulate(students, headers=["ID", "Name", "D.o.B", "Age", "Overall Score", "Category"], tablefmt="grid"))

    # Save to file
    with open("students.txt", "w") as f:
        f.write(tabulate(students, headers=["ID", "Name", "D.o.B", "Age", "Overall Score", "Category"], tablefmt="grid"))

if __name__ == "__main__":
    main()
