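I'm new to Python and am trying to find the longest word in alice_in_wonderland.txt. I think I have a good system set up (see below), but my output returns a "word" made of several words joined by dashes. Is there any way to remove the dashes from the file input? The text file is available here.
Sample from the text file:
"That's very important," the King said, turning to the jury. They were just beginning to write this down on their slates, when the White Rabbit interrupted: "Unimportant, your Majesty means, of course," he said in a very respectful tone, but frowning and making faces at him as he spoke. "Unimportant, of course, I meant," the King hastily said, and went on to himself in an undertone, "important--unimportant-- unimportant--important--" as if he were trying which word sounded best.
Code: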
#String input
with open("alice_in_wonderland.txt", "r") as myfile:
    string=myfile.read().replace('\n','')
#initialize list
my_list = []
#Split words into list
for word in string.split(' '):
    my_list.append(word)
#initialize list
uniqueWords = []
#Fill in new list with unique words to shorten final printout
for i in my_list:
    if not i in uniqueWords:
        uniqueWords.append(i)
#Length of longest word
count = 0
#Longest word place holder
longest = []
for word in uniqueWords:
    if len(word)>count:
        longest = word
        count = len(longest)
print longest
>>> import nltk # pip install nltk
>>> nltk.download('gutenberg')
>>> words = nltk.corpus.gutenberg.words('carroll-alice.txt')
>>> max(words, key=len) # find the longest word
'disappointment'
Here's one way to do it, using re and mmap:
import re
import mmap
with open('your alice in wonderland file') as fin:
    mf = mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ)
    words = re.finditer(r'\w+', mf)
    print max((word.group() for word in words), key=len)
# disappointment
This is more efficient than loading the whole file into physical memory.
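The snippet above uses Python 2 syntax. Under Python 3 an mmap object exposes bytes, so the file has to be opened in binary mode and the pattern has to be a bytes pattern. A minimal sketch of that variant follows; the helper name `longest_word_mmap` and the throwaway sample file are assumptions made for illustration, not part of the original answer.

```python
import mmap
import os
import re
import tempfile


def longest_word_mmap(path):
    """Return the longest run of word characters in the file (hypothetical helper)."""
    # Binary mode: mmap yields bytes, so the regex must be a bytes pattern too.
    with open(path, "rb") as fin:
        with mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ) as mf:
            # finditer scans the mapped file lazily; only the matches are copied.
            words = (m.group() for m in re.finditer(rb"\w+", mf))
            longest = max(words, key=len, default=b"")
    # In a bytes pattern \w matches ASCII characters only, so this decode is safe.
    return longest.decode("ascii")


# Throwaway sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("important--unimportant--\nunimportant--important--\n")
try:
    print(longest_word_mmap(tmp.name))  # unimportant
finally:
    os.remove(tmp.name)
```

Because `\w+` never matches a dash, the joined "words" from the question are split apart without any extra replace call. Note that mmap raises ValueError on an empty file, which this sketch does not handle.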
Use str.replace to replace the dashes with spaces (or whatever you want). To do so, just add another call to replace after the first call on line 3 of your code:
string=myfile.read().replace('\n','').replace('-', ' ')
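As a rough sketch of the same idea in Python 3, here it is on a short in-memory sample instead of the file. The function name `longest_word` and the sample string are made up for the example. One difference from the line above: replacing '\n' with an empty string glues together the last word of one line and the first word of the next, so this version leaves newlines alone and lets split() with no arguments break on any run of whitespace.

```python
def longest_word(text):
    """Return the longest dash- or whitespace-separated token (hypothetical helper)."""
    # Turn every dash into a space, then split on any run of whitespace,
    # newlines included.
    words = text.replace("-", " ").split()
    return max(words, key=len, default="")


sample = (
    "went on to himself in an\n"
    "undertone, important--unimportant--\n"
    "unimportant--important--"
)
print(longest_word(sample))  # unimportant
```

Punctuation still sticks to the tokens (for example "undertone,"), so a regex-based split may be a better fit if that matters.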
from datetime import datetime
from tabulate import tabulate

def calculate_overall_score(coursework1, coursework2, coursework3, final_exam):
    return (coursework1 * 0.10) + (coursework2 * 0.20) + (coursework3 * 0.30) + (final_exam * 0.40)

def determine_category(overall_score):
    if overall_score == 100:
        return "Aurum Standard"
    elif 82 <= overall_score <= 92:
        return "Upper First"
    elif 72 <= overall_score < 82:
        return "First"
    elif 62 <= overall_score < 72:
        return "Upper Second (2:1)"
    elif 52 <= overall_score < 62:
        return "Lower Second (2:2)"
    elif 42 <= overall_score < 52:
        return "Third"
    elif 32 <= overall_score < 42:
        return "Condonable Fail"
    elif 5 <= overall_score < 32:
        return "Fail"
    elif overall_score == 0:
        return "Defecit Opus"
    else:
        return "Invalid Score"

def get_student_data():
    students = []
    for _ in range(3):
        student_id = input("Enter Student ID (2 digits or 'end' to stop): ")
        if student_id.lower() == 'end':
            break
        name = input("Enter student name: ")
        dob = input("Enter date of birth (YYYY-MM-DD): ")
        # Validate date of birth
        try:
            dob_date = datetime.strptime(dob, "%Y-%m-%d")
        except ValueError:
            print("Invalid date format. Please use YYYY-MM-DD.")
            continue
        coursework1 = float(input("Enter score for Coursework 1: "))
        coursework2 = float(input("Enter score for Coursework 2: "))
        coursework3 = float(input("Enter score for Coursework 3: "))
        final_exam = float(input("Enter score for Final Exam: "))
        overall_score = calculate_overall_score(coursework1, coursework2, coursework3, final_exam)
        category = determine_category(overall_score)
        # Calculate age
        age = (datetime.now() - dob_date).days // 365
        students.append([student_id, name, dob, age, overall_score, category])
    return students

def main():
    students = get_student_data()
    # Display results in a table
    print(tabulate(students, headers=["ID", "Name", "D.o.B", "Age", "Overall Score", "Category"], tablefmt="grid"))
    # Save to file
    with open("students.txt", "w") as f:
        f.write(tabulate(students, headers=["ID", "Name", "D.o.B", "Age", "Overall Score", "Category"], tablefmt="grid"))

if __name__ == "__main__":
    main()
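One detail in the script above: the age calculation `(datetime.now() - dob_date).days // 365` ignores leap days, so it can report a birthday a few days early. A possible alternative compares month and day directly. This is only a sketch; `age_on` is a hypothetical helper that does not appear in the script.

```python
from datetime import date


def age_on(dob, today):
    """Whole years elapsed between dob and today (hypothetical helper)."""
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (dob.month, dob.day)
    return today.year - dob.year - before_birthday


dob = date(2000, 3, 1)
today = date(2024, 2, 29)          # the day before the 24th birthday
print((today - dob).days // 365)   # 24 (days // 365 is one year ahead)
print(age_on(dob, today))          # 23
```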