How do I print only the words that consist of letters from a list?


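I've recently been trying to write a Python 3 program that reads a text file containing 23005 words. The user then enters a 9-character string, and the program uses those characters to form words and compares them against the words in the text file.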

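I want to print the words that are 4-9 letters long and that also contain the letter in the middle of my list. For example, if the user enters the string "anitskem", the fifth letter, "s", must be present in the word.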

This is how far I've gotten on my own:

# Open selected file & read
filen = open("svenskaOrdUTF-8.txt", "r")

# Read all rows and store them in a list
wordList = filen.readlines()

# Close File
filen.close()

# letterList index
i = 0
# List of letters that user will input
letterList = []
# List of words that are our correct answers
solvedList = []

# User inputs 9 letters that will be stored in our letterList
string = input(str("Ange Nio Bokstäver: "))
userInput = False

# Checks if user input is correct
while userInput == False:
    # if the string is exactly 9 letters long
    # insert each letter into our letterList
    # and set userInput to True
    if len(string) == 9:
        userInput = True
        for char in string:
            letterList.insert(i, char)
            i += 1

    # If string not equal to 9 ask user for a new input
    elif len(string) != 9:
        print("Du har inte angivit nio bokstäver")
        string = input(str("Ange Nio Bokstäver: "))

# For each word in wordList
# and for each char within that word
# check if said word contains a letter from our letterList
# if it does and meets the requirements to be a correct answer
# add said word to our solvedList

for word in wordList:
    for char in word:
        if char in letterList:
            if len(word) >= 4 and len(word) <= 9 and letterList[4] in word:
                print("Char:", word)
                solvedList.append(word)

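The problem I'm having is that instead of printing the words that contain only letters from my letterList, it prints every word that contains at least one letter from my letterList. This also means that some words are printed several times, for example when they contain more than one letter from letterList.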

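I've been trying to solve these problems for a while, but I can't seem to figure it out. I also tried using permutations to create every possible combination of the letters in the list and then compare them with my wordList, but that solution seemed too slow given the number of combinations that have to be created.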

Also, since I'm still fairly new to Python, I'd really appreciate any general tips you can share.

python python-3.x for-loop comparison iteration
3 answers
1
vote

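You get the same word several times mainly because you iterate over every character of a given word, and each time a character is in letterList you append and print the word.

Instead, iterate word by word rather than character by character, and use the with context manager so the file is closed automatically: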

with open('american-english') as f:
    for w in f:
        w = w.strip()
        cond = all(i in letterList for i in w) and letterList[4] in w
        if 4 <= len(w) <= 9 and cond:
            print(w)

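Here cond is used to keep the if statement short, all(..) checks whether every character of the word is in letterList, and w.strip() removes surrounding whitespace such as the trailing newline.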
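
As a self-contained illustration of that check, here is a minimal sketch. The function name filter_words, the letters "rateslino" and the sample words are made up for the example; only the all(...) test, strip() and the middle-letter condition come from the snippet above.

```python
def filter_words(lines, letters):
    """Return words of 4-9 letters that use only `letters` and contain the middle one."""
    letter_list = list(letters)
    middle = letter_list[4]  # fifth of the nine letters
    result = []
    for line in lines:
        w = line.strip()  # drop the trailing newline left by file iteration
        if 4 <= len(w) <= 9 and middle in w and all(c in letter_list for c in w):
            result.append(w)
    return result


# Hypothetical sample data standing in for the lines of the word file.
sample = ["rats\n", "stale\n", "tea\n", "rate\n", "zest\n", "noise\n"]
print(filter_words(sample, "rateslino"))  # ['rats', 'stale', 'noise']
```

Because the function only needs an iterable of lines, an open file object could be passed in place of the sample list.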

Also, to fill letterList once the input is 9 letters long, don't use insert. Instead, just pass the string to list; the list is created in the same way, but noticeably faster:

This:

if len(string) == 9:
    userInput = True
    for char in string:
        letterList.insert(i, char)
        i += 1

can be written as:

if len(string) == 9:
    userInput = True
    letterList = list(string)

With these changes, the initial open and readlines calls are no longer needed, and neither is the initialization of letterList.
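
Put together, the input handling might look like the sketch below. The function name read_nine_letters and its read parameter are assumptions made for illustration (the parameter only exists so the prompt can be replaced in a test); the prompts are the ones from the question.

```python
def read_nine_letters(read=input):
    """Keep asking until exactly nine characters are entered; return them as a list."""
    string = read("Ange Nio Bokstäver: ")
    while len(string) != 9:
        print("Du har inte angivit nio bokstäver")
        string = read("Ange Nio Bokstäver: ")
    # list(string) splits the string into its characters in one step
    return list(string)
```

Calling letterList = read_nine_letters() would then replace both the while loop and the manual insert calls.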


0
votes

You can try this logic:

for word in wordList:
    # readlines() keeps the trailing newline, so strip it before measuring the word
    word = word.strip()
    # skip invalid words early; doing this check outside the inner loop improves performance
    if len(word) < 4 or len(word) > 9 or letterList[4] not in word:
        continue
    # count the characters of the word that appear in letterList
    match_count = 0
    for char in word:
        if char in letterList:
            match_count += 1
    # the word qualifies only if every one of its characters matched
    if match_count == len(word):
        print("Char:", word)
        solvedList.append(word)
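
The counting loop amounts to a subset test, which Python's set type can express directly. The sketch below compares the two forms; the helper names and the sample letters are hypothetical and not part of the original code.

```python
def uses_only(word, letters):
    """True if every character of `word` occurs somewhere in `letters`."""
    return set(word) <= set(letters)


def uses_only_by_count(word, letters):
    """The same test written as the counting loop from the answer."""
    match_count = 0
    for char in word:
        if char in letters:
            match_count += 1
    return match_count == len(word)


# Hypothetical sample input; both checks should agree on every word.
letters = "rateslino"
for word in ["rats", "zest", "noise"]:
    print(word, uses_only(word, letters), uses_only_by_count(word, letters))
```

Converting the letters to a set once, outside the loop over words, also makes each membership check cheaper than searching a list.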

0
votes

You can use a list comprehension for this task. I'm only posting a proof of concept here and leave it to you to turn it into a complete solution.

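# Read the whole file and split it into words; the file is closed automatically
with open("test.text", "r") as filen:
    word_list = filen.read().split()
print("Enter your string")
search_letter = input()[4]  # fifth character of the user's string

# Keep words of 4-9 letters that contain the middle letter
solved_list = [word for word in word_list if 4 <= len(word) <= 9 and search_letter in word]
print(solved_list)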
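
The proof of concept only tests the middle letter. One way to extend it, assuming each of the nine letters may be used at most as often as it was entered (the question does not say whether letters can be reused), is to compare letter counts with collections.Counter. The helper name can_build and the sample data are hypothetical.

```python
from collections import Counter


def can_build(word, letters):
    """True if `word` can be spelled using each letter of `letters` at most once."""
    # Counter subtraction keeps only positive counts, so an empty result
    # means the word never needs more copies of a letter than are available.
    return not (Counter(word) - Counter(letters))


# Hypothetical sample input standing in for the user's string and the word file.
letters = "rateslino"
words = ["rats", "stress", "noise", "lasso"]
solved = [w for w in words
          if 4 <= len(w) <= 9 and letters[4] in w and can_build(w, letters)]
print(solved)  # ['rats', 'noise']
```

If reusing letters is allowed, the plain membership test from the first answer is enough and the Counter check can be dropped.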