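This is a follow-up to a question I posted earlier:
How to return 1 if the first letter of a word is a vowel, otherwise 0. Mapper (MapReduce) problem
Now I also want to return 1 only when the last letter of a word is a vowel.
What I have so far: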
import sys
import re

pattern = re.compile("^[a-z]+$")  # matches purely alphabetic words
starting_vowels = re.compile("(^[aeiouAEIOU])")  # matches starting vowels
ending_vowels = re.compile("[aeiouAEIOU]$")  # matches ending vowels
starting_vowel_match = 0
ending_vowel_match = 0

for line in sys.stdin:
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        starting_vowel_match = 1 if starting_vowels.match(word[0]) else 0
        # ternary operator, word[0] is the first letter of the word
        ending_vowel_match = 1 if ending_vowels.match(word[-1]) else 0
        for letter in word:
            if pattern.match(letter):
                print("%s 1" % letter, starting_vowel_match, ending_vowel_match)
                starting_vowel_match = 0
                ending_vowel_match = 0
When I run it on a text file containing the string "its a beautiful life", the output I get is:
i 1 1 0
t 1 0 0
s 1 0 0
a 1 1 1
b 1 0 0
e 1 0 0
a 1 0 0
u 1 0 0
t 1 0 0
i 1 0 0
f 1 0 0
u 1 0 0
l 1 0 0
l 1 0 1
i 1 0 0
f 1 0 0
e 1 0 0
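It seems the l at the start of life gets a 1 in the fourth column because the last letter of the string life is e. What I want is for only the final e to get a 1 in the fourth column, since it is the last letter of the word.
The output I want is: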
i 1 1 0
t 1 0 0
s 1 0 0
a 1 1 1
b 1 0 0
e 1 0 0
a 1 0 0
u 1 0 0
t 1 0 0
i 1 0 0
f 1 0 0
u 1 0 0
l 1 0 0
l 1 0 0
i 1 0 0
f 1 0 0
e 1 0 1
Try this:
import sys

for line in sys.stdin:
    line = line.strip()
    for word in line.lower().split():
        if len(word) == 1:
            # a one-letter word is both the first and the last letter
            print(word[0], 1, *((1, 1) if word[0] in 'aeiou' else (0, 0)))
        else:
            print(word[0], 1, 1 if word[0] in 'aeiou' else 0, 0)
            # plain loop, so two-letter words do not print an empty line
            for letter in word[1:-1]:
                print(letter, 1, 0, 0)
            print(word[-1], 1, 0, 1 if word[-1] in 'aeiou' else 0)
Output (with line = "its a beautiful life"):
i 1 1 0
t 1 0 0
s 1 0 0
a 1 1 1
b 1 0 0
e 1 0 0
a 1 0 0
u 1 0 0
t 1 0 0
i 1 0 0
f 1 0 0
u 1 0 0
l 1 0 0
l 1 0 0
i 1 0 0
f 1 0 0
e 1 0 1
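I think you should pass the whole word to the regex, not just a single letter. Your regexes are already written to find words that start or end with a vowel.
So you want:
starting_vowel_match = 1 if starting_vowels.match(word) else 0
# ternary operator; the ^ anchor checks the first letter of the word
# search, not match: re.match only matches at the start of the string
ending_vowel_match = 1 if ending_vowels.search(word) else 0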
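One detail worth checking when passing the whole word: re.match only tries to match at the beginning of the string, so an end-anchored pattern such as ending_vowels generally needs re.search instead. A small sketch, reusing the two patterns from the question (the sample word is just an illustration):

```python
import re

starting_vowels = re.compile("(^[aeiouAEIOU])")  # same pattern as in the question
ending_vowels = re.compile("[aeiouAEIOU]$")      # same pattern as in the question

word = "life"  # illustrative input

# re.match only tries position 0, so the end-anchored pattern
# fails on any word longer than one character.
match_result = ending_vowels.match(word)

# re.search scans the whole string, so the trailing "e" is found.
search_result = ending_vowels.search(word)

starts_with_vowel = 1 if starting_vowels.match(word) else 0
ends_with_vowel = 1 if ending_vowels.search(word) else 0
print(starts_with_vowel, ends_with_vowel)  # prints: 0 1
```

Note that these are still per-word flags; deciding which letter row gets the 1 is a separate step.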
If the question is about a mapper, I assume you want some kind of functional solution? Here I use the first sentence of Lorem Ipsum as an example:
paragraph = """Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."""
cleaned_paragraph = ''.join(i for i in paragraph if i not in ['.', ',', '?', '!', ':', ';'])  # remove punctuation
words = cleaned_paragraph.split()  # get individual words

def mapper(word: str) -> int:
    return 1 if word[-1] in 'aeiou' else 0

res = list(map(mapper, words))
Output:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1]
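The same idea can be extended to return both flags per word. This is only a sketch: vowel_flags is a name made up for illustration, and it assumes every word is non-empty (which str.split guarantees).

```python
from typing import Tuple

def vowel_flags(word: str) -> Tuple[int, int]:
    """Return (starts_with_vowel, ends_with_vowel) for a non-empty word."""
    vowels = "aeiou"
    lowered = word.lower()
    return (int(lowered[0] in vowels), int(lowered[-1] in vowels))

words = "its a beautiful life".split()
flags = list(map(vowel_flags, words))
print(flags)  # [(1, 0), (1, 1), (0, 0), (0, 1)]
```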
Replace your
        for letter in word:
            if pattern.match(letter):
                print("%s 1" % letter, starting_vowel_match, ending_vowel_match)
                starting_vowel_match = 0
                ending_vowel_match = 0
with this:
        for i, letter in enumerate(word):
            if pattern.match(letter):
                print(
                    "%s 1" % letter,
                    starting_vowel_match * (i == 0),
                    ending_vowel_match * (i == len(word) - 1)
                )
With enumerate you know where you are in the word.
Using regular expressions seems like overkill here. All you need is a couple of simple functions, like this:
VOWELS = set('aeiouAEIOU')

def is_leading_vowel(word):
    return 1 if word and word[0] in VOWELS else 0

def is_trailing_vowel(word):
    return 1 if word and word[-1] in VOWELS else 0
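As a rough sketch of how these helpers might be wired into the mapper: map_lines below is a hypothetical name, io.StringIO stands in for sys.stdin, and unlike the question's code this version does not filter out non-alphabetic characters.

```python
import io

VOWELS = set("aeiouAEIOU")

def is_leading_vowel(word):
    return 1 if word and word[0] in VOWELS else 0

def is_trailing_vowel(word):
    return 1 if word and word[-1] in VOWELS else 0

def map_lines(lines):
    """Yield one 'letter 1 start end' row for every letter of every word."""
    for line in lines:
        for word in line.lower().split():
            leading = is_leading_vowel(word)    # computed once per word
            trailing = is_trailing_vowel(word)
            last = len(word) - 1
            for i, letter in enumerate(word):
                # each flag is emitted only at the position it describes
                start_flag = leading if i == 0 else 0
                end_flag = trailing if i == last else 0
                yield f"{letter} 1 {start_flag} {end_flag}"

# io.StringIO stands in for sys.stdin in this sketch
rows = list(map_lines(io.StringIO("its a beautiful life\n")))
print("\n".join(rows))
```

In a real streaming job the call would presumably be map_lines(sys.stdin), with each row printed as it is produced.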
After checking whether the last letter is a vowel, you need to change the code that prints the letters.
If ending_vowel_match is true, the last letter of the word is a vowel, so we print 1 in the fourth column for that last letter only.
If ending_vowel_match is false, we still loop over the letters of the word, but print 0 in the fourth column for every letter.
import sys
import re

pattern = re.compile("^[a-z]+$")  # matches purely alphabetic words
starting_vowels = re.compile("(^[aeiouAEIOU])")  # matches starting vowels
ending_vowels = re.compile("[aeiouAEIOU]$")  # matches ending vowels

for line in sys.stdin:
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        # ternary operator; word[0] is the first letter, word[-1] the last
        starting_vowel_match = 1 if starting_vowels.match(word[0]) else 0
        ending_vowel_match = 1 if ending_vowels.match(word[-1]) else 0
        last = len(word) - 1
        if ending_vowel_match:
            for i, letter in enumerate(word):
                if pattern.match(letter):
                    # 1 in the fourth column only at the last letter
                    print("%s 1 %d %d" % (letter, starting_vowel_match, int(i == last)))
                    starting_vowel_match = 0
        else:
            for letter in word:
                if pattern.match(letter):
                    print("%s 1 %d 0" % (letter, starting_vowel_match))
                    starting_vowel_match = 0