Websites like http://www.easysurf.cc/cnvert18.htm and http://www.calculatorsoup.com/calculators/conversions/numberstowords.php try to convert a numeric string into English words, but their output does not sound natural.
For example, on http://www.easysurf.cc/cnvert18.htm:
[in]: 100456
[out]: one hundred thousand four hundred fifty-six
This site is a little better, http://www.calculator.org/calculate-online/mathematics/text-number.aspx:
[in]: 100456
[out]: one hundred thousand, four hundred and fifty-six
[in]: 10123124001
[out]: ten billion, one hundred and twenty-three million, one hundred and twenty-four thousand, one
But it breaks in some cases:
[in]: 10000000001
[out]: ten billion, , , one
I have written my own version, but it involves a lot of rules and is capped at one billion (from http://pastebin.com/WwFCjYtt):
import codecs

def num2word (num):
    ones = {1:"one",2:"two",3:"three",4:"four",
            5:"five",6:"six",7:"seven",8:"eight",
            9:"nine",0:"zero",10:"ten"}
    teens = {11:"eleven",12:"twelve",13:"thirteen",
             14:"fourteen",15:"fifteen"}
    tens = {2:"twenty",3:"thirty",4:"forty",
            5:"fifty",6:"sixty",7:"seventy",
            8:"eighty",9:"ninety"}
    lens = {3:"hundred",4:"thousand",6:"hundred",7:"million",
            8:"million", 9:"million",10:"billion"#,13:"trillion",11:"googol",
            }

    if num > 999999999:
        return "Number more than 1 billion"

    # Ones
    if num < 11:
        return ones[num]
    # Teens
    if num < 20:
        word = ones[num%10] + "teen" if num > 15 else teens[num]
        return word
    # Tens
    if num > 19 and num < 100:
        word = tens[int(str(num)[0])]
        if str(num)[1] == "0":
            return word
        else:
            word = word + " " + ones[num%10]
            return word

    # First digit for thousands,hundred-thousands.
    if len(str(num)) in lens and len(str(num)) != 3:
        word = ones[int(str(num)[0])] + " " + lens[len(str(num))]
    else:
        word = ""

    # Hundred to Million
    if num < 1000000:
        # First and Second digit for ten thousands.
        if len(str(num)) == 5:
            word = num2word(int(str(num)[0:2])) + " thousand"
        # How many hundred-thousand(s).
        if len(str(num)) == 6:
            word = word + " " + num2word(int(str(num)[1:3])) + \
                " " + lens[len(str(num))-2]
        # How many hundred(s)?
        thousand_pt = len(str(num)) - 3
        word = word + " " + ones[int(str(num)[thousand_pt])] + \
            " " + lens[len(str(num))-thousand_pt]
        # Last 2 digits.
        last2 = num2word(int(str(num)[-2:]))
        if last2 != "zero":
            word = word + " and " + last2
        word = word.replace(" zero hundred","")
        return word.strip()

    left, right = '',''
    # Less than 1 million.
    if num < 100000000:
        left = num2word(int(str(num)[:-6])) + " " + lens[len(str(num))]
        right = num2word(int(str(num)[-6:]))
    # From 1 million to 1 billion.
    if num > 100000000 and num < 1000000000:
        left = num2word(int(str(num)[:3])) + " " + lens[len(str(num))]
        right = num2word(int(str(num)[-6:]))
    if int(str(num)[-6:]) < 100:
        word = left + " and " + right
    else:
        word = left + " " + right
    word = word.replace(" zero hundred","").replace(" zero thousand"," thousand")
    return word

print(num2word(int(input("Give me a number:\n"))))
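How can I make the script I have written accept numbers greater than one billion?
Is there another way to get the same output?
Can my code be written less verbosely?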
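A more general approach to this problem is to use repeated division (i.e. divmod) and to hard-code only the special/edge cases that are really necessary.
For example, divmod(1034393, 1000000) -> (1, 34393), so you have effectively found the number of millions and are left with a remainder for further calculation.
A possibly more illustrative example: divmod(1034393, 1000) -> (1034, 393), which lets you take off groups of 3 decimal digits at a time from the right.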
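The repeated division described above can be sketched as a small helper. The name split_groups and its base parameter are hypothetical, chosen only for illustration; the sketch assumes a non-negative integer input.

```python
def split_groups(n, base=1000):
    """Split a non-negative integer into base-1000 groups, least significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    groups = []
    while True:
        # Each divmod call peels one 3-digit group off the right-hand side.
        n, rest = divmod(n, base)
        groups.append(rest)
        if n == 0:
            return groups


print(split_groups(1034393))      # [393, 34, 1]
print(split_groups(10000000001))  # [1, 0, 0, 10]
```

The zero groups in the second result are the ones that have to be skipped when joining the words, which is presumably what produces the stray commas in "ten billion, , , one".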
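In English we tend to group digits in threes, and the same rules apply to each group. This should be parameterized rather than hard-coded. For example, "303" could be three hundred and three million, three hundred and three thousand, or three hundred and three. The logic should be identical apart from the suffix, which depends on where the group sits in the number. Edit: it looks like this is already the case because of the recursion.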
Here is a partial example of the kind of approach I mean, using a generator and operating on integers rather than doing lots of int(str(i)[..]) everywhere.
say_base = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
            'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
            'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen']

say_tens = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy',
            'eighty', 'ninety']

def hundreds_i(num):
    hundreds, rest = divmod(num, 100)
    if hundreds:
        yield say_base[hundreds]
        yield ' hundred'
    if 0 < rest < len(say_base):
        yield ' and '
        yield say_base[rest]
    elif rest != 0:
        tens, ones = divmod(rest, 10)
        yield ' and '
        yield say_tens[tens]
        if ones > 0:
            yield '-'
            yield say_base[ones]

assert "".join(hundreds_i(245)) == "two hundred and forty-five"
assert "".join(hundreds_i(999)) == 'nine hundred and ninety-nine'
assert "".join(hundreds_i(200)) == 'two hundred'
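One way the partial example might be extended to larger numbers is sketched below: each 3-digit group is worded by the same function and only the scale suffix changes. The names SMALL, TENS, SCALES, say_group and say_number are hypothetical, and the choices of a comma separator, short-scale names up to trillion, and no "and" before a trailing group without hundreds are assumptions rather than part of the example above.

```python
SMALL = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
         'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
         'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen']
TENS = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy',
        'eighty', 'ninety']
# Assumed short-scale suffixes; extend the list to support larger numbers.
SCALES = ['', 'thousand', 'million', 'billion', 'trillion']


def say_group(num):
    """Words for one 3-digit group, 1..999."""
    hundreds, rest = divmod(num, 100)
    parts = []
    if hundreds:
        parts.append(SMALL[hundreds] + ' hundred')
    if rest:
        if rest < 20:
            parts.append(SMALL[rest])
        else:
            tens, ones = divmod(rest, 10)
            parts.append(TENS[tens] + ('-' + SMALL[ones] if ones else ''))
    # "and" only appears when both a hundreds part and a remainder exist.
    return ' and '.join(parts)


def say_number(n):
    """Words for 0 <= n < 1000 ** len(SCALES)."""
    if n < 0 or n >= 1000 ** len(SCALES):
        raise ValueError("number out of range for this sketch")
    if n == 0:
        return SMALL[0]
    chunks = []
    for scale in SCALES:
        n, group = divmod(n, 1000)
        if group:  # skip empty groups so no stray separators are emitted
            chunks.append((say_group(group) + ' ' + scale).strip())
        if n == 0:
            break
    return ', '.join(reversed(chunks))


print(say_number(100456))       # one hundred thousand, four hundred and fifty-six
print(say_number(10000000001))  # ten billion, one
```

Because the suffix is just data in SCALES, raising the upper limit means appending a name rather than adding a new branch.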
Since the currently accepted answer has some problems with "zero", here is another answer:
def number_to_text(n, *, hyphen="-", joiner="and", comma=","):
    unitNames = ["one", "two", "three", "four", "five", "six", "seven", "eight",
                 "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
                 "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
    tensNames = ["twenty", "thirty", "forty", "fifty",
                 "sixty", "seventy", "eighty", "ninety"]
    tripletNames = ["", "thousand"] + [s + "illion"
        for s in ["m", "b", "tr", "quadr", "quint",
                  "sext", "sept", "oct", "non"]
    ] + [s + "decillion"
        for s in ["", "un", "duo", "tre", "quattuor",
                  "quin", "sex", "septen", "octo", "novem"]
    ] + ["vigintillion"]  # add as needed....

    def triplets(n):
        # Yield the words for each non-zero 3-digit group, least significant first.
        for tripletName in tripletNames:
            num = n % 1000
            n //= 1000
            if num == 0:
                continue
            hundreds = num // 100
            num %= 100
            tens = num // 10 if num > 19 else 0
            num -= tens * 10
            yield ((unitNames[hundreds-1] + " hundred " if hundreds else "")
                   + (joiner + " " if joiner and (n or hundreds) and (tens or num) else "")
                   + (tensNames[tens-2] + (hyphen if num else " ") if tens else "")
                   + (unitNames[num-1] + " " if num else "")
                   + tripletName).strip()
            if n == 0:
                return
        raise ValueError("number too large for this converter")

    return (comma + " ").join(reversed(list(triplets(n)))) if n else "zero"
Example call:
print(number_to_text(1234567890123456789, hyphen="-", joiner="and", comma=","))
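Output:
one quintillion, two hundred and thirty-four quadrillion, five hundred and sixty-seven trillion, eight hundred and ninety billion, one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine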
If you don't like the commas or the "and", pass empty strings for the corresponding options:
print(number_to_text(101000001, hyphen="-", joiner="", comma=""))
one hundred one million one
Note: this uses the short scale. If you need the long scale, update the tripletNames list accordingly.
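As a rough illustration of what such an update might look like, the sketch below builds a list of long-scale names. It assumes the long-scale variant that alternates "-illion" and "-illiard" (million, milliard, billion, billiard, ...); the helper name long_scale_names is hypothetical and not part of the answer's code. A variant that says "thousand million" instead of "milliard" would need a different approach.

```python
def long_scale_names(prefixes=("m", "b", "tr", "quadr", "quint", "sext")):
    """Build names for successive powers of 1000 in the assumed long scale."""
    names = ["", "thousand"]
    for prefix in prefixes:
        names.append(prefix + "illion")   # 10**6, 10**12, 10**18, ...
        names.append(prefix + "illiard")  # 10**9, 10**15, 10**21, ...
    return names


names = long_scale_names()
print(names[3])  # milliard (10**9)
print(names[4])  # billion  (10**12)
```

A list built this way could replace tripletNames inside number_to_text, since the function only relies on the position of each name in the list.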