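How do you count the number of spaces or newlines in a text so that consecutive whitespace characters count as only one? For example, this is very close to what I want: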
string = "This is an example text.\n But would be good if it worked."
counter = 0
for i in string:
    if i == ' ' or i == '\n':
        counter += 1
print(counter)
However, instead of returning 15, the result should be just 11.
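By default, str.split() treats a run of consecutive whitespace as a single separator. So just split the string, take the length of the resulting list, and subtract one.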
len(string.split())-1
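A minimal sketch of the split-based count, wrapped in a hypothetical helper called count_gaps (the name is an assumption, not part of the answer). Since str.split() with no arguments also discards leading and trailing whitespace, this counts only the gaps between words:

```python
def count_gaps(text):
    """Count the whitespace runs that separate words in text."""
    words = text.split()  # no argument: split on any run of whitespace
    # Guard against empty or all-whitespace input, where len(words) is 0
    return max(len(words) - 1, 0)


print(count_gaps("one  two\n three"))  # three words, so 2 gaps
print(count_gaps("  padded  "))        # leading/trailing runs are ignored: 0
```

If leading or trailing whitespace should count as well, one of the other approaches in this thread is probably a better fit.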
Assuming you are allowed to use Python regular expressions:
import re
print(len(re.findall(r"[ \n]+", string)))
Quick and easy!
Update: you can also use [\s] instead of [ \n] to match any whitespace character.
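To illustrate the difference between the two character classes, here is a small sketch using a made-up sample string that contains a tab (the sample is an assumption chosen for the demonstration):

```python
import re

sample = "a \t b\n\nc"  # hypothetical input: space, tab, space, then two newlines

# [ \n]+ does not match the tab, so the gap between "a" and "b" splits in two
narrow = len(re.findall(r"[ \n]+", sample))
# \s+ matches spaces, tabs, and newlines alike, so each gap is a single run
broad = len(re.findall(r"\s+", sample))

print(narrow, broad)  # 3 2
```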
You can do it like this:
string = "This is an example text.\n But would be good if it worked."
counter = 0
# A boolean flag indicating whether the previous character was a space
previous = False
for i in string:
    if i == ' ' or i == '\n':
        # The current character is a space
        previous = True  # Setup for the next iteration
    else:
        # The current character is not a space, check if the previous one was
        if previous:
            counter += 1
        previous = False
print(counter)
re to the rescue.
>>> import re
>>> string = "This is an example text.\n But would be good if it worked."
>>> spaces = sum(1 for match in re.finditer(r'\s+', string))
>>> spaces
11
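This consumes minimal memory. An alternative solution, which builds a temporary list, is:
>>> len(re.findall(r'\s+', string))
11
If you only want to consider space characters and newlines (as opposed to tabs, for example), use the regular expression '(\n| )+' instead of '\s+'.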
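Just store the last character you saw. Set it to i on every pass through the loop. Then, in your inner if, don't increment the counter if the last character was also a whitespace character.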
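The idea above could be sketched roughly as follows. The names count_runs, BLANKS, and last are illustrative assumptions, not taken from the question:

```python
BLANKS = {" ", "\n"}  # the two characters the question treats as whitespace


def count_runs(text):
    """Count runs of consecutive characters drawn from BLANKS."""
    counter = 0
    last = None  # no previous character before the loop starts
    for ch in text:
        # Increment only on the first blank of a run
        if ch in BLANKS and last not in BLANKS:
            counter += 1
        last = ch  # remember this character for the next iteration
    return counter


print(count_runs("a  b\n c "))  # 3
```

A set is used for BLANKS rather than a string such as " \n" so that the membership test on the initial None value is simply False.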
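You can iterate over numbers and use them as indices.
counter = 0
for i in range(1, len(string)):
    # Count a character only when it starts a new run of whitespace
    if string[i] in ' \n' and string[i-1] not in ' \n':
        counter += 1
# The loop starts at index 1, so check the first character separately
if string[0] in ' \n':
    counter += 1
print(counter)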
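Pay attention to the first character: the loop starts from the second character to avoid an IndexError.
You can use enumerate, checking that the next char is not also whitespace, so consecutive whitespace counts only as 1:
string = "This is an example text.\n But would be good if it worked."
print(sum(ch.isspace() and not string[i:i+1].isspace() for i, ch in enumerate(string, 1)))
You can also use iter with a generator function, keeping track of the last character and comparing:
def con(s):
    it = iter(s)
    prev = next(it, "")  # "" keeps an empty input from raising StopIteration
    for ele in it:
        # True when prev is the last character of a whitespace run
        yield prev.isspace() and not ele.isspace()
        prev = ele
    # A trailing whitespace run has no following character, so count it here
    yield prev.isspace()
print(sum(con(string)))
An itertools version:
from itertools import tee, zip_longest
string = "This is an example text.\n But would be good if it worked. "
a, b = tee(string)
next(b, None)  # advance the second iterator so each character pairs with its successor
print(sum(cur.isspace() and not nxt.isspace() for cur, nxt in zip_longest(a, b, fillvalue="")))
Try:
def word_count(my_string):
    # Starts at 1, so this returns the number of words rather than the
    # number of space runs; only ' ' is treated as a separator
    count = 1
    for i in range(1, len(my_string)):
        if my_string[i] == " " and my_string[i - 1] != " ":
            count += 1
    return count
You can use the groupby() function to find groups of consecutive spaces.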
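The code for the groupby() answer is not shown here, so the following is only a sketch of how it might look, assuming itertools.groupby with str.isspace as the key. The helper name count_space_groups is hypothetical:

```python
from itertools import groupby


def count_space_groups(text):
    """Count groups of consecutive whitespace characters in text."""
    # groupby yields one (key, group) pair per run of characters sharing a key;
    # with str.isspace as the key, every whitespace run becomes one True group
    return sum(1 for is_space, _ in groupby(text, key=str.isspace) if is_space)


print(count_space_groups("This  is \n an example"))  # 3
```

Because str.isspace is the key, tabs and other whitespace are grouped too. To restrict this to spaces and newlines, the key could be swapped for something like lambda ch: ch in " \n".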