Finding the maximum number of consecutive occurrences of a substring in a string

Question

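I have a long string, and I am trying not only to check whether a given substring occurs in it, but also to find the longest run of consecutive occurrences.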

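For example, in the snippet below I can use count to see how many times the substring b appears in a. The result is 5. What I want to determine, however, is the longest consecutive run, which should be 3 (where "abc" appears back to back in the middle). I am having trouble working out the logic. Any suggestions would be appreciated.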

a = "abcxyzabcabcabcxyzabcxyz"

b = "abc"

total = a.count(b)

print(total)

Answer 1

This should be fairly simple with a while loop:

def func(a, b): 
    n = 1 
    while b*n in a: 
        n += 1 
    return n - 1
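
As a quick check, the loop above can be run on the strings from the question. The sketch below repeats it under a hypothetical name, max_consecutive, and adds a guard for an empty b; that guard is an assumption on my part rather than part of the answer, since '' in a is always true and the loop would otherwise never end.

```python
def max_consecutive(a, b):
    """Return the largest n such that b repeated n times occurs in a."""
    if not b:
        # Assumption: treat an empty pattern as having no runs,
        # because '' * n is always found and the loop would not stop.
        return 0
    n = 1
    # Keep growing the repeated pattern until it no longer fits in a.
    while b * n in a:
        n += 1
    return n - 1


a = "abcxyzabcabcabcxyzabcxyz"
b = "abc"
print(max_consecutive(a, b))  # 3
```

Each pass rescans a for a longer repeated pattern, so this favors brevity over speed on very long inputs.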

Answer 2

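One possible, if naive, solution is to use Python's str.index method to find the index of the nearest occurrence of the substring. From there you keep stepping forward one substring length at a time until the substring no longer matches, then call index again to skip ahead to the next occurrence.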

Example:

a = "abcxyzabcabcabcxyzabcxyz"
b = "abc"

curr_index = a.index(b)
longest_count = 0
current_count = 0

while curr_index < len(a):
    if a[curr_index : curr_index + len(b)] == b:
        curr_index += len(b)
        current_count += 1
    else:
        if longest_count < current_count:
            longest_count = current_count
        try:
            curr_index = a.index(b, curr_index)
        except ValueError:
            # Substring no longer found in string slice
            break
        current_count = 0

if longest_count < current_count:
    longest_count = current_count

print(longest_count)

This returns only the length of the longest run, not where it starts. Adding that is trivial, though.
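
One way the start position might be tracked is sketched below. The function name longest_run_with_start and the (count, start) return shape are illustrative choices, not something from the answer. Like the loop above, it scans greedily from left to right and assumes occurrences of b do not overlap one another.

```python
def longest_run_with_start(a, b):
    """Return (count, start) for the longest back-to-back run of b in a.

    start is -1 when b does not occur (an assumed convention).
    """
    if not b:
        return 0, -1  # assumption: an empty pattern has no runs
    best_count, best_start = 0, -1
    i = a.find(b)
    while i != -1:
        run_start = i
        count = 0
        # Consume consecutive copies of b beginning at run_start.
        while a.startswith(b, i):
            count += 1
            i += len(b)
        if count > best_count:
            best_count, best_start = count, run_start
        # Skip ahead to the next occurrence, if there is one.
        i = a.find(b, i)
    return best_count, best_start


print(longest_run_with_start("abcxyzabcabcabcxyzabcxyz", "abc"))  # (3, 6)
```

Using find instead of index avoids the try/except, since find returns -1 when nothing is left to match.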

Answer 3

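Keep calling a.index(b, start) with the appropriate start index. If the returned index equals the expected start, you are still in the same run. Otherwise, a new run begins: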

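def longest_run(string, pattern):
    longest = 0
    current = 0
    start = 0
    while True:
        try:
            ind = string.index(pattern, start)
        except ValueError:
            # No more occurrences: the run in progress may be the longest.
            return max(longest, current)
        if ind == start:
            # Match begins exactly where the previous one ended: same run.
            current += 1
        else:
            # Gap before this match: close the previous run, start a new one.
            longest = max(longest, current)
            current = 1
        # Resume searching right after this match, not after the old start.
        start = ind + len(pattern)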

Answer 4

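An efficient approach is to use re.findall with a pattern that matches b one or more times (using re.escape so that b is not interpreted as a regular expression), map the returned strings to len, and pass them to max to get the length of the longest match. Dividing that length by the length of b gives the number of times b is repeated: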

import re
max(map(len, re.findall('(?:%s)+' % re.escape(b), a))) // len(b)
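
The one-liner above raises ValueError when b never occurs in a, because max receives an empty sequence. A minimal sketch of a wrapper that returns 0 in that case follows; the name longest_run_regex and the empty-pattern guard are assumptions added for illustration.

```python
import re


def longest_run_regex(a, b):
    """Count the longest run of back-to-back copies of b in a using re."""
    if not b:
        return 0  # assumption: an empty pattern has no runs
    # (?:...)+ matches one or more consecutive copies of the escaped b.
    runs = re.findall("(?:%s)+" % re.escape(b), a)
    # default=0 covers the case where b never occurs in a.
    return max(map(len, runs), default=0) // len(b)


print(longest_run_regex("abcxyzabcabcabcxyzabcxyz", "abc"))  # 3
print(longest_run_regex("a.ca.c", "a.c"))  # 2
```

The second call shows why re.escape matters: without it, the dot in "a.c" would match any character.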
