Substrings of a string using Python
How many substrings can you make out of a string like abcd?
How can I get all of its substrings:
['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']
Try this:
def consecutive_groups(iterable):
    s = tuple(iterable)  # materialized only to get the length
    for size in range(1, len(s)+1):
        for index in range(len(s)+1-size):
            # slice the original so that a string input yields strings
            yield iterable[index:index+size]
>>> print(list(consecutive_groups('abcd')))
['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']
And the number of substrings is simply the sum of the integers from 1 to the length n of the string, which equals n * (n + 1) / 2. For abcd, n = 4, so there are 4 * 5 / 2 = 10 substrings.
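As a quick sanity check of the n * (n + 1) / 2 count, the sketch below enumerates every start position for every size and compares the total against the formula. The helper name count_substrings is made up for illustration; it counts positions, so a substring that occurs twice is counted twice.

```python
def count_substrings(s):
    """Count all non-empty contiguous slices of s (duplicates included)."""
    n = len(s)
    return sum(1 for size in range(1, n + 1) for index in range(n + 1 - size))


for text in ["abcd", "abcdab", ""]:
    n = len(text)
    # Integer division is exact because n * (n + 1) is always even.
    assert count_substrings(text) == n * (n + 1) // 2

print(count_substrings("abcd"))  # 10
```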
By the way, if you want to avoid duplicates, you can simply use a locally-defined set in the generator function, like so:
def consecutive_groups(iterable):
    s = tuple(iterable)
    seen = set()  # slices that have already been yielded
    for size in range(1, len(s)+1):
        for index in range(len(s)+1-size):
            slc = iterable[index:index+size]
            if slc not in seen:
                seen.add(slc)
                yield slc
That code is a little more unwieldy and could probably be restructured to reduce the nesting, but it will do for a proof of concept.
Would this do?
import itertools
def substrings(x):
    # every index pair i < j selects the slice x[i:j]
    for i, j in itertools.combinations(range(len(x)+1), 2):
        yield x[i:j]
Or as a generator expression:
(x[i:j] for i, j in itertools.combinations(range(len(x)+1), 2))
The expanded result for your example looks like this:
['a', 'ab', 'abc', 'abcd', 'b', 'bc', 'bcd', 'c', 'cd', 'd']
To order by length, sort the result with key=len.
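A minimal sketch of that sort, using Python 3's range in place of xrange. The generator is repeated here only so the example runs on its own.

```python
import itertools


def substrings(x):
    # Each pair (i, j) with i < j picks one slice x[i:j].
    for i, j in itertools.combinations(range(len(x) + 1), 2):
        yield x[i:j]


# sorted() is stable, so substrings of equal length keep their relative order.
by_length = sorted(substrings("abcd"), key=len)
print(by_length)
# ['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']
```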
This is what you want:
In [260]: S = 'abcd'
In [261]: list(itertools.chain.from_iterable([list(itertools.combinations(S,i)) for i in range(1,len(S))]))
Out[261]:
[('a',),
('b',),
('c',),
('d',),
('a', 'b'),
('a', 'c'),
('a', 'd'),
('b', 'c'),
('b', 'd'),
('c', 'd'),
('a', 'b', 'c'),
('a', 'b', 'd'),
('a', 'c', 'd'),
('b', 'c', 'd')]
Or if you really want them all as strings, you could do:
In [262]: combos = list(itertools.chain.from_iterable([list(itertools.combinations(S,i)) for i in range(1,len(S))]))
In [263]: [''.join(c) for c in combos]
Out[263]:
['a',
'b',
'c',
'd',
'ab',
'ac',
'ad',
'bc',
'bd',
'cd',
'abc',
'abd',
'acd',
'bcd']
EDIT: To get only substrings of S:
In [270]: list(itertools.chain.from_iterable([[S[i:i+k] for i in range(len(S)-k+1)] for k in range(1,len(S)+1)]))
Out[270]: ['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']
I think this works too, and while it is not the most efficient, it has the attraction of using less complex features.
S = "abcd"
substrings = [S[i:j] for i in range(len(S)) for j in range(i+1,len(S)+1)]
substrings.sort(key=len)
Note however that this approach does not remove identical substrings that might appear. For example, if the original string was "abcdab", then a, b and ab would each appear twice.
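If the repeats are unwanted, one possible way to drop them while keeping first-seen order is dict.fromkeys. This is a sketch, not part of the answer above, and it relies on dictionaries preserving insertion order (guaranteed from Python 3.7 on); the name unique is chosen for illustration.

```python
S = "abcdab"
substrings = [S[i:j] for i in range(len(S)) for j in range(i + 1, len(S) + 1)]

# dict.fromkeys keeps only the first occurrence of each substring,
# in the order the substrings were generated.
unique = sorted(dict.fromkeys(substrings), key=len)

print(len(substrings), len(unique))  # 21 18
```

The three dropped entries are the second occurrences of a, b and ab.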
There are two questions there.
The first, "How many substrings can you make out of a string like abcd?", is a matter of combinations, like this:
import itertools
s='abcd'
com=[list(itertools.combinations(s,x)) for x in range(1,len(s)+1)]
print([''.join(e) for e in sum(com,[])])
prints:
['a', 'b', 'c', 'd', 'ab', 'ac', 'ad', 'bc', 'bd', 'cd', 'abc', 'abd', 'acd', 'bcd', 'abcd']
The second question is how to replicate your example (which is not a 'combination'). You can do that with this code:
>>> [s[i:i+j] for j in range(1,len(s)+1) for i in range(len(s)-j+1)]
['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']
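To make the difference between the two readings concrete, the sketch below builds both lists for abcd and shows which combinations are not contiguous slices. The variable names are illustrative. A string of length n has 2**n - 1 non-empty combinations of positions but only n * (n + 1) / 2 slices.

```python
import itertools

s = "abcd"
n = len(s)

# All non-empty combinations of characters, order preserved.
combos = ["".join(c) for k in range(1, n + 1) for c in itertools.combinations(s, k)]
# All contiguous slices, shortest first.
slices = [s[i:i + j] for j in range(1, n + 1) for i in range(n - j + 1)]

# Combinations that never occur as a contiguous slice of s.
non_contiguous = [c for c in combos if c not in slices]

print(len(combos), len(slices))  # 15 10
print(non_contiguous)            # ['ac', 'ad', 'bd', 'abd', 'acd']
```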