String manipulation in Python (All upper and lower case derivatives of a word)

I have to read in a word and create an array that stores every variation of upper and lower case characters for that word. For example, with the word "abc", I need to find a way to retrieve every upper and lower case version of "abc" (abc, Abc, ABc, ABC, AbC, abC, aBc, and aBC). The string may also include digits, which should be left alone.

I know I would have to use recursion here in order to get every variation, but I'm just not quite sure how, or whether there are any Python libraries that provide this kind of operation.

Any help or tips are greatly appreciated!

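from itertools import product

def randString(istr):
    # One tuple of choices per position: (c, c.upper()) for non-digits, (c,) for digits
    l = [(c, c.upper()) if not c.isdigit() else (c,) for c in istr.lower()]
    # product() picks one choice per position; join each pick into a string
    return ["".join(item) for item in product(*l)]

print(randString("aBC1"))
print(randString("A1b2c3"))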

Output

['abc1', 'abC1', 'aBc1', 'aBC1', 'Abc1', 'AbC1', 'ABc1', 'ABC1']
['a1b2c3', 'a1b2C3', 'a1B2c3', 'a1B2C3', 'A1b2c3', 'A1b2C3', 'A1B2c3', 'A1B2C3']

You can solve this using a Cartesian product. Given the string 'abc', you'll want to split it into a list of possibilities for each position, e.g.:

['Aa', 'Bb', 'Cc']

I'll leave that to you, as it should be pretty easy. Once you've got that, you can use itertools.product to make all the combinations. You'll get an iterable of tuples like

('A', 'b', 'C')

You can then use ''.join to join each of those together, giving you the desired strings.
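For reference, the steps described in this answer could be sketched roughly as follows. The helper names case_choices and case_variations are made up for illustration, and the sketch uses tuples rather than two-character strings for the per-position choices; it also assumes that a character whose lower and upper forms are identical (such as a digit) should contribute only one choice.

```python
from itertools import product

def case_choices(word):
    """Return one tuple of candidate characters per position (hypothetical helper)."""
    choices = []
    for ch in word:
        lower, upper = ch.lower(), ch.upper()
        # Characters without distinct cases (digits, punctuation) get a single choice.
        choices.append((lower,) if lower == upper else (lower, upper))
    return choices

def case_variations(word):
    """Join every combination of per-position choices into a string."""
    return ["".join(combo) for combo in product(*case_choices(word))]

print(case_choices("ab1"))     # [('a', 'A'), ('b', 'B'), ('1',)]
print(case_variations("ab1"))  # ['ab1', 'aB1', 'Ab1', 'AB1']
```

A word with k letters yields 2**k strings, so the result grows quickly for long inputs.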

You can use product like this. The trick is to use sets to handle any characters that don't have distinct upper and lower case versions (e.g. digits).

>>> from itertools import product
>>> [''.join(x) for x in product(*[{c.upper(), c.lower()} for c in "abc"])]
['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']
>>> [''.join(x) for x in product(*[{c.upper(), c.lower()} for c in "abc1"])]
['ABC1', 'ABc1', 'AbC1', 'Abc1', 'aBC1', 'aBc1', 'abC1', 'abc1']

So:

from itertools import product

def randString(s):
    # Each set holds the distinct case forms of one character; digits collapse to one element
    return [''.join(x) for x in product(*[{c.upper(), c.lower()} for c in s])]

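You can also shift the .lower() so it is applied once to the whole string. Keep in mind that sets are unordered, so the order of the results is not guaranteed to match the output shown above.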

from itertools import product

def randString(s):
    # Lowercase the input once, then pair each character with its upper-case form
    return [''.join(x) for x in product(*[{c.upper(), c} for c in s.lower()])]
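If a reproducible order matters, one option (a small sketch, not part of the answer above) is to sort each set before handing it to product. The function name sorted_case_variations is hypothetical.

```python
from itertools import product

def sorted_case_variations(s):
    # sorted() turns each set into a list with a fixed order,
    # so the output of product() is the same on every run.
    pools = [sorted({c.lower(), c.upper()}) for c in s]
    return ["".join(combo) for combo in product(*pools)]

print(sorted_case_variations("ab1"))  # ['AB1', 'Ab1', 'aB1', 'ab1']
```

Upper-case letters sort before lower-case ones, which is why the all-caps variant comes first here.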

Here is a solution using good ol' recursion:

def get_all_variations(word):
    if len(word) <= 1:
        # Base case: a letter has two variations, e.g. a -> [a, A];
        # digits, other caseless characters, and the empty string have one.
        lower, upper = word.lower(), word.upper()
        return [lower] if lower == upper else [lower, upper]
    else:
        # Otherwise, recurse on the left and right halves and merge the results.
        word_mid_point = len(word) // 2
        left_vars = get_all_variations(word[:word_mid_point])
        right_vars = get_all_variations(word[word_mid_point:])
        variations = []
        for left_var in left_vars:
            for right_var in right_vars:
                variations.append(left_var + right_var)
        return variations

and

>>> get_all_variations("abc")
['abc', 'abC', 'aBc', 'aBC', 'Abc', 'AbC', 'ABc', 'ABC']
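A different way to write the recursion, shown here only as an alternative sketch, is to peel off one character at a time and yield results lazily from a generator. The name iter_variations is made up, and the sketch assumes characters without distinct cases should appear unchanged.

```python
def iter_variations(word):
    """Yield every case variation of word, leaving caseless characters untouched."""
    if not word:
        # The empty string has exactly one variation: itself.
        yield ""
        return
    head, tail = word[0], word[1:]
    # dict.fromkeys keeps order and drops the duplicate when lower == upper (e.g. digits).
    heads = list(dict.fromkeys((head.lower(), head.upper())))
    for rest in iter_variations(tail):
        for h in heads:
            yield h + rest

print(list(iter_variations("a1")))  # ['a1', 'A1']
print(len(list(iter_variations("abc"))))  # 8
```

Because it is a generator, callers can stop early without building the full list, though very long words would still run into Python's recursion limit.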

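If this isn't homework, you might be tempted to try itertools.combinations, but be aware that it does not respect character positions: it yields tuples such as ('a', 'b', 'A') and never produces variants such as 'Abc', so the result still needs filtering and is incomplete:

import itertools
print(list(itertools.combinations('abcABC', 3)))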
