
Create 2-tuples from a list

I need to generate 2-tuples from a list in Python such that, in a tuple (a, b), a != b, and if a tuple (a, b) has already been generated, (b, a) is skipped.

Here is what I have written. It serves the purpose.

However, when it is run on a pandas DataFrame, it takes quite some time.

def tuplize(word_list):
    tuple_list = []
    if len(word_list) == 1:
        return None
    else:
        for i in range(len(word_list)):
            for j in range(i+1, len(word_list)):
                a = tuple([word_list[i], word_list[j]])
                tuple_list.append(a)
        return tuple_list

I would like to know if there is a faster way to approach the problem.

Thanks in advance!

Update:

I tried the solution by @ThalishSajeed. I wrapped it in a function, and the function works perfectly when given a list of words as input. However, it does not behave the same way when I apply it to a pandas Series containing lists of words.

This was my function:

import itertools

def tuplize_faster(word_list):
    if len(word_list) <= 1:
        return None
    else:
        ret_object = itertools.combinations(word_list, 2)
        return [tuple(i) for i in ret_object]

and the result I got on passing a single list ( tuplize_faster(['Zero', 'rating', 'worst', 'service']) ) is:

[('Zero', 'rating'),
 ('Zero', 'worst'),
 ('Zero', 'service'),
 ('rating', 'worst'),
 ('rating', 'service'),
 ('worst', 'service')]

Applying the same function to a pandas Series containing lists of words

df_preprocessed['tuples'] = df_preprocessed.lemma_corrected.apply(lambda x: tuplize_faster(x)) 

gives this result:

 [('[', "'"),
 ('[', 'Z'),
 ('[', 'e'),
 ('[', 'r'),
 ('[', 'o'),
 ('[', "'"),
 ('[', ','),
 ('[', ' '),
 ('[', "'"),
 ('[', 'r'),
 ('[', 'a'),
 ('[', 't'),
 ('[', 'i'),
 ('[', 'n'),
 ('[', 't'),
 ('[', "'"),
 ('[', ','),
 ('[', ' '),
 ('[', "'"),
 ('[', 'w'),
 ('[', 'o'),
 ('[', 'r'),
 ('[', 's'),
 ('[', 't'),
 ('[', "'"),
 ('[', ','),
 ('[', ' '),
 ('[', "'"),
 ('[', 's'),
 ('[', 'e'),
 ('[', 'r'),
 ('[', 'v'),
 ('[', 'i'),
 ('[', 'c'),
 ('[', 'e'),
 ('[', "'"),
 ('[', ']'),
 ("'", 'Z'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'o'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 'r'),
 ("'", 'a'),
 ("'", 't'),
 ("'", 'i'),
 ("'", 'n'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 'w'),
 ("'", 'o'),
 ("'", 'r'),
 ("'", 's'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 ('Z', 'e'),
 ('Z', 'r'),
 ('Z', 'o'),
 ('Z', "'"),
 ('Z', ','),
 ('Z', ' '),
 ('Z', "'"),
 ('Z', 'r'),
 ('Z', 'a'),
 ('Z', 't'),
 ('Z', 'i'),
 ('Z', 'n'),
 ('Z', 't'),
 ('Z', "'"),
 ('Z', ','),
 ('Z', ' '),
 ('Z', "'"),
 ('Z', 'w'),
 ('Z', 'o'),
 ('Z', 'r'),
 ('Z', 's'),
 ('Z', 't'),
 ('Z', "'"),
 ('Z', ','),
 ('Z', ' '),
 ('Z', "'"),
 ('Z', 's'),
 ('Z', 'e'),
 ('Z', 'r'),
 ('Z', 'v'),
 ('Z', 'i'),
 ('Z', 'c'),
 ('Z', 'e'),
 ('Z', "'"),
 ('Z', ']'),
 ('e', 'r'),
 ('e', 'o'),
 ('e', "'"),
 ('e', ','),
 ('e', ' '),
 ('e', "'"),
 ('e', 'r'),
 ('e', 'a'),
 ('e', 't'),
 ('e', 'i'),
 ('e', 'n'),
 ('e', 't'),
 ('e', "'"),
 ('e', ','),
 ('e', ' '),
 ('e', "'"),
 ('e', 'w'),
 ('e', 'o'),
 ('e', 'r'),
 ('e', 's'),
 ('e', 't'),
 ('e', "'"),
 ('e', ','),
 ('e', ' '),
 ('e', "'"),
 ('e', 's'),
 ('e', 'e'),
 ('e', 'r'),
 ('e', 'v'),
 ('e', 'i'),
 ('e', 'c'),
 ('e', 'e'),
 ('e', "'"),
 ('e', ']'),
 ('r', 'o'),
 ('r', "'"),
 ('r', ','),
 ('r', ' '),
 ('r', "'"),
 ('r', 'r'),
 ('r', 'a'),
 ('r', 't'),
 ('r', 'i'),
 ('r', 'n'),
 ('r', 't'),
 ('r', "'"),
 ('r', ','),
 ('r', ' '),
 ('r', "'"),
 ('r', 'w'),
 ('r', 'o'),
 ('r', 'r'),
 ('r', 's'),
 ('r', 't'),
 ('r', "'"),
 ('r', ','),
 ('r', ' '),
 ('r', "'"),
 ('r', 's'),
 ('r', 'e'),
 ('r', 'r'),
 ('r', 'v'),
 ('r', 'i'),
 ('r', 'c'),
 ('r', 'e'),
 ('r', "'"),
 ('r', ']'),
 ('o', "'"),
 ('o', ','),
 ('o', ' '),
 ('o', "'"),
 ('o', 'r'),
 ('o', 'a'),
 ('o', 't'),
 ('o', 'i'),
 ('o', 'n'),
 ('o', 't'),
 ('o', "'"),
 ('o', ','),
 ('o', ' '),
 ('o', "'"),
 ('o', 'w'),
 ('o', 'o'),
 ('o', 'r'),
 ('o', 's'),
 ('o', 't'),
 ('o', "'"),
 ('o', ','),
 ('o', ' '),
 ('o', "'"),
 ('o', 's'),
 ('o', 'e'),
 ('o', 'r'),
 ('o', 'v'),
 ('o', 'i'),
 ('o', 'c'),
 ('o', 'e'),
 ('o', "'"),
 ('o', ']'),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 'r'),
 ("'", 'a'),
 ("'", 't'),
 ("'", 'i'),
 ("'", 'n'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 'w'),
 ("'", 'o'),
 ("'", 'r'),
 ("'", 's'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 (',', ' '),
 (',', "'"),
 (',', 'r'),
 (',', 'a'),
 (',', 't'),
 (',', 'i'),
 (',', 'n'),
 (',', 't'),
 (',', "'"),
 (',', ','),
 (',', ' '),
 (',', "'"),
 (',', 'w'),
 (',', 'o'),
 (',', 'r'),
 (',', 's'),
 (',', 't'),
 (',', "'"),
 (',', ','),
 (',', ' '),
 (',', "'"),
 (',', 's'),
 (',', 'e'),
 (',', 'r'),
 (',', 'v'),
 (',', 'i'),
 (',', 'c'),
 (',', 'e'),
 (',', "'"),
 (',', ']'),
 (' ', "'"),
 (' ', 'r'),
 (' ', 'a'),
 (' ', 't'),
 (' ', 'i'),
 (' ', 'n'),
 (' ', 't'),
 (' ', "'"),
 (' ', ','),
 (' ', ' '),
 (' ', "'"),
 (' ', 'w'),
 (' ', 'o'),
 (' ', 'r'),
 (' ', 's'),
 (' ', 't'),
 (' ', "'"),
 (' ', ','),
 (' ', ' '),
 (' ', "'"),
 (' ', 's'),
 (' ', 'e'),
 (' ', 'r'),
 (' ', 'v'),
 (' ', 'i'),
 (' ', 'c'),
 (' ', 'e'),
 (' ', "'"),
 (' ', ']'),
 ("'", 'r'),
 ("'", 'a'),
 ("'", 't'),
 ("'", 'i'),
 ("'", 'n'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 'w'),
 ("'", 'o'),
 ("'", 'r'),
 ("'", 's'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 ('r', 'a'),
 ('r', 't'),
 ('r', 'i'),
 ('r', 'n'),
 ('r', 't'),
 ('r', "'"),
 ('r', ','),
 ('r', ' '),
 ('r', "'"),
 ('r', 'w'),
 ('r', 'o'),
 ('r', 'r'),
 ('r', 's'),
 ('r', 't'),
 ('r', "'"),
 ('r', ','),
 ('r', ' '),
 ('r', "'"),
 ('r', 's'),
 ('r', 'e'),
 ('r', 'r'),
 ('r', 'v'),
 ('r', 'i'),
 ('r', 'c'),
 ('r', 'e'),
 ('r', "'"),
 ('r', ']'),
 ('a', 't'),
 ('a', 'i'),
 ('a', 'n'),
 ('a', 't'),
 ('a', "'"),
 ('a', ','),
 ('a', ' '),
 ('a', "'"),
 ('a', 'w'),
 ('a', 'o'),
 ('a', 'r'),
 ('a', 's'),
 ('a', 't'),
 ('a', "'"),
 ('a', ','),
 ('a', ' '),
 ('a', "'"),
 ('a', 's'),
 ('a', 'e'),
 ('a', 'r'),
 ('a', 'v'),
 ('a', 'i'),
 ('a', 'c'),
 ('a', 'e'),
 ('a', "'"),
 ('a', ']'),
 ('t', 'i'),
 ('t', 'n'),
 ('t', 't'),
 ('t', "'"),
 ('t', ','),
 ('t', ' '),
 ('t', "'"),
 ('t', 'w'),
 ('t', 'o'),
 ('t', 'r'),
 ('t', 's'),
 ('t', 't'),
 ('t', "'"),
 ('t', ','),
 ('t', ' '),
 ('t', "'"),
 ('t', 's'),
 ('t', 'e'),
 ('t', 'r'),
 ('t', 'v'),
 ('t', 'i'),
 ('t', 'c'),
 ('t', 'e'),
 ('t', "'"),
 ('t', ']'),
 ('i', 'n'),
 ('i', 't'),
 ('i', "'"),
 ('i', ','),
 ('i', ' '),
 ('i', "'"),
 ('i', 'w'),
 ('i', 'o'),
 ('i', 'r'),
 ('i', 's'),
 ('i', 't'),
 ('i', "'"),
 ('i', ','),
 ('i', ' '),
 ('i', "'"),
 ('i', 's'),
 ('i', 'e'),
 ('i', 'r'),
 ('i', 'v'),
 ('i', 'i'),
 ('i', 'c'),
 ('i', 'e'),
 ('i', "'"),
 ('i', ']'),
 ('n', 't'),
 ('n', "'"),
 ('n', ','),
 ('n', ' '),
 ('n', "'"),
 ('n', 'w'),
 ('n', 'o'),
 ('n', 'r'),
 ('n', 's'),
 ('n', 't'),
 ('n', "'"),
 ('n', ','),
 ('n', ' '),
 ('n', "'"),
 ('n', 's'),
 ('n', 'e'),
 ('n', 'r'),
 ('n', 'v'),
 ('n', 'i'),
 ('n', 'c'),
 ('n', 'e'),
 ('n', "'"),
 ('n', ']'),
 ('t', "'"),
 ('t', ','),
 ('t', ' '),
 ('t', "'"),
 ('t', 'w'),
 ('t', 'o'),
 ('t', 'r'),
 ('t', 's'),
 ('t', 't'),
 ('t', "'"),
 ('t', ','),
 ('t', ' '),
 ('t', "'"),
 ('t', 's'),
 ('t', 'e'),
 ('t', 'r'),
 ('t', 'v'),
 ('t', 'i'),
 ('t', 'c'),
 ('t', 'e'),
 ('t', "'"),
 ('t', ']'),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 'w'),
 ("'", 'o'),
 ("'", 'r'),
 ("'", 's'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 (',', ' '),
 (',', "'"),
 (',', 'w'),
 (',', 'o'),
 (',', 'r'),
 (',', 's'),
 (',', 't'),
 (',', "'"),
 (',', ','),
 (',', ' '),
 (',', "'"),
 (',', 's'),
 (',', 'e'),
 (',', 'r'),
 (',', 'v'),
 (',', 'i'),
 (',', 'c'),
 (',', 'e'),
 (',', "'"),
 (',', ']'),
 (' ', "'"),
 (' ', 'w'),
 (' ', 'o'),
 (' ', 'r'),
 (' ', 's'),
 (' ', 't'),
 (' ', "'"),
 (' ', ','),
 (' ', ' '),
 (' ', "'"),
 (' ', 's'),
 (' ', 'e'),
 (' ', 'r'),
 (' ', 'v'),
 (' ', 'i'),
 (' ', 'c'),
 (' ', 'e'),
 (' ', "'"),
 (' ', ']'),
 ("'", 'w'),
 ("'", 'o'),
 ("'", 'r'),
 ("'", 's'),
 ("'", 't'),
 ("'", "'"),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 ('w', 'o'),
 ('w', 'r'),
 ('w', 's'),
 ('w', 't'),
 ('w', "'"),
 ('w', ','),
 ('w', ' '),
 ('w', "'"),
 ('w', 's'),
 ('w', 'e'),
 ('w', 'r'),
 ('w', 'v'),
 ('w', 'i'),
 ('w', 'c'),
 ('w', 'e'),
 ('w', "'"),
 ('w', ']'),
 ('o', 'r'),
 ('o', 's'),
 ('o', 't'),
 ('o', "'"),
 ('o', ','),
 ('o', ' '),
 ('o', "'"),
 ('o', 's'),
 ('o', 'e'),
 ('o', 'r'),
 ('o', 'v'),
 ('o', 'i'),
 ('o', 'c'),
 ('o', 'e'),
 ('o', "'"),
 ('o', ']'),
 ('r', 's'),
 ('r', 't'),
 ('r', "'"),
 ('r', ','),
 ('r', ' '),
 ('r', "'"),
 ('r', 's'),
 ('r', 'e'),
 ('r', 'r'),
 ('r', 'v'),
 ('r', 'i'),
 ('r', 'c'),
 ('r', 'e'),
 ('r', "'"),
 ('r', ']'),
 ('s', 't'),
 ('s', "'"),
 ('s', ','),
 ('s', ' '),
 ('s', "'"),
 ('s', 's'),
 ('s', 'e'),
 ('s', 'r'),
 ('s', 'v'),
 ('s', 'i'),
 ('s', 'c'),
 ('s', 'e'),
 ('s', "'"),
 ('s', ']'),
 ('t', "'"),
 ('t', ','),
 ('t', ' '),
 ('t', "'"),
 ('t', 's'),
 ('t', 'e'),
 ('t', 'r'),
 ('t', 'v'),
 ('t', 'i'),
 ('t', 'c'),
 ('t', 'e'),
 ('t', "'"),
 ('t', ']'),
 ("'", ','),
 ("'", ' '),
 ("'", "'"),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 (',', ' '),
 (',', "'"),
 (',', 's'),
 (',', 'e'),
 (',', 'r'),
 (',', 'v'),
 (',', 'i'),
 (',', 'c'),
 (',', 'e'),
 (',', "'"),
 (',', ']'),
 (' ', "'"),
 (' ', 's'),
 (' ', 'e'),
 (' ', 'r'),
 (' ', 'v'),
 (' ', 'i'),
 (' ', 'c'),
 (' ', 'e'),
 (' ', "'"),
 (' ', ']'),
 ("'", 's'),
 ("'", 'e'),
 ("'", 'r'),
 ("'", 'v'),
 ("'", 'i'),
 ("'", 'c'),
 ("'", 'e'),
 ("'", "'"),
 ("'", ']'),
 ('s', 'e'),
 ('s', 'r'),
 ('s', 'v'),
 ('s', 'i'),
 ('s', 'c'),
 ('s', 'e'),
 ('s', "'"),
 ('s', ']'),
 ('e', 'r'),
 ('e', 'v'),
 ('e', 'i'),
 ('e', 'c'),
 ('e', 'e'),
 ('e', "'"),
 ('e', ']'),
 ('r', 'v'),
 ('r', 'i'),
 ('r', 'c'),
 ('r', 'e'),
 ('r', "'"),
 ('r', ']'),
 ('v', 'i'),
 ('v', 'c'),
 ('v', 'e'),
 ('v', "'"),
 ('v', ']'),
 ('i', 'c'),
 ('i', 'e'),
 ('i', "'"),
 ('i', ']'),
 ('c', 'e'),
 ('c', "'"),
 ('c', ']'),
 ('e', "'"),
 ('e', ']'),
 ("'", ']')]

Is there something wrong with the way I have used apply?

Have you tried itertools? (link to documentation)

retObject = itertools.combinations(word_list, 2)

The second argument is 2, since you want 2-tuples.

Edit: to return the result as a list:

s = [tuple(i) for i in retObject]
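
Putting the two snippets above together, here is a minimal self-contained sketch. The function names `tuplize_combinations` and `tuplize_distinct` are made up for illustration and are not from the question. Two points worth noting: `itertools.combinations` already yields tuples, so wrapping each item in `tuple()` is not needed, and it pairs items by position rather than by value, so a list with repeated words can still produce a pair where a == b. The second function removes repeated words first, on the assumption that this is what the a != b requirement means. Unlike the function in the question, both return an empty list (not None) for inputs shorter than two items.

```python
import itertools

def tuplize_combinations(word_list):
    """Return every unordered pair of positions in word_list as a 2-tuple."""
    # combinations() already yields tuples, so list() is enough.
    return list(itertools.combinations(word_list, 2))

def tuplize_distinct(word_list):
    """Same as above, but drop repeated words first so a != b holds by value."""
    # dict.fromkeys keeps the first occurrence of each word, in order.
    unique_words = list(dict.fromkeys(word_list))
    return list(itertools.combinations(unique_words, 2))

words = ['Zero', 'rating', 'worst', 'service']
print(tuplize_combinations(words))

# With a repeated word, the positional version yields ('a', 'a');
# the distinct version does not.
print(tuplize_combinations(['a', 'b', 'a']))  # [('a', 'b'), ('a', 'a'), ('b', 'a')]
print(tuplize_distinct(['a', 'b', 'a']))      # [('a', 'b')]
```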

Edited to show that the approach works for a pandas Series.

a = pd.Series(['Zero', 'rating', 'worst', 'service'])
tuplize_faster(a)
Output:
[('Zero', 'rating'), ('Zero', 'worst'), ('Zero', 'service'), ('rating', 'worst'), ('rating', 'service'), ('worst', 'service')]
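
Note that the Series in this example holds one word per element, which is a different situation from the question, where each row of `lemma_corrected` is meant to hold a whole list. The character-level pairs in the question's output (`'['`, `"'"`, `'Z'`, ...) are what one would expect if each cell were the string form of a list rather than a real list, for instance after the column was saved to and reloaded from CSV. That is an assumption, since the question does not show how the column was built. A minimal sketch of the difference, using only the standard library, with `parse_cell` as a hypothetical helper:

```python
import ast
import itertools

def tuplize_faster(word_list):
    # Same logic as the function in the question.
    if len(word_list) <= 1:
        return None
    return list(itertools.combinations(word_list, 2))

def parse_cell(cell):
    """Hypothetical helper: turn "['a', 'b']" into ['a', 'b']; leave real lists alone."""
    if isinstance(cell, str):
        # literal_eval only evaluates Python literals, never arbitrary code.
        return ast.literal_eval(cell)
    return cell

cell = "['Zero', 'rating', 'worst', 'service']"  # a string, not a list

# Iterating over a string yields characters, hence the character pairs.
print(tuplize_faster(cell)[:3])
# [('[', "'"), ('[', 'Z'), ('[', 'e')]

# After parsing the string back into a list, the pairs are words again.
print(tuplize_faster(parse_cell(cell))[:3])
# [('Zero', 'rating'), ('Zero', 'worst'), ('Zero', 'service')]
```

If this turns out to be the cause, converting the column once, for example with `df_preprocessed.lemma_corrected.apply(parse_cell)`, before applying `tuplize_faster` should give word pairs. Checking `type(df_preprocessed.lemma_corrected.iloc[0])` is a quick way to confirm whether the cells are strings or lists.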
