
Comparing a dynamic number of lists (of unequal length) for common entries in Python

I'm trying to compare a variable number of lists of different lengths that are generated dynamically from user input and pattern matching. I haven't included all the matching code, but you should get the idea of what I'm trying to do.

Following suggestions from another Stack Overflow post, I've used a 'list of lists'. I've used the number of queries entered by the user to create the lists and access them.

At the end of the program I want to do some comparison between the lists, but I can't get my head around how to do this. To start, I'd just like to compare list elements and find those that appear in all of the lists; however, I'd also like to perform other list comparisons at a later date. I just can't figure out how to access individual lists once I'm outside of the 'for query in dom_queries' loop.

I'm super stuck and would really appreciate some help!!

Thanks,

# set dom_count and initialise query_list
dom_count = 0
dom_queries = [] 
# get the number of query domains
domain_number = raw_input('How many domains do you want to find intersects for? ')
# Grab query ID's
while dom_count < int(domain_number):
 dom_count += 1
 query_domain  = raw_input('domain ID query ' + str(dom_count) + ': ')
 dom_queries.append(query_domain)

# initialise lists for query_matches
list_of_lists = []
for i in range(len(dom_queries)):
 list_of_lists.append( [] )
list_pos = 0

# do some matching here for each dom_query, incrementing list position for each query 
# and put matches into the list
for query in dom_queries:
 some_match = re.search(r'XYZ',some_line)
 list_of_lists[int(list_pos)].append(some_match.group())
 list_pos += 1

# HERE IS WHERE I'M STUCK!!!
# I would like to compare all the lists generated and find list entries 
# that exist in each list (can be any number of lists with different lengths).

for i in range (len(dom_queries)):
 common = list(set(list_of_lists[i] & .... \/^.^\/  ??

From all your lists you can create one set containing the items that are present in every list, using the set method intersection(). Passing several arguments to it in one call works starting with Python 2.6, and you'll have to convert the lists to sets first (at least the one you call the method on).

http://docs.python.org/2/library/stdtypes.html#set.intersection
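
As a minimal sketch of that suggestion, the lists can be converted to sets and unpacked into a single intersection() call. The helper name common_entries and the sample data are made up for illustration; they are not part of the original question.

```python
def common_entries(list_of_lists):
    """Return the set of items that appear in every list."""
    if not list_of_lists:
        # No lists at all: nothing can be common.
        return set()
    sets = [set(lst) for lst in list_of_lists]
    # Unpack the remaining sets as arguments (multi-argument form, Python 2.6+).
    return sets[0].intersection(*sets[1:])


# Hypothetical match results for three queries, of different lengths.
list_of_lists = [
    ["a", "b", "c", "d"],
    ["b", "c"],
    ["c", "b", "x", "y", "z"],
]
common = common_entries(list_of_lists)
print(sorted(common))  # ['b', 'c']
```

Because the result is a set, duplicates within a list and the original ordering are lost; sort or re-filter one of the original lists if order matters.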

First, just a simplification. You can use a list comprehension to create the empty list of lists (just a bit more Pythonic). Also, let's make it a list of sets instead of a list of lists.

list_of_sets = [set() for i in range(int(domain_number))]

Then we can do something like this:

common_set = set(list_of_sets[0]) if list_of_sets else set()
for s in list_of_sets[1:]:
    common_set.intersection_update(s)

So, you start with a copy of the first set and then, for each of the remaining sets in the list, keep only the items it shares with that set (intersection: all the shared items between the two). intersection_update modifies common_set in place and returns None, so its result must not be assigned back to common_set. Note that merging the intersections of neighbouring pairs with update would instead collect items shared by any two adjacent sets, not items present in all of them. Later, if you want to manually add an item to the common set, you would use the add method.
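
To connect this back to the question, here is a hedged sketch of filling one set per query and then reading the sets again after the loop has finished. The sample lines, the query IDs, the assumed line format, and the helper collect_matches are all invented for illustration, since the original matching code was not shown; enumerate stands in for the manual list_pos counter.

```python
import re

# Hypothetical input: each line pairs a query ID with a matched name.
lines = [
    "D1 alpha",
    "D1 beta",
    "D2 beta",
    "D2 gamma",
    "D3 beta",
]
dom_queries = ["D1", "D2", "D3"]


def collect_matches(dom_queries, lines):
    """Build one set of matched names per query, in query order."""
    list_of_sets = [set() for _ in dom_queries]
    for pos, query in enumerate(dom_queries):
        # Assumed line format: "<query ID> <name>"; adapt the pattern as needed.
        pattern = re.compile(r"^" + re.escape(query) + r"\s+(\S+)")
        for line in lines:
            match = pattern.search(line)
            if match:  # search() returns None when nothing matches
                list_of_sets[pos].add(match.group(1))
    return list_of_sets


list_of_sets = collect_matches(dom_queries, lines)

# The sets stay accessible by index after the loop has finished.
first_query_matches = list_of_sets[0]

# Items found for every query.
common_set = set.intersection(*list_of_sets) if list_of_sets else set()
print(sorted(common_set))  # ['beta']
```

Keeping the per-query sets around, rather than only the final result, leaves room for other comparisons later, such as differences or unions between particular queries.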
