
How to get elements of a collection that are not in another collection?

I am a Java programmer, but I have a task that is better solved in Python (it is more efficient and better suited to the server), a language I am not familiar with.

The task is as follows. I have a file that contains about 5 million sorted ids in this format:

00000011-1f0e-4d89-b658-af53b36c882e
0000008a-5816-4324-82f6-9242a8867094
000000be-d08c-41b9-97f3-594d2660dfb5
000000f2-ea63-48c0-98f6-1dbb25f0249e
0000014d-f6b0-4b3e-b767-14cd2495fd81
00000155-ec3b-4d1a-a3ae-28e95cfc79c7
00000231-65f9-424a-bf03-1d3cbefc6c40
00000281-cb21-4d3c-ba13-874161962567
000002be-6e9d-455d-aa16-49e2ac242868
00000375-4d9a-4dd6-8e0c-38e5c2134a3c
00000383-fc20-4154-921c-c187bb3f6628
000003fc-7a06-4525-a12a-df64732324e5
00000420-af64-4015-9bc4-6b9e18b86183
00000476-1bf9-4608-8979-d60ecd5b368b
...

I also have another file that contains about 60 million sorted ids in the same format.

I need to read all ids from the first file into a variable, say l1, and all ids from the second file into a variable, say l2. Then I want to find all elements of l1 that are absent from l2 and write them to a third file. There are many such first files, so I have to repeat these steps from time to time.

What is the best way to solve this problem, which object types should I use for l1 and l2 (the lists of ids are sorted), and what would the complete Python script look like?

Elements in the first set that are not in the second set:

s1 = {1, 2, 3, 4, 5}
s2 = {3, 4, 5, 6, 7}
s3 = s1 - s2  # set difference: elements of s1 that are not in s2
print(s3)  # {1, 2}
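
Applied to the id files, the same idea might look like the sketch below. It assumes one id per line. The function names `ids_missing_from` and `write_ids` and the file paths are hypothetical, chosen only for illustration. Only the smaller file is held in a set; the larger file is streamed through `difference_update`, which accepts any iterable.

```python
def ids_missing_from(first_path, second_path):
    """Return the ids in first_path that do not appear in second_path."""
    # Load the smaller file (assumed: one id per line) into a set.
    with open(first_path) as f1:
        remaining = {line.strip() for line in f1 if line.strip()}
    # Stream the larger file; ids found there are dropped from the set.
    with open(second_path) as f2:
        remaining.difference_update(line.strip() for line in f2)
    return remaining


def write_ids(ids, out_path):
    """Write the ids to out_path in sorted order, one per line."""
    with open(out_path, "w") as out:
        for item in sorted(ids):
            out.write(item + "\n")
```

A call such as `write_ids(ids_missing_from("first.txt", "second.txt"), "missing.txt")` would then produce the third file. This version does not rely on the inputs being sorted.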

For this file-merging scenario a better-suited algorithm exists: because both files are already sorted, they can be compared in a single merge-style pass without loading either one into memory.
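
As a hedged sketch of a merge-style pass over two sorted inputs: the code below walks both files in step and keeps only one line from each in memory. It assumes one id per line and that both files are sorted in the same order Python uses to compare strings (which should hold for lowercase hex ids sorted bytewise, but is worth verifying). The names `sorted_difference` and `write_missing_ids` are hypothetical.

```python
def sorted_difference(first, second):
    """Yield items of sorted iterable `first` that are absent from sorted iterable `second`."""
    second_iter = iter(second)
    current = next(second_iter, None)  # None marks the end of `second`
    for item in first:
        # Advance the second stream until it catches up with `item`.
        while current is not None and current < item:
            current = next(second_iter, None)
        # `item` is missing if `second` is exhausted or has moved past it.
        if current is None or current != item:
            yield item


def write_missing_ids(first_path, second_path, out_path):
    """Write ids found in first_path but not in second_path to out_path."""
    with open(first_path) as f1, open(second_path) as f2, open(out_path, "w") as out:
        # Assumption: one id per line; blank lines are skipped.
        ids1 = (line.strip() for line in f1 if line.strip())
        ids2 = (line.strip() for line in f2 if line.strip())
        for missing in sorted_difference(ids1, ids2):
            out.write(missing + "\n")
```

The output keeps the order of the first file. If the sort order of the files cannot be guaranteed, a set-based difference is the safer choice.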

