How to get elements of a collection that are not in another collection?
I am a Java programmer, but I have a task that is better solved with Python (it is more efficient and better suited to the server), and Python is unfamiliar to me.
The task is as follows. I have a file that contains sorted ids (~5 million ids) in the following format:
00000011-1f0e-4d89-b658-af53b36c882e
0000008a-5816-4324-82f6-9242a8867094
000000be-d08c-41b9-97f3-594d2660dfb5
000000f2-ea63-48c0-98f6-1dbb25f0249e
0000014d-f6b0-4b3e-b767-14cd2495fd81
00000155-ec3b-4d1a-a3ae-28e95cfc79c7
00000231-65f9-424a-bf03-1d3cbefc6c40
00000281-cb21-4d3c-ba13-874161962567
000002be-6e9d-455d-aa16-49e2ac242868
00000375-4d9a-4dd6-8e0c-38e5c2134a3c
00000383-fc20-4154-921c-c187bb3f6628
000003fc-7a06-4525-a12a-df64732324e5
00000420-af64-4015-9bc4-6b9e18b86183
00000476-1bf9-4608-8979-d60ecd5b368b
...
I also have another file, which contains ~60 million sorted ids. The format is the same.
I need to read all ids from the first file into a variable, for example l1, and all ids from the second file into a variable, for example l2. After that I want to find all elements of l1 that are absent from l2 and write them to a third file. There are many first files, which is why I must repeat these actions from time to time.
Please tell me the best way to solve this problem, which object types to use for l1 and l2 (the lists of ids are sorted), and what the Python script would look like overall.
Element in first set, not in second set:
s1 = {1, 2, 3, 4, 5}
s2 = {3, 4, 5, 6, 7}
s3 = s1 - s2  # set difference: elements of s1 that are not in s2
print(s3)  # {1, 2}
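Applied to the files from the question, the same idea might look like the sketch below. It assumes one id per line. The function name `write_missing_ids` and its path parameters are hypothetical, chosen only for illustration. Only the second file is loaded into a set; the first file is streamed, so the output keeps the order of the first file. Holding ~60 million ids in a set may need several gigabytes of RAM.

```python
def write_missing_ids(first_path, second_path, out_path):
    """Write ids found in first_path but not in second_path to out_path.

    Assumes one id per line. Returns the number of ids written.
    """
    # Load every id from the second (larger) file into a set for fast lookups.
    with open(second_path) as f:
        l2 = {line.strip() for line in f}

    written = 0
    # Stream the first file so that only the set stays in memory.
    with open(first_path) as src, open(out_path, "w") as dst:
        for line in src:
            item = line.strip()
            if item and item not in l2:
                dst.write(item + "\n")
                written += 1
    return written
```

Since there are many first files, the set built from the second file could be kept and reused across them instead of being rebuilt on every call.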
For this file-merge scenario, a better-suited algorithm probably exists, and it is worth searching for one.
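One algorithm that fits this scenario, because both files are already sorted, is a single merge-style pass that walks the two files in parallel and never holds either of them in memory. The following is a minimal sketch, not a definitive implementation. The names `sorted_difference` and `write_difference` are hypothetical, and the sketch assumes one id per line and that the files are sorted in the same order Python uses when comparing strings (which holds for lowercase hexadecimal ids like the ones shown).

```python
def sorted_difference(first, second):
    """Yield items of `first` that are absent from `second`.

    Both inputs must be iterables sorted in ascending order.
    """
    second_iter = iter(second)
    current = next(second_iter, None)  # None marks the end of `second`
    for item in first:
        # Advance the second stream until it catches up with `item`.
        while current is not None and current < item:
            current = next(second_iter, None)
        if current is None or current != item:
            yield item


def write_difference(first_path, second_path, out_path):
    """Stream both sorted files and write the ids missing from the second."""
    with open(first_path) as f1, open(second_path) as f2, \
            open(out_path, "w") as out:
        # Strip newlines and skip blank lines in both inputs.
        ids1 = (s for s in (line.strip() for line in f1) if s)
        ids2 = (s for s in (line.strip() for line in f2) if s)
        for missing in sorted_difference(ids1, ids2):
            out.write(missing + "\n")
```

Each file is read once from start to finish, so memory use stays small regardless of file size. The trade-off is that the 60-million-id file is re-read for every first file.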