Save the list when processing more than one file (Python)

I have a Python script that processes several files in a folder. The results look like this, with two columns called user_agent and user_type:

firefox, pc
IE, pc
iPhone, mobile
....

Since the results from the same or different files may be duplicated, I use a list to keep track of the unique combinations.

if (user_agent, user_type) not in lookuplist:
    lookuplist.append((user_agent, user_type))
    print(user_agent, user_type)

Now the problem is that I have more than one raw data file to parse. How can I "save" lookuplist when one file is done, so that when the second one starts it still knows that, for example, (firefox, pc) already exists and I don't get duplicated results?

Many thanks.

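First, you should use a set rather than a list for lookuplist: a membership test on a set takes constant time on average, whereas a list is scanned element by element. Second, create the set once before the loop, open all the files inside that loop, and check for duplicates there, so the same set carries over from one file to the next.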
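A minimal sketch of that idea, in case it helps. The question does not show how the raw files are parsed, so parse_file below is a hypothetical stand-in that assumes each line already looks like "firefox, pc"; swap in your real parsing logic. The function name unique_pairs is also just an illustration. The point is only that the set is created once, outside the per-file loop.

```python
import csv


def parse_file(path):
    """Hypothetical parser: yield (user_agent, user_type) pairs from one file.

    Assumes comma-separated lines such as "firefox, pc"; the real raw
    data format is not shown in the question.
    """
    with open(path, newline="") as handle:
        for row in csv.reader(handle, skipinitialspace=True):
            if len(row) >= 2:  # skip blank or malformed lines
                yield row[0].strip(), row[1].strip()


def unique_pairs(paths):
    """Yield each (user_agent, user_type) pair once across all files."""
    seen = set()  # created once, so it survives from one file to the next
    for path in paths:
        for pair in parse_file(path):
            if pair not in seen:
                seen.add(pair)
                yield pair
```

You would then loop over unique_pairs(list_of_paths) and print each pair. Because seen lives outside the per-file loop, a pair first met in one file is skipped when it shows up again in a later one.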
