What is a better way to get the difference of two lists?

Question

There is a directory in which new files, such as log files, are generated all the time.

My goal is to count the files generated in each 10-minute interval, in real time. The data looks like this:

00:00 ~ 00:10        10 files

00:10 ~ 00:20        23 files

...

23:50 ~ 23:59        12 files

So my idea is to run a statistics script every 10 minutes as a crontab task on a Linux system. The logic for the first run: get the current file list with glob.glob("*").

Call that list A. When the script runs the next time (10 minutes later), it runs glob again to get the current file list B. I need the entries that are in B but not in A, so that I can count them. How can I do this? If you know another good way, please share.

Answer 1

You want to look into sets. You can do something like:

setA = set(listA)
setB = set(listB)
new_list = list(setB - setA)

You can also do additional set logic to identify files that are deleted and such.
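To make the "additional set logic" concrete, here is a minimal sketch using two made-up snapshots of a directory. The file names are hypothetical and only serve as illustration; the operators themselves are the standard ones on Python's built-in set type.

```python
# Hypothetical snapshots of the directory, taken 10 minutes apart.
list_a = ["app.log", "app.log.1", "old.log"]
list_b = ["app.log", "app.log.1", "app.log.2", "new.log"]

set_a = set(list_a)
set_b = set(list_b)

added = set_b - set_a       # names in B but not in A (new files)
removed = set_a - set_b     # names in A but not in B (deleted files)
unchanged = set_a & set_b   # names present in both snapshots

# The question asks for a count, so len() of the difference is enough.
print(len(added), sorted(added))
```

Sets are unordered, so sort the result if the output order matters.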

Answer 2

As I commented on @tcaswell's answer, using Python's built-in set class is an excellent way to solve a problem like this. Here's some sample code loosely based on Tim Golden's Python Stuff article Watch a Directory for Changes:

import os

firstime = False
path_to_watch = '.'

try:
    with open('filelist.txt', 'rt') as filelist:
        before = set(line.strip() for line in filelist)
except IOError:
    before = set(os.listdir(path_to_watch))
    firstime = True

if firstime:
    after = before
else:
    after = set(os.listdir(path_to_watch)) - {'filelist.txt'}  # skip our own state file
    added = after-before
    removed = before-after
    if added:
        print('Added: ' + ', '.join(added))
    if removed:
        print('Removed: ' + ', '.join(removed))

# replace/create filelist
with open('filelist.txt', 'wt') as filelist:
    filelist.write('\n'.join(after) + '\n')
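Since the question asks for a count per interval rather than a list of names, the same idea can be wrapped in a function that returns the number of new files. This is only a sketch under some assumptions: the helper name count_new_files is made up, the state file is kept outside the watched directory so it is never counted itself, and file names are assumed not to contain newlines. The temporary directories exist only to make the example self-contained.

```python
import os
import tempfile


def count_new_files(directory, state_file):
    """Return how many names appeared in `directory` since the last call.

    `state_file` holds the previous listing, one name per line.
    (Hypothetical helper, not part of the original answer.)
    """
    current = set(os.listdir(directory))
    try:
        with open(state_file) as f:
            previous = set(line.rstrip("\n") for line in f)
    except IOError:
        # No earlier snapshot: treat the first run as the baseline.
        previous = current

    # Save the current listing for the next run.
    with open(state_file, "w") as f:
        for name in sorted(current):
            f.write(name + "\n")

    return len(current - previous)


# Small demonstration using throwaway directories.
watched = tempfile.mkdtemp()
state = os.path.join(tempfile.mkdtemp(), "filelist.txt")

first = count_new_files(watched, state)   # baseline run
open(os.path.join(watched, "a.log"), "w").close()
open(os.path.join(watched, "b.log"), "w").close()
second = count_new_files(watched, state)  # two files were created since
print(first, second)
```

Run from cron every 10 minutes, the return value would be the count for that interval. Files that are created and deleted between two runs are not seen by this approach.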

2 answers

solution1
3 ACCEPTED 2012-11-16 16:31:51

solution2
0 2012-11-16 17:58:19