
Python: Search for patterns from file1 in a list of files

I have a file that contains a list of patterns. Each pattern has the form high:low, with the high and low values separated by a colon.

Sample patterns in file1:

-6447867851221037056:-3788579599719006014

-6449238495544274944:-3932696454242172736

-6449265231715692544:-4004752252983770939

-6449203349826891776:-2995947018784538426

-6447968581089030144:-3659104829784325951

-6449244980944891904:-1059398397250633536

-6449247532155465728:-1915082300761767744

-6447984223359922176:-4220924871888797497

Now, I have a list of files in a search directory. I want to search for the patterns from the above file in all the files in the search directory whose names match the pattern *yyyy-mm-dd-hh*.

For example:

Search for all the patterns from file1 in the search directory, using the filename pattern *2015-09-07-06*:

search_file.2015-09-07-05-45

search_file.2015-09-07-06-47

search_file.2015-09-07-06-48

search_file.2015-09-07-06-50

search_file.2015-09-07-06-52

I know how to do this in bash (grep -f file1 *2015-09-07-06*), but I'm new to Python and have little idea how to proceed. Any pointers are appreciated.

Check out this package: https://amoffat.github.io/sh/

It lets you run shell commands from Python easily.

You can install it with pip.
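
If installing a package is not an option, the same idea (let grep do the work, drive it from Python) can be sketched with the standard library's subprocess module instead of sh. This is only a sketch under some assumptions: grep is on the PATH, the function name grep_patterns is made up for illustration, and the -F flag (treat patterns as fixed strings) is an addition that the original bash command did not use.

```python
import glob
import subprocess


def grep_patterns(pattern_file, file_glob):
    """Run `grep -F -f pattern_file <files>` and return its output lines."""
    # Expand the glob in Python, so no shell is involved.
    files = sorted(glob.glob(file_glob))
    if not files:
        return []
    # "--" ends option parsing, so file names starting with "-" are safe.
    result = subprocess.run(
        ["grep", "-F", "-f", pattern_file, "--"] + files,
        capture_output=True,
        text=True,
    )
    # grep exits with 0 on a match, 1 on no match, and >1 on a real error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

Calling grep_patterns("file1", "*2015-09-07-06*") from the search directory should return the same lines the bash one-liner prints. When more than one file is searched, grep prefixes each line with the file name.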

The most efficient and easy way is to use shell commands; you probably already know os.system for running shell commands from Python. I have tried the same logic in Python:

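    import glob

    # Files in the current directory whose names contain the hour stamp.
    file_list = glob.glob("*2015-09-07-06*")

    input_filename = "file1.txt"
    with open(input_filename) as pattern_file:
        # Strip newlines so a missing final newline cannot hide a match.
        patterns = {line.strip() for line in pattern_file if line.strip()}

    for each_file in file_list:
        with open(each_file) as search_file:
            file_data = {line.strip() for line in search_file}
        # Set intersection: only lines that consist of exactly one pattern match.
        identified = patterns & file_data
        if identified:
            print("filename : " + each_file + " : " + str(identified))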
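
The set intersection above only reports lines that are exactly equal to a pattern, whereas grep -f also reports lines that merely contain one. A minimal sketch of that substring behaviour follows; the names load_patterns and search_files are made up for illustration, and the patterns are assumed to be plain strings rather than regular expressions.

```python
import glob


def load_patterns(pattern_file):
    """Read one pattern per line, dropping blank lines and surrounding whitespace."""
    with open(pattern_file) as fh:
        return [line.strip() for line in fh if line.strip()]


def search_files(pattern_file, file_glob):
    """Yield (filename, line_number, line) for every line containing any pattern."""
    patterns = load_patterns(pattern_file)
    for name in sorted(glob.glob(file_glob)):
        # errors="replace" keeps the scan going if a file has undecodable bytes.
        with open(name, errors="replace") as fh:
            for number, line in enumerate(fh, start=1):
                if any(p in line for p in patterns):
                    yield name, number, line.rstrip("\n")
```

Iterating over search_files("file1.txt", "*2015-09-07-06*") in the search directory gives one tuple per matching line. With a very large pattern file, the any(...) scan is slower than grep.
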
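
Alternatively, re.search can pick out the lines of a file that contain the date string:

    import re

    def searchpattern():
        with open("file1.txt", 'r') as fobj:
            pattern = '2015-09-07-06'
            for line in fobj:
                # Print only the lines that contain the date string.
                if re.search(pattern, line):
                    print(line.rstrip("\n"))

    if __name__ == '__main__':
        searchpattern()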

Output:

2015-09-07-06-45
2015-09-07-06-46
