How to attach the name of each file in a directory to a list with Python?

I am applying some regex to a folder full of .txt files in order to extract some specific patterns like this:

def retrive(directory, a_regex):
    for filename in glob.glob(os.path.join(directory, '*.txt')):
        with open(filename, 'r') as file:
            important_stuff = re.findall(a_regex, file.read(), re.S)
            my_list = [tuple([j.split()[0] for j in i]) for i in important_stuff]
            print my_list


lists_per_file = retrive(directory,regex_)

The output is the desired content of each file, printed as one list per file:

[interesting stuff 1]
[interesting stuff 2]
[interesting stuff 3]
...
[interesting stuff n]
[interesting stuff n-1]

How can I attach to each list the name of the file it came from, i.e., something like this:

[interesting stuff 1], name_of_document_1
[interesting stuff 2], name_of_document_2
[interesting stuff 3], name_of_document_3
...
[interesting stuff n], name_of_document_n
[interesting stuff n-1], name_of_document_n-1

Thanks in advance guys.

If you want to print the list and then the filename without a newline between the two, you will first have to turn the list into a string, then strip off the brackets from around the list. After that you can grab the filename from the filepath that you have, and put the two together.

See the code below:

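import glob
import os
import re


def retrive(directory, a_regex):
    for filename in glob.glob(os.path.join(directory, '*.txt')):
        with open(filename, 'r') as file:
            important_stuff = re.findall(a_regex, file.read(), re.S)
            my_list = [tuple([j.split()[0] for j in i]) for i in important_stuff]
            # print my_list # old line
            # Drop the outer brackets, then append the bare file name.
            # os.path.basename also handles Windows path separators.
            print(str(my_list).strip('[]') + ', ' + os.path.basename(filename))


# retrive() only prints its results; it does not return anything.
retrive(directory, regex_)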
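
Note that the function above only prints, so `lists_per_file` ends up as `None`. If the goal is to keep each list together with its file name for later use, one option is to collect the pairs and return them. The following is a minimal sketch, not the only way to do it: the name `retrieve_with_names` is made up for this example, and it assumes (as the original list comprehension does) that the regex contains at least two capture groups, each matching non-empty text.

```python
import glob
import os
import re


def retrieve_with_names(directory, a_regex):
    """Return a list of (matches, file_name) pairs, one per .txt file."""
    results = []
    # sorted() makes the order predictable; glob gives no ordering guarantee.
    for path in sorted(glob.glob(os.path.join(directory, '*.txt'))):
        with open(path, 'r') as handle:
            important_stuff = re.findall(a_regex, handle.read(), re.S)
        # Same transformation as in the question: keep the first word
        # of every captured group.
        my_list = [tuple(j.split()[0] for j in i) for i in important_stuff]
        # Pair the list with the bare file name (directory part removed).
        results.append((my_list, os.path.basename(path)))
    return results
```

Each element of the returned list can then be unpacked in a loop, for example `for my_list, name in retrieve_with_names(directory, regex_):`, and printed or processed as needed.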

