How can I read specific files in a folder (files within a range) in Python?

Question

For example, I have some 43,000 txt files in my folder. However, I don't want to read all of them, just those in a given range, such as 1.txt through 14400.txt. How can I achieve this in Python? For now, I'm reading all the files in the directory like this:

import glob

for each in glob.glob("data/*.txt"):
    with open(each, 'r') as infile:
        content = infile.readlines()
    # Write the same lines to a .csv file next to the original
    with open('{}.csv'.format(each[0:-4]), 'w') as outfile:
        outfile.writelines(content)

Any way I can achieve the desired results?

Answer 1

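If the files have numerically consecutive names starting with 1.txt, you can use range() to construct the filenames. Note that range() excludes its end value, so the upper bound must be 14401 to include 14400.txt:

for num in range(1, 14401):
    filename = "data/%d.txt" % num
    # open and process filename here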
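
A fuller sketch of this approach, combined with the txt-to-csv copy from the question. It assumes the files are named 1.txt, 2.txt, ... inside one folder and that missing numbers should simply be skipped; the helper name copy_range_to_csv is hypothetical:

```python
import os


def copy_range_to_csv(folder, first, last):
    """Copy folder/<n>.txt to folder/<n>.csv for first <= n <= last.

    Returns the list of numbers that were actually converted.
    """
    converted = []
    for num in range(first, last + 1):  # +1 because range() excludes its end
        src = os.path.join(folder, "%d.txt" % num)
        if not os.path.isfile(src):
            continue  # assumption: silently skip numbers with no file
        dst = os.path.join(folder, "%d.csv" % num)
        with open(src, "r") as infile, open(dst, "w") as outfile:
            outfile.writelines(infile.readlines())
        converted.append(num)
    return converted
```

For the case in the question this would be called as copy_range_to_csv("data", 1, 14400).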

Answer 2

Since glob.glob() returns a list, you can iterate over just a section of it using a slice:

import glob

for each in glob.glob("*")[:5]:
    print(each)

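Just use variable slice boundaries and I think this achieves the results you are looking for. Keep in mind that glob.glob() returns names in arbitrary order, so sort the list first if the slice needs to match a particular range of filenames.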

Edit: A slice that extends past the end of the list does not raise an error in Python; it simply returns fewer items. If you expect an exact number of files, check the length of the result.
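
For a slice to correspond to "1.txt through N.txt", the list has to be in numeric order first; a plain sorted() would put 10.txt before 2.txt. A minimal sketch, assuming the filenames are purely numeric with a .txt extension (the helper name numbered_txt_files is hypothetical):

```python
import glob
import os


def numbered_txt_files(folder):
    """Return folder/*.txt paths whose stem is an integer, in numeric order."""
    numbered = []
    for path in glob.glob(os.path.join(folder, "*.txt")):
        stem = os.path.splitext(os.path.basename(path))[0]
        if stem.isdecimal():  # ignore names such as notes.txt
            numbered.append((int(stem), path))
    numbered.sort()  # sorts by the integer, so 2 comes before 10
    return [path for _, path in numbered]
```

With this, numbered_txt_files("data")[:14400] yields the 14400 lowest-numbered files, which equals 1.txt through 14400.txt only if no numbers are missing.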

Answer 3

I found a solution here: How to extract numbers from a string in Python?

import os
import re

filepath = './'

for filename in os.listdir(filepath):
    # r'\d+' matches whole numbers, so '14400.txt' yields ['14400'] rather than single digits
    numbers_in_name = re.findall(r'\d+', filename)
    if numbers_in_name and int(numbers_in_name[0]) < 5:
        print(os.path.join(filepath, filename))
        # do other stuff with the filenames

You can use re to extract the numbers in each filename. This example prints every filename whose first number is smaller than 5.
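
The same idea can be adapted to the range in the question. This is only a sketch: it assumes the names consist of digits followed by .txt and nothing else, and the names NUMBERED_TXT and files_in_range are hypothetical:

```python
import re

# Matches names made up of digits followed by ".txt", e.g. "14400.txt"
NUMBERED_TXT = re.compile(r"(\d+)\.txt")


def files_in_range(filenames, first, last):
    """Keep the names whose number n satisfies first <= n <= last."""
    selected = []
    for name in filenames:
        match = NUMBERED_TXT.fullmatch(name)  # whole name must match
        if match and first <= int(match.group(1)) <= last:
            selected.append(name)
    return selected
```

For example, files_in_range(os.listdir("data"), 1, 14400) would keep only 1.txt through 14400.txt, given an import of os.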
