I am trying to run a Python script on all the text files in a folder:
for fi in sys.argv[1:]:
And I get the following error:
-bash: /usr/bin/python: Argument list too long
This is how I call the script:
python functionName.py *.txt
The folder has around 9000 files. Is there a way to run this script without splitting my data across more folders? Splitting the files would not be practical, because I will have to run the script on even more files in the future. Thanks.
EDIT: Based on the selected correct reply and the comments of the replier (Charles Duffy), what worked for me is the following:
printf '%s\0' *.txt | xargs -0 python ./functionName.py
(I invoke python explicitly because my script does not have a valid shebang line.)
This is an OS-level problem (limit on command line length), and is conventionally solved with an OS-level (or, at least, outside-your-Python-process) solution:
find . -maxdepth 1 -type f -name '*.txt' -exec ./your-python-program '{}' +
...or...
printf '%s\0' *.txt | xargs -0 ./your-python-program
Note that this runs your-python-program once per batch of files found, where the batch size depends on the number of names that can fit in ARG_MAX; see the excellent answer by Marcus Müller if this is unsuitable.
No. That is a kernel limit on the length (in bytes) of a command line.
Typically, you can determine that limit by running
getconf ARG_MAX
which, at least for me, yields 2097152 (bytes), i.e. 2 MiB.
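If you would rather check the limit from inside Python, something like the following should work on POSIX systems. It is only a rough sketch: the helper names arg_max_bytes and estimated_argv_bytes are made up for illustration, and the estimate counts only the file names themselves (the environment and per-argument pointers also use up part of the budget).

```python
import os


def arg_max_bytes():
    """Return the ARG_MAX limit in bytes, or None if it cannot be determined."""
    try:
        limit = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms, or the name is unknown there.
        return None
    return limit if limit > 0 else None


def estimated_argv_bytes(names):
    """Rough lower bound: each argument is stored as its bytes plus one NUL."""
    return sum(len(os.fsencode(name)) + 1 for name in names)


if __name__ == "__main__":
    names = [entry for entry in os.listdir(".") if entry.endswith(".txt")]
    print("ARG_MAX:", arg_max_bytes())
    print("bytes needed for the *.txt names alone:", estimated_argv_bytes(names))
```

If the second number comes anywhere near the first, passing *.txt on the command line is likely to fail.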
I recommend using Python to work through the folder yourself, i.e. giving your Python program the ability to work with directories instead of individual files, or to read file names from a file.
The former can easily be done using os.walk(...), whereas the second option is (in my opinion) the more flexible one. Use the argparse module to give your Python program an easy-to-use command line syntax, then add an argument whose type is argparse.FileType (see the reference documentation), and Python will automatically understand special filenames like -, meaning you could, instead of
for fi in sys.argv[1:]
do
for fi in filter(None, opts.file_to_read_filenames_from.read().split(chr(0)))
which would even allow you to do something like
find -iname '*.txt' -type f -print0|my_python_program.py -file-to-read-filenames-from -
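To make that a bit more concrete, here is a minimal sketch of such a program. It assumes the option is spelled with two dashes (--file-to-read-filenames-from), and process_file is a hypothetical placeholder for whatever your script does with each file. Empty entries are dropped because -print0 terminates every name with a NUL, which leaves an empty string after the last split.

```python
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser(
        description="Process files whose names come from a NUL-delimited list."
    )
    parser.add_argument(
        "--file-to-read-filenames-from",
        type=argparse.FileType("r"),  # "-" is mapped to standard input
        required=True,
        help="file holding NUL-separated names, or - for standard input",
    )
    return parser


def split_nul(data):
    """Split NUL-delimited text into names, dropping empty entries."""
    return [name for name in data.split("\0") if name]


def process_file(path):
    # Hypothetical placeholder for the real per-file work.
    print("processing", path)


def main(argv=None):
    opts = build_parser().parse_args(argv)
    handle = opts.file_to_read_filenames_from
    try:
        names = split_nul(handle.read())
    finally:
        if handle is not sys.stdin:
            handle.close()
    for fi in names:
        process_file(fi)
    return names
```

In a real script you would call main() under the usual if __name__ == "__main__": guard and run it as find . -iname '*.txt' -type f -print0 | python my_python_program.py --file-to-read-filenames-from -. Since the names travel through a pipe rather than the argument list, ARG_MAX no longer applies.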
Don't do it this way. Pass the pattern to your Python script unexpanded (e.g. call it as python functionName.py "*.txt") and expand it using glob ( https://docs.python.org/2/library/glob.html ).
I would suggest using the glob module. With this module, you invoke your program like this:
python functionName.py "*.txt"
The shell will then not expand *.txt into file names, because the pattern is quoted. Your Python program will receive the literal *.txt in its argument list, and you can pass it to glob.glob():
for fi in glob.glob(sys.argv[1]):
...
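A slightly fuller version might look like the sketch below. The function name expand_patterns is just for illustration, and the print call stands in for your real per-file work. It accepts several quoted patterns and sorts the matches, since glob.glob() returns them in arbitrary order.

```python
import glob
import sys


def expand_patterns(patterns):
    """Expand each shell-style pattern inside Python and collect the matches."""
    names = []
    for pattern in patterns:
        # glob.glob returns an empty list when nothing matches the pattern.
        names.extend(sorted(glob.glob(pattern)))
    return names


if __name__ == "__main__":
    for fi in expand_patterns(sys.argv[1:]):
        print("processing", fi)  # placeholder for the real per-file work
```

Called as python functionName.py "*.txt", the script receives a single short argument, so the number of matching files no longer matters for the command line limit.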