Capturing output of find . -print0 into a bash array

Question

Using find . -print0 seems to be the only safe way of obtaining a list of files in bash, because filenames may contain spaces, newlines, quotation marks, etc.

However, I'm having a hard time actually making find's output useful within bash or with other command line utilities. The only way I have managed to make use of the output is by piping it to perl and setting perl's input record separator ($/) to the null character:

find . -print0 | perl -e '$/="\0"; @files=<>; print scalar(@files);'

This example prints the number of files found, avoiding the danger of newlines in filenames corrupting the count, as would occur with:

find . | wc -l

As most command line programs do not support null-delimited input, I figure the best thing would be to capture the output of find . -print0 in a bash array, as I have done in the perl snippet above, and then continue with the task, whatever it may be.

How can I do this?

This doesn't work:

find . -print0 | ( IFS=$'\0' ; array=( $( cat ) ) ; echo ${#array[@]} )

A much more general question might be: How can I do useful things with lists of files in bash?

Answer 1

Shamelessly stolen from Greg's BashFAQ:

unset a i
while IFS= read -r -d $'\0' file; do
    a[i++]="$file"        # or however you want to process each file
done < <(find /tmp -type f -print0)

Note that the redirection construct used here (cmd1 < <(cmd2)) is similar to, but not quite the same as, the more usual pipeline (cmd2 | cmd1). In the pipeline version, each command runs in a subshell, even a shell builtin or compound command such as while, so any variables it sets (e.g. the array a) are lost when it exits. cmd1 < <(cmd2) runs only cmd2 in a subshell, so the array outlives its construction. Warning: this form of redirection is a bash feature and is not available in plain sh (historically not even in bash's sh-emulation mode), so start your script with #!/bin/bash.
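The difference is easy to see with a counter. The following is a minimal sketch, assuming bash's default settings (in particular, the lastpipe option is off); the variable names are made up for the demo:

```bash
#!/bin/bash
# Pipeline: the while loop runs in a subshell, so its increments are lost.
count_pipe=0
printf 'a\nb\n' | while IFS= read -r line; do
    count_pipe=$((count_pipe + 1))
done

# Process substitution: the while loop runs in the current shell.
count_procsub=0
while IFS= read -r line; do
    count_procsub=$((count_procsub + 1))
done < <(printf 'a\nb\n')

echo "pipeline: $count_pipe, process substitution: $count_procsub"
# prints "pipeline: 0, process substitution: 2"
```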

Also, because the file processing step (in this case, just a[i++]="$file", but you might want to do something fancier directly in the loop) has its input redirected, it cannot use any commands that might read from stdin: they would consume part of the file list. To avoid this limitation, I tend to use:

unset a i
while IFS= read -r -u3 -d $'\0' file; do
    a[i++]="$file"        # or however you want to process each file
done 3< <(find /tmp -type f -print0)

...which passes the file list via file descriptor 3, rather than stdin.
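As a rough illustration of why this matters, the sketch below lets the loop body read an answer from stdin for each file while find's output arrives on descriptor 3. The temporary directory, the file names, and the here-document standing in for user input are all assumptions made for the demo:

```bash
#!/bin/bash
demo_dir=$(mktemp -d)                 # throwaway directory for the demo
touch "$demo_dir/one" "$demo_dir/two"

answers=()
while IFS= read -r -u3 -d '' file; do
    IFS= read -r answer               # reads from stdin, not from find
    answers+=("$answer")
done 3< <(find "$demo_dir" -type f -print0) <<'EOF'
yes
no
EOF

echo "collected ${#answers[@]} answers: ${answers[*]}"
rm -rf "$demo_dir"
```

Here -d '' is equivalent to -d $'\0'. If find were feeding stdin instead, the inner read would swallow part of the file list.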

Answer 2

Maybe you are looking for xargs:

find . -print0 | xargs -r0 do_something_useful

The option -L 1 could be useful too: it makes xargs run do_something_useful with only one file argument at a time.
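To see how xargs groups arguments, here is a small sketch. It uses -n 1 (at most one argument per invocation) instead of -L 1, and a throwaway sh -c command in place of do_something_useful; the directory and file names are invented for the demo:

```bash
#!/bin/bash
demo_dir=$(mktemp -d)
touch "$demo_dir/with space" "$demo_dir/with"$'\n'"newline"

# Default: xargs appends every name to a single command line.
batched=$(( $(find "$demo_dir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh) ))

# -n 1: one invocation per file name.
invocations=$(( $(find "$demo_dir" -type f -print0 | xargs -0 -n 1 sh -c 'echo run' sh | wc -l) ))

echo "arguments in the batched call: $batched"
echo "invocations with -n 1: $invocations"
rm -rf "$demo_dir"
```

Both numbers come out as 2 here, because each name reaches the command as exactly one argument, even the one containing a newline.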

Answer 3

The main problem is that the NUL delimiter (\0) is useless here, because IFS cannot be assigned a NUL value. So, as good programmers, we make sure that the input to our program is something it can handle.
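A quick way to check this claim, as a sketch with made-up variable names: bash strings cannot hold a NUL byte, so $'\0' collapses to the empty string and command substitution drops any NUL bytes it receives (recent bash versions also print a warning about this on stderr).

```bash
#!/bin/bash
nul=$'\0'
nul_length=${#nul}                 # $'\0' becomes the empty string

captured=$(printf 'a\0b\0')        # the NUL bytes are discarded
captured_length=${#captured}

echo "length of \$'\\0': $nul_length"
echo "captured: $captured ($captured_length characters)"
```

So IFS=$'\0' is the same as IFS=, and by the time word splitting could happen the delimiters are already gone.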

First we create a little program, which does this part for us:

#!/bin/bash
printf "%s" "$@" | base64

...and call it base64str (don't forget chmod +x)

Second we can now use a simple and straightforward for-loop:

for i in `find -type f -exec base64str '{}' \;`
do 
  file="`echo -n "$i" | base64 -d`"
  # do something with file
done

So the trick is that a base64 string contains no characters that cause trouble for bash. Of course, xxd or something similar can also do the job.

Answer 4

Another way to count files:

find /DIR -type f -print0 | tr -dc '\0' | wc -c
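A small check of this approach against the naive line count, using a temporary directory and file names invented for the demo:

```bash
#!/bin/bash
demo_dir=$(mktemp -d)
touch "$demo_dir/plain" "$demo_dir/with"$'\n'"newline"

# Count NUL terminators: one per file, whatever the name contains.
nul_count=$(( $(find "$demo_dir" -type f -print0 | tr -dc '\0' | wc -c) ))

# Count lines: the embedded newline inflates the result.
line_count=$(( $(find "$demo_dir" -type f | wc -l) ))

echo "NUL-based count: $nul_count, line-based count: $line_count"
rm -rf "$demo_dir"
```

With two files, one of which has a newline in its name, the NUL-based count is 2 while the line-based count is 3.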

Answer 5

Since Bash 4.4, the builtin mapfile has the -d switch (to specify a delimiter, similar to the -d switch of the read statement), and the delimiter can be the null byte. Hence, a nice answer to the question in the title

Capturing output of find . -print0 into a bash array

is:

mapfile -d '' ary < <(find . -print0)
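A sketch of how this might be used in a script that also has to run on older bash versions. The version check and the fallback to a read loop are assumptions added here, as is -t, which strips the delimiter from each element explicitly; the directory and file names are invented:

```bash
#!/bin/bash
demo_dir=$(mktemp -d)
touch "$demo_dir/a b" "$demo_dir/c"$'\n'"d"

ary=()
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 4) )); then
    # Bash 4.4 or newer: read NUL-delimited records straight into the array.
    mapfile -t -d '' ary < <(find "$demo_dir" -type f -print0)
else
    # Older bash: fall back to a read loop.
    while IFS= read -r -d '' f; do
        ary+=("$f")
    done < <(find "$demo_dir" -type f -print0)
fi

echo "found ${#ary[@]} files"
for f in "${ary[@]}"; do
    printf '%q\n' "$f"             # %q shows the name in shell-quoted form
done
rm -rf "$demo_dir"
```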

Answer 6

You can safely do the count with this:

find . -exec echo ';' | wc -l

(It prints a newline for every file or directory found, and then counts the newlines printed.)

Answer 7

I think more elegant solutions exist, but I'll toss this one in. This will also work for filenames with spaces and/or newlines:

i=0;
for f in *; do
  array[$i]="$f"
  ((i++))
done

You can then, e.g., list the files one by one (in this case in reverse order):

for ((i = $i - 1; i >= 0; i--)); do
  ls -al "${array[$i]}"
done

This page gives a nice example, and for more see Chapter 26 in the Advanced Bash-Scripting Guide.

Answer 8

Avoid xargs if you can:

man ruby | less -p 777 
IFS=$'\777' 
#array=( $(find ~ -maxdepth 1 -type f -exec printf "%s\777" '{}' \; 2>/dev/null) ) 
array=( $(find ~ -maxdepth 1 -type f -exec printf "%s\777" '{}' + 2>/dev/null) ) 
echo ${#array[@]} 
printf "%s\n" "${array[@]}" | nl 
echo "${array[0]}" 
IFS=$' \t\n'

Answer 9

I am new, but I believe that this is an answer; hope it helps someone:

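STYLE="$HOME/.fluxbox/styles/"

declare -a array1

# Read the NUL-delimited names into the array. Backticks would drop the
# NUL bytes, and -maxdepth must come before the tests and -print0.
while IFS= read -r -d '' f; do
    array1+=("$f")
done < <(find "$STYLE" -maxdepth 1 -type f -print0)

printf '%s\n' "${array1[@]}"

#tar czvf ~/FluxieStyles.tgz "${array1[@]}"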

Answer 10

This is similar to Stephan202's version, but the files (and directories) are put into an array all at once. The for loop here is just to "do useful things":

files=(*)                        # put files in current directory into an array
i=0
for file in "${files[@]}"
do
    echo "File ${i}: ${file}"    # do something useful 
    let i++
done

To get a count:

echo ${#files[@]}
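Two caveats that the glob approach carries, sketched below with a temporary directory and made-up names: when nothing matches, bash keeps the literal pattern unless nullglob is set, and * skips dotfiles unless dotglob is set (find . lists them).

```bash
#!/bin/bash
demo_dir=$(mktemp -d)

unmatched=("$demo_dir"/*)      # no match: the literal pattern is kept
shopt -s nullglob
empty=("$demo_dir"/*)          # no match: the array is empty

touch "$demo_dir/a b" "$demo_dir/.hidden"
visible=("$demo_dir"/*)        # dotfiles are skipped by default
shopt -s dotglob
all=("$demo_dir"/*)            # dotglob includes them
shopt -u nullglob dotglob

echo "unmatched=${#unmatched[@]} empty=${#empty[@]} visible=${#visible[@]} all=${#all[@]}"
rm -rf "$demo_dir"
```

This prints unmatched=1 empty=0 visible=1 all=2, so a count taken from files=(*) in an empty directory would be off by one without nullglob.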

Answer 11

Old question, but no one suggested this simple method, so I thought I would. Granted, if your filenames contain an ETX character (\x3), this doesn't solve your problem, but I suspect it serves for any real-world scenario. Trying to use null seems to run afoul of default IFS handling rules. Season to your tastes with find options and error handling.

savedFS="$IFS"
IFS=$'\x3'
filenames=(`find wherever -printf %p$'\x3'`)
IFS="$savedFS"

Answer 12

Gordon Davisson's answer is great for bash. However, a useful shortcut exists for zsh users:

First, place your string in a variable:

A="$(find /tmp -type f -print0)"

Next, split this variable and store it in an array:

B=( ${(s/^@/)A} )

There is a trick: ^@ is the NUL character. To do it, you have to type Ctrl+V followed by Ctrl+@.

You can check that each entry of $B contains the right value:

for i in "$B[@]"; echo \"$i\"

Careful readers may notice that the call to find can be avoided in most cases by using the ** syntax. For example:

B=( /tmp/** )

Answer 13

Bash has never been good at handling filenames (or any text really) because it uses spaces as a list delimiter.

I'd recommend using python with the sh library instead.

13 answers

solution1
99 ACCEPTED 2009-07-13 17:36:50

solution2
7 2009-07-12 22:08:17

solution3
5 2011-10-29 10:47:07

solution4
4

solution5
3 2017-09-14 15:37:59

solution6
2 2009-07-12 22:11:06

solution7
1 2009-07-12 21:48:37

solution8
1

solution9
1

solution10
0 2009-07-13 04:39:55

solution11
0 2016-02-13 02:24:43

solution12
0 2016-06-24 10:05:58

solution13
-1 2013-01-06 13:14:29
