Bash - Regex to determine if output of ls -al is file or directory and hidden

I am trying to determine whether each line of output from ls -al is a file or a directory, and whether or not it is hidden, and to count how many of each type there are.

EDIT: It is imperative that I do not use find.

#!/bin/bash
#declare four different regex statements that match files, hidden files, directories and hidden directories (excluding . and ..)
#based on the output of each line of running ls -al
re_file='^\-[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s[^\.](\w|\.)*$'
re_hidden_file='^\-[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s\.\w(\w|\.)*$'
re_directory='^d[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s[^\.](\w|\.)*$'
re_hidden_directory='^d[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s\.\w(\w|\.)*$'
#declare four different counters for each type
file_count=0
hidden_file_count=0
directory_count=0
hidden_directory_count=0
#read through the output of ls -al line by line, assigning x the value of each line
ls -al $1 | while read x; do
  #test if each line matches each of the regex statements, if it does then increment the relevant counter
  if [[ $x =~ $re_file ]] ; then
    file_count+=1
  elif [[ $x =~ $re_hidden_file ]] ; then
    hidden_file_count+=1
  elif [[ $x =~ $re_directory ]] ; then
    directory_count+=1
  elif [[ $x =~ $re_hidden_directory ]] ; then
    hidden_directory_count+=1
  else
    echo "!!!"
  fi
done
total=$((file_count + hidden_file_count + directory_count + hidden_directory_count))
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
echo "Total files and directories: $total"

Currently the script prints !!! for every line of ls -al, because no line matches any of the regexes, and all of the counter variables remain at 0. Here's an example of the input (though Bash removes the extra spaces used for padding before the regex checks are done).

drwx--x--x  37 username groupname  4096 Jan  8 14:37 .
drwxr-xr-x 235 root     root       4096 Nov 15 12:16 ..
drwx------   3 username groupname  4096 Oct 27 14:35 .adobe
-rw-------   1 username groupname 14458 Dec  5 20:24 .bash_history
-rw-------   1 username groupname  2680 Sep 30 16:12 .bash_profile
-rw-------   1 username groupname  1210 Oct  7 09:40 .bashrc
drwx------  12 username groupname  4096 Dec  6 15:24 .cache
drwxr-xr-x  17 username groupname  4096 Jan  8 14:37 .config
drwx------   4 username groupname  4096 Dec  5 17:51 dir1
drwx------   2 username groupname  4096 Nov 23 12:26 dir2
...

I have tested the regexes in an online regex checker and they match as I would like them to, so I assume this is a Bash-specific problem. Any help is appreciated.

You should not parse ls to get files. Use find with NUL termination, or globbing, instead.

The problem is that ls produces ambiguous output for file names that are perfectly legal. Consider:

$ touch a$'\t'b
$ touch a$'\n'b
$ ls -l a*
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:25 a?b
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:26 a?b

The unprintable characters \t and \n are each replaced with ?, so those two files cannot be told apart in the ls output.

The same will happen with trailing spaces:

$ touch "a b c   "
$ touch "a b c       "
$ ls -al a\ b*
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:44 a b c   
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:44 a b c   
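
The ambiguity also breaks simple line-based counting. Here is a minimal sketch, assuming a scratch directory made with mktemp (the variable names demo, ls_lines, entries and glob_count are only illustrative): a single file whose name contains a newline should show up as two lines when ls writes to a pipe, while a glob still sees one entry.

```shell
#!/bin/bash
# Scratch directory so nothing in the real working directory is touched
demo=$(mktemp -d)
touch "$demo/a"$'\n'"b"          # ONE file whose name contains a newline

# Count lines of ls output (the arithmetic strips any padding wc adds)
ls_lines=$(( $(ls "$demo" | wc -l) ))

# Count entries with a glob instead
shopt -s nullglob
entries=( "$demo"/* )
glob_count=${#entries[@]}

echo "lines from ls:     $ls_lines"
echo "entries from glob: $glob_count"
rm -rf "$demo"
```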

Now consider using find:

$ find . -name "a*" -maxdepth 1 -print0 | xargs -0 printf   "'%s'\n"
'./a    b'
'./a
b'
'./a b c   '
'./a b c      '
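
If find is an option, its NUL-terminated output can also be read in a loop for counting. The following is only a sketch under assumptions: the scratch directory and the names demo, files and dirs are illustrative, and -mindepth 1 is used to skip the starting directory itself. The loop reads from a process substitution rather than a pipe, so the counters are still set after the loop ends.

```shell
#!/bin/bash
# Scratch directory with a few awkward names
demo=$(mktemp -d)
touch "$demo/a"$'\n'"b" "$demo/a b c   " "$demo/.hidden file"
mkdir "$demo/regular dir"

files=0
dirs=0
# read -d '' splits on NUL, so newlines and spaces in names are preserved
while IFS= read -r -d '' path; do
    if [ -d "$path" ]; then
        dirs=$((dirs+1))
    else
        files=$((files+1))
    fi
done < <(find "$demo" -mindepth 1 -maxdepth 1 -print0)

echo "files=$files dirs=$dirs"
rm -rf "$demo"
```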

Or just globbing:

$ for fn in a*; do printf "'%s'\n" "$fn"; done
'a  b'
'a
b'
'a b c   '
'a b c      '

If you want total file and directory counts that include hidden files and directories, just add a glob pattern for them:

file_count=0
hidden_file_count=0
regular_directory_count=0
hidden_directory_count=0

echo "=====regular files and directories:"
for fn in *; do 
    printf "'%s'\n" "$fn" 
    if [ -d "$fn" ]; then
        regular_directory_count=$((regular_directory_count+1))
    else
        file_count=$((file_count+1))
    fi      
done
echo "====hidden files and direcotries:"
for fn in .*; do 
    printf "'%s'\n" "$fn"; 
    if [ -d "$fn" ]; then
        hidden_directory_count=$((hidden_directory_count+1))
    else
        hidden_file_count=$((hidden_file_count+1))
    fi          
done

printf "Regular files: %s regular directories: %s\n" $file_count $regular_directory_count
printf "Hidden files:  %s hidden directories:  %s\n" $hidden_file_count $hidden_directory_count
tf=$((hidden_file_count+file_count))
td=$((hidden_directory_count+regular_directory_count))
printf "Total files:   %s total directories:   %s\n"  $tf $td

Given:

$ ls -la
total 0
drwxr-xr-x   9 andrew  wheel   306 Jan  8 11:07 .
drwxrwxrwt  92 root    wheel  3128 Jan  8 10:58 ..
drwxr-xr-x   2 andrew  wheel    68 Jan  8 11:07 .hidden dir
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 .hidden file
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a?b
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a?b
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a b c   
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a b c       
drwxr-xr-x   2 andrew  wheel    68 Jan  8 11:07 regular dir

Run that and you get:

=====regular files and directories:
'a  b'
'a
b'
'a b c   '
'a b c       '
'regular dir'
====hidden files and direcotries:
'.'
'..'
'.hidden dir'
'.hidden file'
Regular files: 4 regular directories: 1
Hidden files:  1 hidden directories:  3
Total files:   5 total directories:   4

If you want to exclude the . and .. hidden directories, you can set GLOBIGNORE=".:.." prior to using the .* glob pattern.
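
A minimal sketch of that GLOBIGNORE setting, run in a scratch directory (demo and hidden are illustrative names, not part of the script above). Note that a non-empty GLOBIGNORE also turns on dotglob, so it is unset again right after the glob has been expanded.

```shell
#!/bin/bash
# Scratch directory with one hidden directory and one hidden file
demo=$(mktemp -d)
mkdir "$demo/.hidden dir"
touch "$demo/.hidden file"
cd "$demo" || exit 1

GLOBIGNORE=".:.."      # . and .. are dropped from glob results
hidden=( .* )          # expand the hidden-entry pattern into an array
unset GLOBIGNORE       # restore default globbing (dotglob is switched off again)

printf "'%s'\n" "${hidden[@]}"

cd - >/dev/null || exit 1
rm -rf "$demo"
```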

Took me a while but got it to work.

My approach: avoid parsing the output of ls -l. Here especially, you don't need it. Enable the shell option (using shopt) that lets * in the for loop see hidden objects, then test each object for its type.

Also: a+=1 doesn't do what you think it does. It just appends 1 to the end of the string!
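
A short sketch of the difference, using throwaway variables a to d that are not part of the script: on a plain variable += is string concatenation, while arithmetic expansion, an arithmetic command, or an integer-declared variable all increment.

```shell
#!/bin/bash
a=0
a+=1                # plain variable: string append, a becomes "01"
echo "a=$a"

b=0
b=$((b+1))          # arithmetic expansion: b becomes 1
echo "b=$b"

c=0
((c+=1))            # arithmetic command: c becomes 1
echo "c=$c"

declare -i d=0      # integer attribute makes += arithmetic
d+=1
echo "d=$d"
```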

#!/bin/bash
# Regex that matches hidden names (names starting with a dot)
re_hidden_file='^\..*'
# Declare a counter for each of the four types
file_count=0
hidden_file_count=0
directory_count=0
hidden_directory_count=0

# Make * match hidden files/directories too (. and .. are still skipped)
shopt -s dotglob
# Loop over every entry in the current directory
for x in * ; do
  # Classify the entry by type and by leading dot, then increment the relevant counter
  if [ -d "$x" ] ; then
    if [[ "$x" =~ $re_hidden_file ]] ; then
      hidden_directory_count=$((hidden_directory_count+1))
    else
      directory_count=$((directory_count+1))
    fi
  else
    if [[ "$x" =~ $re_hidden_file ]] ; then
      hidden_file_count=$((hidden_file_count+1))
    else
      file_count=$((file_count+1))
    fi
  fi
done


total=$((file_count + hidden_file_count + directory_count + hidden_directory_count))
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
echo "Total files and directories: $total"
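
The script above always looks at the current directory, whereas the script in the question took a directory as $1. One possible way to combine the two is sketched below; count_entries is a hypothetical helper, and running its body in a subshell is just one way to keep the shopt settings from leaking into the caller. It prints four numbers: files, hidden files, directories, hidden directories.

```shell
#!/bin/bash
# Hypothetical helper: count entries of the directory given as $1 (default: .)
# The ( ... ) body runs in a subshell, so dotglob/nullglob stay local to it.
count_entries() (
    shopt -s dotglob nullglob
    dir=${1:-.}
    files=0 hidden_files=0 dirs=0 hidden_dirs=0
    for entry in "$dir"/*; do
        name=${entry##*/}               # strip the leading path
        if [ -d "$entry" ]; then
            if [[ $name == .* ]]; then
                hidden_dirs=$((hidden_dirs+1))
            else
                dirs=$((dirs+1))
            fi
        elif [[ $name == .* ]]; then
            hidden_files=$((hidden_files+1))
        else
            files=$((files+1))
        fi
    done
    printf '%s %s %s %s\n' "$files" "$hidden_files" "$dirs" "$hidden_dirs"
)

# Usage on a scratch directory with known contents
demo=$(mktemp -d)
touch "$demo/a"$'\n'"b" "$demo/a b c   " "$demo/.hidden file"
mkdir "$demo/regular dir" "$demo/.hidden dir"
result=$(count_entries "$demo")
echo "$result"
rm -rf "$demo"
```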
