
How to parse the output of `ls -l` into multiple variables in bash?

There are a few answers on this topic already, but pretty much all of them say that it's bad to parse the output of ls -l, and therefore suggest other methods.

However, I'm using ncftpls -l, and so I can't use things like shell globs or find; I think I have a genuine need to actually parse the ls -l output. Don't worry if you're not familiar with ncftpls: the output returns in exactly the same format as if you were just using ls -l.

There is a list of files at a public remote ftp directory, and I don't want to burden the remote server by re-downloading each of the desired files every time my cronjob fires. I want to check, for each one of a subset of files within the ftp directory, whether the file exists locally; if not, download it.

That's easy enough, I just use

tdy=$(date -u '+%Y%m%d')_

# Today's files
for i in $(ncftpls 'ftp://theftpserver/path/to/files' | grep "${tdy}"); do
    if [ ! -f "$i" ]; then
        ncftpget "ftp://theftpserver/path/to/files/${i}"
    fi
done

But I came upon the issue that sometimes the cron job will download a file that hasn't finished uploading, and so when it fires next, it skips the partially downloaded file.

So I wanted to add a check to make sure that for each file that I already have, the local file size matches the size of the same file on the remote server.

I was thinking along the lines of parsing the output of ncftpls -l and using awk, something like

for i in $(ncftpls -l 'ftp://theftpserver/path/to/files' | awk '{print $9, $5}'); do
    ...
    x=filesize   # somehow get the file size and the filename
    y=filename   # from $i on each iteration and store in variables
    ...
done

but I can't seem to get both the filename and the filesize from the server into local variables on the same iteration of the loop; $i alternates between $9 and $5 of the awk output with each iteration.

If I could manage to get the filename and filesize into separate variables on each iteration, I could simply use stat -c "%s" $i to get the local size and compare it with the remote size. Then it's a simple ncftpget on each remote file that I don't already have. I tinkered with syncing programs like lftp too, but didn't have much luck and would rather do it this way.
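For reference, the comparison I have in mind would be something roughly like this (a minimal sketch assuming GNU stat; needs_download is just a made-up helper name, and the remote size would come from the listing):

needs_download() {
    # succeed (return 0) if the file is missing locally
    # or its size differs from the size reported in the remote listing
    local file="$1" remote_size="$2"
    [ ! -f "$file" ] && return 0
    [ "$(stat -c '%s' "$file")" != "$remote_size" ]
}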

Any help is appreciated!

A for loop splits its input on any whitespace it sees: spaces, tabs, or newlines. So IFS needs to be set before the loop (there are a lot of questions about ...)

IFS=$'\n' && for i in $(ncftpls -l 'ftp://theftpserver/path/to/files' | awk '{print $9, $5}'); do

    echo "$i" | awk '{print $NF}'    # filesize (last column)
    echo "$i" | awk '{NF--; print}'  # filename
    # you may have spaces in filenames, so it is better to take the size from the last column

done

A better way, I think, is to use while instead of for, so

ls -l | while read -r i
do
    echo "$i" | awk '{print $9, $5}'

    # split them if you want
    x=$(echo "$i" | awk '{print $5}')   # filesize
    y=$(echo "$i" | awk '{print $9}')   # filename

done
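Putting this back into the ncftpls workflow from the question, a rough sketch might look like the following (assuming GNU stat, no spaces in the date-prefixed filenames, and the same ftp:// URL as in the question):

tdy=$(date -u '+%Y%m%d')_

ncftpls -l 'ftp://theftpserver/path/to/files' | grep "${tdy}" | while read -r line; do
    size=$(echo "$line" | awk '{print $5}')   # remote size from the ls -l style listing
    name=$(echo "$line" | awk '{print $9}')   # filename (assumes no embedded spaces)
    # fetch the file if it is missing locally or its size differs from the remote size
    if [ ! -f "$name" ] || [ "$(stat -c '%s' "$name")" != "$size" ]; then
        ncftpget "ftp://theftpserver/path/to/files/${name}"
    fi
done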
