
How do I do a for loop with 2 arrays in shell script?

First I need to declare two arrays, which I also need help with.

Originally, these were two single variables:

day=$(hadoop fs -ls -R /user/hive/* | 
        awk '/filename.txt.gz/' |
        tail -1 | 
        date -d $(echo `awk '{print $6}'`) '+%b %-d' | 
        tr -d ' ')

time_stamp=$(hadoop fs -ls -R /user/hive/* | 
             awk '/filename.txt.gz/' |
             tail -1 | 
             awk '{ print $7 }')

Now, instead of tail -1, I need tail -5. So first, how do I turn these two variables into arrays?

Second question: how do I write a for loop over the paired values of $day and $time_stamp? I can't use array_combine because I need to perform actions on each array separately. Thanks.

You are collecting the data into strings, not arrays. But additionally, your code should probably be refactored significantly -- as a general rule of thumb, if something happens in Awk, most of the rest should also happen in Awk.

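You assign to an array with variable=(values of array). To fill it from the output of a subprocess, use variable=($(command to produce values)). The unquoted command substitution is split on whitespace, so each word of the output becomes one element.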
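As a rough illustration of the two assignment forms, here is a small Bash sketch. The array names and sample values are made up for the example, and the mapfile variant (Bash 4 or later) is an addition that is useful when an element has to contain a space.

```shell
#!/usr/bin/env bash
# Literal assignment: three elements.
fruits=(apple banana cherry)

# From a command: the unquoted $(...) is split on whitespace,
# so each word printed by printf becomes one element.
words=($(printf '%s\n' alpha beta gamma))

# mapfile -t reads one element per line and keeps spaces inside a line.
mapfile -t lines < <(printf '%s\n' 'Jan 5' 'Feb 12')

echo "${#fruits[@]} ${#words[@]} ${#lines[@]}"
echo "${lines[0]}"
```

Note that "${array[@]}" expands to all elements and "${#array[@]}" to the element count.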

Here's a first attempt at refactoring your code.

# Avoid repeated code -- break this out into a function
extract_field () {
    hadoop fs -ls -R /user/hive/* |
    # Get rid of the tail and the repeated Awk
    # Notice backslashes in regex
    # Pass in the field to extract as a parameter
    # Print only the last five matches (or all of them if there are fewer)
    awk -v field="$1" '/filename\.txt\.gz/ { d[++i]=$field }
        END { for (j = (i > 5 ? i-4 : 1); j <= i; ++j) print d[j] }'
}

day=($(extract_field 6 |
    # Refactor accordingly
    # And if you don't want a space in the format string, don't put a space in the format string in the first place
    # xargs -I {} runs date once per input line, substituting the line for {}
    xargs -I {} date -d {} '+%b%-d'))

time_stamp=($(extract_field 7))
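The Awk part can be tried without a Hadoop cluster. The sketch below feeds it a made-up listing; sample_listing and last_five are hypothetical helper names, and the line layout (date in field 6, time in field 7) is assumed from the question rather than taken from real hadoop output.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `hadoop fs -ls -R`: seven matching lines
# plus one line that the regex should ignore.
sample_listing () {
    local n
    for n in 1 2 3 4 5 6 7; do
        printf -- '-rw-r--r-- 3 hive hive 100 2024-01-0%d 10:0%d /user/hive/t%d/filename.txt.gz\n' "$n" "$n" "$n"
    done
    printf -- '-rw-r--r-- 3 hive hive 100 2024-01-08 10:08 /user/hive/t8/other.txt\n'
}

# Collect the requested field from every matching line,
# then print only the last five collected values.
last_five () {
    awk -v field="$1" '/filename\.txt\.gz/ { d[++i] = $field }
        END { for (j = (i > 5 ? i - 4 : 1); j <= i; ++j) print d[j] }'
}

dates=($(sample_listing | last_five 6))
times=($(sample_listing | last_five 7))
printf '%s\n' "${dates[@]}"
```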

I'm highly skeptical of the arrangement to call the Hadoop command twice, though. Perhaps just extract fields 6 and 7 in a single go and then post-process the results to get them into two separate arrays. Something like this instead then?

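# mapfile -t (Bash 4+) stores one "date time" line per element;
# combined=($(...)) would split each line at the space
mapfile -t combined < <(hadoop fs -ls -R /user/hive/* |
    awk '/filename\.txt\.gz/ { d[++i]=$6 " " $7 }
        END { for (j = (i > 5 ? i-4 : 1); j <= i; ++j) print d[j] }')
day=() time_stamp=()
for ((i=0; i<${#combined[@]}; ++i)); do
    day[i]=$(date -d "${combined[i]% *}" +'%b%-d')
    time_stamp[i]=${combined[i]#* }
done
unset combined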

The statement that you need to handle the dates and times independently of each other sounds suspicious; if you can find a way to avoid doing that, it may be better not to split combined into two separate arrays at all. The code above shows how to extract the date and the time from a value in combined (the mechanism is called parameter expansion: ${var% *} removes the shortest suffix matching " *", and ${var#* } removes the shortest prefix matching "* "). It also demonstrates how to loop over the indices of an array.
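For the second question, a minimal sketch of looping over two parallel arrays by index might look like this. The values in day and time_stamp are invented sample data, and pairs, entry, date_part and time_part are names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sample data standing in for the values extracted from Hadoop.
day=(Jan5 Jan6 Jan7)
time_stamp=(09:15 10:30 23:59)

# "${!day[@]}" expands to the indices of day (0 1 2 here),
# so the same index can be used to pick the matching time stamp.
pairs=()
for i in "${!day[@]}"; do
    printf '%s at %s\n' "${day[i]}" "${time_stamp[i]}"
    pairs+=("${day[i]} ${time_stamp[i]}")
done

# Parameter expansion on a single "date time" string.
entry='2024-01-05 09:15'
date_part=${entry% *}   # strip the shortest suffix matching " *"
time_part=${entry#* }   # strip the shortest prefix matching "* "
```

This assumes both arrays have the same length and the same indices; if that is not guaranteed, compare "${#day[@]}" and "${#time_stamp[@]}" before looping.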
