Reading and storing the data in reverse order in bash

I have copied two columns of data to a file with the command below. Because the clustering key my_date is set to return rows in descending order, the dates for each id come out newest first.

     echo "copy home.admin (id,my_date) to 'myOutputFile';" > copyInputs.cql

myOutputFile:

     TEST1,2015-01-01 15:00:00+0000
     TEST1,2014-09-04 14:00:00+0000
     4.VOD,2015-08-18 04:00:00+0000
     4.VOD,2015-06-26 04:00:00+0000
     4.VOD,2015-05-13 04:00:00+0000
     000TEST8,2015-11-19 05:00:00+0000

The first column is id and the second is my_date. I want to read the data in reverse order for each id, so the output should look like this:

     TEST1,2014-09-04 14:00:00+0000
     TEST1,2015-01-01 15:00:00+0000
     4.VOD,2015-05-13 04:00:00+0000
     4.VOD,2015-06-26 04:00:00+0000
     4.VOD,2015-08-18 04:00:00+0000
     000TEST8,2015-11-19 05:00:00+0000

After getting this output, I am preparing update statements to populate a new column, my_rev. my_rev starts from 100 for each id and increments until a new id is found.

    update home.admin my_rev =100 where id = 'TEST1' and my_date = '2014-09-04 14:00:00+0000';
    update home.admin my_rev =101 where id = 'TEST1' and my_date = '2015-01-01 15:00:00+0000';
    update home.admin my_rev =100 where id = '4.VOD' and my_date = '2015-05-13 04:00:00+0000';
    update home.admin my_rev =101 where id = '4.VOD' and my_date = '2015-06-26 04:00:00+0000';
    update home.admin my_rev =102 where id = '4.VOD' and my_date = '2015-08-18 04:00:00+0000';

Any suggestions?

"I wanted to read the data in reverse order for each id"

This prints the lines for each id in reverse order:

$ awk -F, '$1==prev {s=$0 "\n" s; next} { printf "%s",s; s=$0 "\n"; prev=$1} END{printf "%s",s}' infile
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000

How it works:

This script uses two variables. prev contains the ID of the previous line. s contains the lines for the most recent ID, in reverse order.

  • -F,

    This tells awk to use a comma as the field separator.

  • $1==prev {s=$0 "\n" s; next}

    For lines that have the same ID as the previous line (field 1, denoted $1), this adds the new line to the beginning of variable s. The rest of the commands are skipped and awk jumps to the next line.

  • printf "%s",s; s=$0 "\n"; prev=$1

    If we get here, we are starting a new ID. In this case, we print the lines saved in s for the previous ID. We then replace s with the current line and set prev to the current ID.

  • END{printf "%s",s}

    After we reach the end of the file, we print s for the last ID.
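The steps above can also be laid out as a commented, multi-line program. This is only a sketch of the same one-liner: the function name reverse_per_id and the temporary input file are assumptions made for the demo, not part of the original answer.

```shell
# Sample input from the question, written to a temporary file for the demo.
infile=$(mktemp)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# reverse_per_id is a hypothetical helper name wrapping the one-liner.
reverse_per_id() {
    awk -F, '
        # Same id as the previous line: prepend this line to the buffer s.
        $1 == prev { s = $0 "\n" s; next }
        # New id: print the buffered group, then start a new buffer.
        { printf "%s", s; s = $0 "\n"; prev = $1 }
        # End of input: print the last buffered group.
        END { printf "%s", s }
    ' "$1"
}

reversed=$(reverse_per_id "$infile")
printf '%s\n' "$reversed"
rm -f "$infile"
```

Because each group is buffered in s until the id changes, memory use grows with the largest group rather than with the whole file.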

Alternative

If you want to do a more complex re-ordering, this invokes sort, with all of its flexibility, once for each id, while keeping the ids themselves in their original order:

$ awk -F, -v s=sort '$1==prev {print | s; next} {close(s); print | s; prev=$1}' infile
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
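Since s is just a command string, sort options can be passed through it. The sketch below assumes you want each group ordered explicitly by the second field (my_date); the variable name sort_cmd and the temporary input file are illustrative, not from the original answer.

```shell
# Sample input from the question, written to a temporary file for the demo.
infile=$(mktemp)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# sort_cmd is an illustrative choice: order each group by field 2 only.
sort_cmd='sort -t, -k2,2'

sorted=$(awk -F, -v s="$sort_cmd" '
    # Same id: feed the line to the sort process that is already running.
    $1 == prev { print | s; next }
    # New id: close the previous sort so it emits its output, then start another.
    { close(s); print | s; prev = $1 }
' "$infile")
printf '%s\n' "$sorted"
rm -f "$infile"
```

Closing the pipe before starting the next group is what keeps the groups in their original order: each sort finishes and prints before the following one receives any input.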

Re-formatting

If outfile contains the output of either re-ordering command above, then run:

$ awk -F, '{if ($1==prev)n++; else n=100; prev=$1; printf "update home.admin my_rev =%i where id = '\''%s'\'' and my_date = '\''%s'\'';\n",n,$1,$2}' outfile
update home.admin my_rev =100 where id = 'TEST1' and my_date = '2014-09-04 14:00:00+0000';
update home.admin my_rev =101 where id = 'TEST1' and my_date = '2015-01-01 15:00:00+0000';
update home.admin my_rev =100 where id = '4.VOD' and my_date = '2015-05-13 04:00:00+0000';
update home.admin my_rev =101 where id = '4.VOD' and my_date = '2015-06-26 04:00:00+0000';
update home.admin my_rev =102 where id = '4.VOD' and my_date = '2015-08-18 04:00:00+0000';
update home.admin my_rev =100 where id = '000TEST8' and my_date = '2015-11-19 05:00:00+0000';
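The two steps can also be chained so that no intermediate outfile is needed. The following is a sketch under a few assumptions: the output file name updates.cql is hypothetical, and the single quote is passed in as an awk variable q to avoid the '\'' escapes. The statement text is reproduced exactly as requested in the question; it is worth checking against your CQL version before running it, since CQL's UPDATE syntax normally includes a SET keyword.

```shell
# Sample input from the question, written to a temporary file for the demo.
infile=$(mktemp)
outdir=$(mktemp -d)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# Step 1 reverses each id group; step 2 numbers the rows and formats statements.
# updates.cql is a hypothetical output file name.
awk -F, '$1==prev {s=$0 "\n" s; next} {printf "%s",s; s=$0 "\n"; prev=$1} END{printf "%s",s}' "$infile" |
awk -F, -v q="'" '
    # Restart the counter at 100 whenever the id changes.
    { if ($1 == prev) n++; else n = 100; prev = $1 }
    # q holds a single quote, so the format string needs no shell escaping.
    { printf "update home.admin my_rev =%d where id = %s%s%s and my_date = %s%s%s;\n", n, q, $1, q, q, $2, q }
' > "$outdir/updates.cql"

updates=$(cat "$outdir/updates.cql")
printf '%s\n' "$updates"
rm -rf "$infile" "$outdir"
```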

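sort can do the trick when the ids already appear in descending order in the input, as they do in the sample:

sort -t, -k1,1r -k2,2 infile

This sorts on id in reverse and then on my_date in ascending order. A plain sort -r over both fields (sort -r -t, -k1,2) reverses the dates as well and so reproduces the input order instead of the requested one.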
