Reading and storing the data in reverse order in bash
I have copied two columns of data to a file. Since the clustering key my_date is set to descending order, the rows come out newest first.
echo "copy home.admin (id,my_date) to 'myOutputFile';" > copyInputs.cql
myOutputFile:
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
The first column is id and the second is my_date. I want to read the data in reverse order for each id, so the output should look like this:
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
After getting this output, I am preparing update statements to populate a new column, my_rev. my_rev will start from 100 for each id and increment until I find a new id.
update home.admin my_rev =100 where id = 'TEST1' and my_date = '2014-09-04 14:00:00+0000';
update home.admin my_rev =101 where id = 'TEST1' and my_date = '2015-01-01 15:00:00+0000';
update home.admin my_rev =100 where id = '4.VOD' and my_date = '2015-05-13 04:00:00+0000';
update home.admin my_rev =101 where id = '4.VOD' and my_date = '2015-06-26 04:00:00+0000';
update home.admin my_rev =102 where id = '4.VOD' and my_date = '2015-08-18 04:00:00+0000';
Any suggestions?
I wanted to read the data in reverse order for each id
This prints the lines for each id in reverse order:
$ awk -F, '$1==prev {s=$0 "\n" s; next} { printf "%s",s; s=$0 "\n"; prev=$1} END{printf "%s",s}' infile
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
How it works:
This script uses two variables. prev contains the ID for the previous line. s contains the lines for the most recent ID in reverse order.
-F,
This tells awk to use a comma as the field separator.
$1==prev {s=$0 "\n" s; next}
For lines that have the same ID (field 1, denoted $1), this adds the new line to the beginning of variable s. The rest of the commands are skipped and awk jumps to the next line.
printf "%s",s; s=$0 "\n"; prev=$1
If we get here, we are starting a new ID. In this case, we print the lines saved in s from the previous ID. We update s with the current line and we set prev to the current ID.
END{printf "%s",s}
After we reach the end of the file, we print s for the last ID.
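The walkthrough above can be laid out as a multi-line awk program with one rule per line. This is only a sketch of the same logic: the scratch directory and the variable names workdir and reversed are assumptions added for illustration, and the sample rows are the ones from the question.

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# Sample rows copied from the question (newest date first within each id).
cat > "$workdir/infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

reversed=$(awk -F, '
    # Same id as the previous line: put this line in front of the buffer s.
    $1 == prev { s = $0 "\n" s; next }
    # New id: print the finished group, then restart the buffer.
    { printf "%s", s; s = $0 "\n"; prev = $1 }
    # End of input: print the last group.
    END { printf "%s", s }
' "$workdir/infile")

printf '%s\n' "$reversed"
rm -r "$workdir"
```

Because s only ever holds the lines of one id, memory use stays proportional to the largest group rather than to the whole file.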
If you want to do a more complex re-ordering, this invokes sort, with all of its flexibility, for each id, keeping the ids themselves in their original order:
$ awk -F, -v s=sort '$1==prev {print | s; next} {close(s); print | s; prev=$1}' infile
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
If outfile contains the output of the sorting command above, then run:
$ awk -F, '{if ($1==prev)n++; else n=100; prev=$1; printf "update home.admin my_rev =%i where id = '\''%s'\'' and my_date = '\''%s'\'';\n",n,$1,$2}' outfile
update home.admin my_rev =100 where id = 'TEST1' and my_date = '2014-09-04 14:00:00+0000';
update home.admin my_rev =101 where id = 'TEST1' and my_date = '2015-01-01 15:00:00+0000';
update home.admin my_rev =100 where id = '4.VOD' and my_date = '2015-05-13 04:00:00+0000';
update home.admin my_rev =101 where id = '4.VOD' and my_date = '2015-06-26 04:00:00+0000';
update home.admin my_rev =102 where id = '4.VOD' and my_date = '2015-08-18 04:00:00+0000';
update home.admin my_rev =100 where id = '000TEST8' and my_date = '2015-11-19 05:00:00+0000';
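If the intermediate outfile is not needed, the reversal and the numbering could also be done in one awk pass. The following is a rough sketch, not part of the original answer: the function name emit_group, the variables start, q, workdir and updates, and the scratch directory are all assumptions for illustration. The statement text is copied unchanged from the output above.

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# Sample rows copied from the question (newest date first within each id).
cat > "$workdir/infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# start is the first my_rev value; q holds a single quote for the output.
updates=$(awk -F, -v start=100 -v q="'" '
    # Print one statement per buffered date, walking from the last
    # stored date (the oldest) back to the first (the newest).
    function emit_group(   i) {
        for (i = n; i >= 1; i--)
            printf "update home.admin my_rev =%d where id = %s%s%s and my_date = %s%s%s;\n", start + n - i, q, id, q, q, dates[i], q
        n = 0
    }
    # A new id begins: emit the previous group before switching.
    $1 != id { emit_group(); id = $1 }
    # Buffer the date of every line under the current id.
    { dates[++n] = $2 }
    # End of input: emit the last group.
    END { emit_group() }
' "$workdir/infile")

printf '%s\n' "$updates"
rm -r "$workdir"
```

Passing the quote character in through -v avoids the '\'' escaping that the one-liner needs inside a single-quoted awk program.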
sort should do the trick:
sort -r -t, -k1,2 infile
In general, the only option you need is -r.
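When the two columns need different directions, sort also accepts an ordering flag on each key. The sketch below is an illustration under assumptions, not part of the original answer: it relies on the C locale for byte-wise comparison, and on the ids in this sample happening to appear in descending byte order, so a reverse sort on column 1 reproduces their original sequence. The names workdir and sorted are hypothetical.

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# Sample rows copied from the question (newest date first within each id).
cat > "$workdir/infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# -t,     use a comma as the field separator
# -k1,1r  first key: column 1 only, in reverse order
# -k2,2   second key: column 2 only, in ascending order
sorted=$(LC_ALL=C sort -t, -k1,1r -k2,2 "$workdir/infile")

printf '%s\n' "$sorted"
rm -r "$workdir"
```

Unlike the awk approaches, this reorders the ids themselves, so it only preserves the group order when that order is already a sorted one.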