Linux command to get fields from CSV files
In CSV files on a Linux server, I have thousands of rows in the CSV format below:
0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|
I need to get output from all the files in the format below: the 2nd field (i.e. 20221208195546466) and, from the 5th field, the value after Above as: and before the first | (i.e. 2 in the example above).
Output:
20221208195546466 , 2
Can anyone help me with a Linux command?
Edit:
My attempt:
I tried the following, but it only gives the value from the 5th field. How do I add field 2 as well?
cat *.csv | cut -d, -f5 | cut -d'|' -f1 | cut -d':' -f2
EDIT: sorted result
Now I am using this command (based on Dave Pritlove's answer):
awk -F'[,|:]' '{print $2", "$6}' file.csv
However, I have one more question: if I have to sort the output by $6 (the value 2 in your example), how can I do it? I want the result to be displayed in sorted order based on the 2nd output field, for example:
20221208195546366, 20
20221208195546436, 16
20221208195546466, 5
2022120819536466, 2
GNU awk allows multiple field separators to be set, so you can split each record at "," , "|" and ":" at the same time. Thus, the following will fish out the required fields from file.csv:
awk -F'[,|:]' '{print $2", "$6}' file.csv
Tested on the single record example:
echo "0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|" | awk -F'[,|:]' '{print $2", "$6}'
Output:
20221208195546466, 2
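For the follow-up about sorting, one possible sketch is to pipe the awk output through sort. It assumes the order wanted is numeric and descending on the 2nd output field, as in the example in the question. The sample file and the values after "Above as:" are made up for illustration.

```shell
# Hypothetical sample file; the values after "Above as:" are invented for this sketch.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0,20221208195546466,9,200,Above as:5|RAN34f2fb:HAER:0|
0,20221208195546366,9,200,Above as:20|RAN34f2fb:HAER:0|
0,2022120819536466,9,200,Above as:2|RAN34f2fb:HAER:0|
0,20221208195546436,9,200,Above as:16|RAN34f2fb:HAER:0|
EOF

# Extract fields 2 and 6, then sort on the 2nd comma-separated column,
# numerically (n) and reversed (r), so the largest value comes first.
sorted=$(awk -F'[,|:]' '{print $2", "$6}' "$sample" | sort -t, -k2,2nr)
printf '%s\n' "$sorted"
rm -f "$sample"
```

With this sample the lines come out in the order 20, 16, 5, 2. Drop the r from -k2,2nr if ascending order is wanted instead.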
Assumptions:
the value of interest in the 5th field is the text between the : and the first |
Sample data:
$ cat test.csv
0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|
1,20230124123456789,10,1730,Total ts:7|stuff:HAER:0|morestuff:FON:0|yetmorestuff:ION:0|
One awk approach:
awk '
BEGIN { FS=OFS="," } # define input/output field delimiter as ","
{ split($5,a,"[:|]") # split 5th field on dual delimiters ":" and "|", store results in array a[]
print $2,a[2] # print desired items to stdout
}
' test.csv
This generates:
20221208195546466,2
20230124123456789,7
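If the spacing from the question (" , " between the two values) matters and the data is spread over several files, the same split() idea can probably be adapted as below. This is only a sketch: the temporary directory and the file names a.csv and b.csv are hypothetical, and the rows are shortened copies of the sample data.

```shell
# Hypothetical directory with two CSV files, used only for illustration.
dir=$(mktemp -d)
printf '%s\n' '0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|' > "$dir/a.csv"
printf '%s\n' '1,20230124123456789,10,1730,Total ts:7|stuff:HAER:0|' > "$dir/b.csv"

# FS splits the input on ","; OFS=" , " reproduces the spacing asked for in the question.
result=$(awk '
BEGIN { FS=","; OFS=" , " }
{ split($5, a, "[:|]")      # a[1] = label, a[2] = value before the first "|"
  print $2, a[2]
}
' "$dir"/*.csv)
printf '%s\n' "$result"
rm -rf "$dir"
```

awk reads every file matched by the glob in turn, so no cat is needed in front of it.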
You can use awk for this:
awk -F',' '{gsub(/Above as:/,""); gsub(/\|.*/, ""); print($2, $5)}'
You will probably need to adapt the regexp a bit.
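As one way the regexp might be adapted, the sketch below assumes the label in front of the colon is not always "Above as" (the "Total ts" row is taken from another answer's sample data) and edits only the 5th field instead of the whole line.

```shell
# Sample row with a different label in the 5th field.
line='1,20230124123456789,10,1730,Total ts:7|stuff:HAER:0|morestuff:FON:0|'

adapted=$(printf '%s\n' "$line" | awk -F',' '{
  sub(/^[^:]*:/, "", $5)   # drop the label up to and including the first ":"
  sub(/[|].*/, "", $5)     # drop everything from the first "|" onwards
  print $2, $5
}')
printf '%s\n' "$adapted"
```

For this row it prints 20230124123456789 7, separated by awk's default output separator (a single space).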
You might change : to , and | to , and then extract the 2nd and 6th fields using cut in the following way. Let the content of file.txt be
0,20221208195546466,9,200,Above as:2|RAN34f2fb:HAER:0|RAND8365b2bca763:FON:0|RANDa7a5f964900b:ION:0|
then
tr ':|' ',,' < file.txt | cut --delimiter=',' --output-delimiter=' , ' --fields=2,6
gives output
20221208195546466 , 2
Explanation: tr translates, i.e. it replaces : with , and | with , as well. Then I tell cut that the input delimiter is "," , that the output delimiter is a comma encased in spaces (as stipulated by your desired output), and that I want the 2nd and 6th columns (not the 5th, as that is now Above as).
(tested using GNU coreutils 8.30)