Manipulate nth column of a .csv file with awk or sed
I have a .csv file with 6 columns:
source raised_time cleared_time cause pcause sproblem
source1 rtime1 ctime1 cause1 communicationsSubsystemFailure#model.route.1.2 oMCIFailure#model.route.1.2
source2 rtime2 ctime2 cause2 equipmentMalfunction#model.route.1.2 deviceNotActive#model.route.1.2
I want to manipulate the 5th and 6th columns of the .csv file with the rules below:
So the wanted format is:
source raised_time cleared_time cause pcause sproblem
source1 rtime1 ctime1 cause1 Communication Subsystem Failure OMCI Failure
source2 rtime2 ctime2 cause2 Equipment Malfunction Device Not Active
How can I do that with an awk or sed command?
I tried to start by converting the first letter to upper case with the command:
awk 'BEGIN {$5 = toupper(substr($5,1,1)) substr($5, 2)}1' input_file
but it did not work.
You said your input is CSV (Comma-Separated Values), but there are no commas in it, and it does have apparently random spacing between fields, so I assume you actually meant TSV (Tab-Separated Values). If so, then this should do what you want:
$ cat tst.awk
BEGIN { FS=OFS="\t" }
NR > 1 {
    for (i=5; i<=NF; i++) {
        new = ""
        old = $i
        sub(/#.*/,"",old)
        while ( match(old,/[[:upper:]][[:lower:]]+/) ) {
            new = new substr(old,1,RSTART-1) " " substr(old,RSTART,RLENGTH)
            old = substr(old,RSTART+RLENGTH)
        }
        new = new old
        $i = toupper(substr(new,1,1)) substr(new,2)
    }
}
{ print }
$ awk -f tst.awk file
source raised_time cleared_time cause pcause sproblem
source1 rtime1 ctime1 cause1 Communications Subsystem Failure OMCI Failure
source2 rtime2 ctime2 cause2 Equipment Malfunction Device Not Active
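As for why the attempt in the question changed nothing: a BEGIN block runs before any input is read, so $5 is still empty when the assignment runs, and the records read afterwards are printed untouched. A minimal sketch of just that first step (capitalizing field 5), assuming tab-separated input with a header line to skip; the function name cap_first and the sample data are made up for illustration:

```shell
# cap_first is a hypothetical helper name, not from the original post.
# It upper-cases the first character of field 5 on every line after the header.
cap_first() {
    awk 'BEGIN { FS = OFS = "\t" }    # only set the separators here
         NR > 1 { $5 = toupper(substr($5, 1, 1)) substr($5, 2) }
         { print }' "$@"
}

# Example run on two made-up lines (a header plus one record):
printf 'a\tb\tc\td\te\tf\ns1\tr1\tc1\tx1\tfooBar\tbazQux\n' | cap_first
```

This only covers the capitalization; stripping the #... suffix and splitting the camelCase words is what the loop in tst.awk adds on top.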
A GNU sed implementation, assuming input file format is tsv (tab separated values):
sed -E '1! {
s/\t/\n/4
h
s/[^\n]*//
s/#[^\t]*//g
s/\B[[:upper:]][[:lower:]]/ &/g
s/\b[[:lower:]]/\U&/g
H
g
s/\n.*\n/\t/
}' file.tsv
If fields are separated by , then just replace the \t with , in the script.
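Applied to the script above, that substitution might look like the following sketch. It assumes GNU sed (for \U, \B, \b and \n in the script) and a simple CSV with no quoted fields that contain commas; the function name clean_csv is only illustrative:

```shell
# clean_csv is a hypothetical wrapper name; requires GNU sed.
# Same script as the TSV version, with every \t swapped for a comma.
clean_csv() {
    sed -E '1!{
        s/,/\n/4
        h
        s/[^\n]*//
        s/#[^,]*//g
        s/\B[[:upper:]][[:lower:]]/ &/g
        s/\b[[:lower:]]/\U&/g
        H
        g
        s/\n.*\n/,/
    }' "$@"
}

# Example run on the header and first record from the question, comma-separated:
printf '%s\n' \
    'source,raised_time,cleared_time,cause,pcause,sproblem' \
    'source1,rtime1,ctime1,cause1,communicationsSubsystemFailure#model.route.1.2,oMCIFailure#model.route.1.2' |
    clean_csv
```

The 4th comma is turned into a newline so that the first four fields can be parked in the hold space while only fields 5 and 6 are edited, then the two halves are glued back together with a comma.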
If fields are separated by runs of blanks (a non-blank to blank transition), then put s/^\s+//; s/\s+$//; s/\s+/\t/g at the beginning of the sed expression.
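Put together, the blank-separated variant could be sketched as below. This again assumes GNU sed (\s is a GNU extension), and it assumes the three normalizing commands go before the 1!{...} block so that the header line is converted to tabs as well; the function name clean_blanks is hypothetical:

```shell
# clean_blanks is a hypothetical wrapper name; requires GNU sed.
# First normalize runs of whitespace to single tabs, then run the TSV script.
clean_blanks() {
    sed -E 's/^\s+//; s/\s+$//; s/\s+/\t/g
    1!{
        s/\t/\n/4
        h
        s/[^\n]*//
        s/#[^\t]*//g
        s/\B[[:upper:]][[:lower:]]/ &/g
        s/\b[[:lower:]]/\U&/g
        H
        g
        s/\n.*\n/\t/
    }' "$@"
}

# Example run with irregular spacing between fields:
printf '%s\n' \
    'source   raised_time cleared_time  cause pcause sproblem' \
    'source2  rtime2 ctime2   cause2 equipmentMalfunction#model.route.1.2  deviceNotActive#model.route.1.2' |
    clean_blanks
```

Because the rewritten fields contain spaces of their own, the output is tab-separated rather than blank-separated, which keeps the six columns unambiguous.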
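Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.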