How can I change a value in a column if that line contains a specific word using sed/awk or bash while keeping the whitespacing?
I have a PDB file that looks like this:
ATOM 1 P A 2 1 224.160 179.728 151.662 1.00 40.00 P
ATOM 2 OP1 A 2 1 225.507 179.132 151.738 1.00 40.00 O
ATOM 3 CA A 2 1 223.640 180.497 152.816 1.00 40.00 O
ATOM 4 O5' A 2 1 224.374 180.738 150.465 1.00 40.00 O
I want to change the 11th column to 1.0000 if a line contains atom CA and save these changes in the same file.
How can I do that using sed, awk or bash so that I keep the same spacing between the columns? Thank you
Awk will do the job.
awk '$1 == "ATOM" && $3 == "CA" { $11 = "1.0000" } { print }' <infile > outfile
Google awk for more information, as this is a basic tool worth learning.
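One caveat worth checking before relying on the command above: assigning to a field makes awk rebuild the line with its output field separator (a single space by default), so runs of spaces between columns are not kept. A minimal sketch, using a made-up sample line whose spacing is only an assumption:

```shell
# Hypothetical sample line; the real file's column widths may differ.
line='ATOM      3  CA    A   2   1     223.640 180.497 152.816  1.00 40.00           O'

# Assigning to $11 makes awk rebuild the record, joining fields with OFS (one space).
out=$(printf '%s\n' "$line" | awk '$1 == "ATOM" && $3 == "CA" { $11 = "1.0000" } { print }')

printf 'before: %s\n' "$line"
printf 'after:  %s\n' "$out"
```

The value is changed, but every run of spaces collapses to one, which is why the other answers work on the raw text of the line instead.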
Assuming fixed-width columns, as per the comment below, the awk script can be modified to specify FIELDWIDTHS (a GNU awk feature). The values need to be checked, as the question is not clear about the exact widths.
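awk -v 'FIELDWIDTHS=4 8 6 4 1 6 9 9 9 6 5 12' -v OFS='' '
# Fixed-width fields keep their padding, so match CA with optional surrounding spaces.
$1 == "ATOM" && $3 ~ /^ *CA *$/ { $11 = "1.0000" }
{ print }
' infile > outfile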
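Since FIELDWIDTHS is specific to GNU awk, the same fixed-column idea can be sketched portably with substr. The sample line, the start position and the width of 16 below are assumptions for illustration, not values taken from the question:

```shell
# Hypothetical sample: ten columns, then an 11th column padded to 16 characters.
prefix='ATOM      3  CA    A   2   1     223.640 180.497 152.816  1.00 '
line="${prefix}$(printf '%-16s' 40.00)O"

start=$(( ${#prefix} + 1 ))   # assumed 1-based start of the 11th column
width=16                      # assumed width of the 11th column

out=$(printf '%s\n' "$line" | awk -v start="$start" -v width="$width" '
$1 == "ATOM" && $3 == "CA" {
    # Overwrite the column in place, left-justified and padded to the same width.
    $0 = substr($0, 1, start - 1) sprintf("%-" width "s", "1.0000") substr($0, start + width)
}
{ print }')

printf '%s\n' "$out"
```

Because the line is edited by character position and no field is assigned, the spacing of all other columns is left untouched.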
sed -E '/ CA /s/[^ ]+/1.0000/11' file
(GNU sed, assuming spaces and not tabs)
This uses 11 after the replacement to replace the 11th word. The replacement only happens on lines matching / CA /. -E is required for the + to work as intended.
You may want to tailor the whitespace or replacement string to your exact requirements. Since it only affects the 11th column, you can put whatever you want there.
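To see what the occurrence flag does to the spacing, here is a small check on a made-up line (the spacing is an assumption). It replaces only the 11th run of non-space characters and leaves the surrounding spaces alone:

```shell
# Hypothetical sample line; the real file's column widths may differ.
line='ATOM      3  CA    A   2   1     223.640 180.497 152.816  1.00 40.00           O'

# The trailing 11 tells s/// to act on the 11th match only.
out=$(printf '%s\n' "$line" | sed -E '/ CA /s/[^ ]+/1.0000/11')

printf '%s\n' "$line" "$out"
printf 'length before: %d, after: %d\n' "${#line}" "${#out}"
```

Since 1.0000 is one character longer than 40.00, the line grows by one character and the last column shifts right. If that matters, the trailing spaces have to be adjusted as well.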
The following sed command(s) will work:
sed '/ CA /s/\([^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+\)....../\11.0000/'
or:
sed -E '/ CA /s/([^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +)....../\11.0000/'
or (with bash):
X="[^ ]+ +"; sed -E "/ CA /s/($X$X$X$X$X$X$X$X$X$X)....../\11.0000/"
or:
X="[^ ]\+ \+"; sed "/ CA /s/\($X$X$X$X$X$X$X$X$X$X\)....../\11.0000/"
to give:
ATOM 1 P A 2 1 224.160 179.728 151.662 1.00 40.00 P
ATOM 2 OP1 A 2 1 225.507 179.132 151.738 1.00 40.00 O
ATOM 3 CA A 2 1 223.640 180.497 152.816 1.00 1.0000 O
ATOM 4 O5' A 2 1 224.374 180.738 150.465 1.00 40.00 O
Explanation:
/ CA / : if a line contains the token "CA", then
s/($X$X$X$X$X$X$X$X$X$X)....../ : replace the first ten columns and the first six characters of the 11th column by
\11.0000/ : what was already in the ten columns, and by "1.0000" in the 11th.
Refinements:
/\<CA\>/ : matches CA only as a whole word (GNU sed word-boundary syntax).
[[:space:]] : use this in place of the literal spaces above to also match tabs.
...... : [...] and two spaces after the "1.0000".
Otherwise, you can first reduce the 11th column to a single non-blank character by running:
X="[^ ]\+ \+"; sed "/ CA /{:a;s/\($X$X$X$X$X$X$X$X$X$X\)\([^ ]\+\)[^ ] /\1\2  /;ta}"
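The capture-group variant from this answer can be checked on a made-up line (the spacing is an assumption). It keeps the first ten columns verbatim and overwrites exactly six characters, so the line length does not change as long as the old value plus its trailing spaces covers at least six characters:

```shell
# Hypothetical sample line; the real file's column widths may differ.
line='ATOM      3  CA    A   2   1     223.640 180.497 152.816  1.00 40.00           O'

X='[^ ]+ +'   # one column followed by its trailing spaces
out=$(printf '%s\n' "$line" | sed -E "/ CA /s/($X$X$X$X$X$X$X$X$X$X)....../\11.0000/")

printf '%s\n' "$line" "$out"
printf 'length before: %d, after: %d\n' "${#line}" "${#out}"
```

Here the six dots consume 40.00 and one of the spaces after it, so the last column stays where it was.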
If you know that the 11th column is always 16 characters wide, the following sed command:
sed '/ CA /s/[^ ]\+ \+/1.0000          /11'
will give:
ATOM 1 P A 2 1 224.160 179.728 151.662 1.00 40.00 P
ATOM 2 OP1 A 2 1 225.507 179.132 151.738 1.00 40.00 O
ATOM 3 CA A 2 1 223.640 180.497 152.816 1.00 1.0000 O
ATOM 4 O5' A 2 1 224.374 180.738 150.465 1.00 40.00 O
Explanation: On lines with the token CA, this replaces the 11th column with 1.0000 followed by 10 spaces.
With some versions of sed, you may need to replace \+ with \{1,\}, as in:
sed '/ CA /s/[^ ]\{1,\} \{1,\}/1.0000          /11'
Alternatively, if you know that the 11th column always begins at the 62nd character and is 16 characters wide, the following will also work:
sed -i '/ CA /s/\(.\{61\}\).\{16\}/\11.0000          /' filename
Explanation:
/ CA / : on lines containing the token CA,
\(.\{61\}\) : capture the first 61 characters, and keep them with \1
.\{16\} : replace the next 16 characters with 1.0000 followed by 10 spaces.
The -i switch modifies the file in place.
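The -i switch is not part of POSIX sed, and BSD sed expects a backup suffix argument after it. Where that is a concern, a temporary file gives the same result. A minimal sketch, in which the file contents and the 16-character column width are assumptions for illustration:

```shell
# Hypothetical stand-in for the real PDB file.
file=$(mktemp)
prefix='ATOM      3  CA    A   2   1     223.640 180.497 152.816  1.00 '
other='ATOM      1  P     A   2   1     224.160 179.728 151.662  1.00 40.00           P'
printf '%s\n' "$other" "${prefix}$(printf '%-16s' 40.00)O" > "$file"

repl=$(printf '%-16s' 1.0000)   # "1.0000" padded with spaces to 16 characters
tmp=$(mktemp)

# Write to a temporary file first, then replace the original only if sed succeeded.
sed "/ CA /s/[^ ]\{1,\} \{1,\}/$repl/11" "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

Note that mv gives the file the permissions of the temporary file. If the original permissions matter, `cat "$tmp" > "$file"` followed by `rm "$tmp"` keeps them.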