
How can I change a value in a column if that line contains a specific word using sed/awk or bash while keeping the whitespacing?

I have a PDB file that looks like:

ATOM      1  P     A 2   1     224.160 179.728 151.662  1.00 40.00           P  
ATOM      2  OP1   A 2   1     225.507 179.132 151.738  1.00 40.00           O  
ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 40.00           O  
ATOM      4  O5'   A 2   1     224.374 180.738 150.465  1.00 40.00           O 

I want to change the 11th column to 1.0000 if a line contains the atom CA, and save these changes in the same file.

How can I do that using sed, awk or bash so that I keep the same spacing between the columns? Thank you.

Awk will do the job.

awk '$1 == "ATOM" && $3 == "CA" { $11 = "1.0000" } { print }' <infile > outfile

Search for awk tutorials for more information, as this is a basic tool worth learning.

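Note that assigning to a field makes awk rebuild the line with single spaces between fields, so the command above does not preserve the original spacing. Assuming fixed-width columns, as per the comment below, the awk script can be modified to specify FIELDWIDTHS (a GNU awk feature). The widths need to be checked against the real file, as the question is not clear about the exact widths.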

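awk -v FIELDWIDTHS='4 7 6 3 2 4 12 8 8 7 16 3' -v OFS='' '
# Fields keep their padding, so match CA with a regex and pad the new value.
$1 == "ATOM" && $3 ~ /^ *CA *$/ { $11 = sprintf("%-16s", "1.0000") }
{ print }
' infile > outfile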
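
FIELDWIDTHS is specific to GNU awk. Where only a POSIX awk is available, the same fixed-column idea can be sketched with substr(), which slices the line by character position and leaves all other whitespace alone. The helper name fix_ca and the values start=62 and width=16 are assumptions read off the sample data, not something the question states, so check them against your own file.

```shell
# Sample line from the question; here the 11th column spans characters 62-77.
line='ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 40.00           O  '

# fix_ca is a hypothetical helper; start and width are assumed values.
fix_ca() {
    awk -v start=62 -v width=16 '
    $1 == "ATOM" && $3 == "CA" {
        # Rebuild the line from three slices: everything before the column,
        # the new value left-justified to the column width, and the rest.
        $0 = substr($0, 1, start - 1) sprintf("%-" width "s", "1.0000") substr($0, start + width)
    }
    { print }
    '
}

result=$(printf '%s\n' "$line" | fix_ca)
printf '%s\n' "$result"
```

Because the line is never split and rejoined, the result has the same length as the input whatever the spacing between the other columns.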

sed -E '/ CA /s/[^ ]+/1.000/11' file

(GNU sed, assuming spaces and not tabs)

The 11 after the replacement makes sed replace the 11th match, that is, the 11th word. The replacement only happens on lines matching / CA / .

-E is required for the + to work as intended.

You may want to tailor the whitespace or replacement string to your exact requirements. Because only the 11th word is touched, the rest of the line keeps its spacing. 1.000 is the same length as the 40.00 it replaces, so the last column stays aligned; a longer value such as 1.0000 would push it one place to the right unless a trailing space is removed as well.
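
One way to check that a replacement kept the layout is to compare line lengths before and after. The following is a rough sketch using two sample lines from the question and the same sed command; the variable names are illustrative only.

```shell
# Two sample lines from the question, saved to a temporary file.
before=$(mktemp)
after=$(mktemp)
cat > "$before" <<'EOF'
ATOM      2  OP1   A 2   1     225.507 179.132 151.738  1.00 40.00           O
ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 40.00           O
EOF

# Replace the 11th word on CA lines with a string of the same length.
sed -E '/ CA /s/[^ ]+/1.000/11' "$before" > "$after"

# First pass stores each original line length, second pass reports any
# line whose length differs. Empty output means the layout is intact.
changed=$(awk 'NR == FNR { len[FNR] = length($0); next }
               length($0) != len[FNR] { print FNR }' "$before" "$after")
printf 'lines with changed length: %s\n' "${changed:-none}"

# Keep the edited CA line for inspection, then clean up.
new_line=$(sed -n '2p' "$after")
rm -f "$before" "$after"
```

The same check works for any of the other commands on this page; only the sed line needs to change.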

The following sed command(s) will work:

sed '/ CA /s/\([^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+[^ ]\+ \+\)....../\11.0000/'

or:

sed -E '/ CA /s/([^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +)....../\11.0000/'

or (with bash):

X="[^ ]+ +"; sed -E "/ CA /s/($X$X$X$X$X$X$X$X$X$X)....../\11.0000/"

or:

X="[^ ]\+ \+"; sed "/ CA /s/\($X$X$X$X$X$X$X$X$X$X\)....../\11.0000/"

to give:

ATOM      1  P     A 2   1     224.160 179.728 151.662  1.00 40.00           P  
ATOM      2  OP1   A 2   1     225.507 179.132 151.738  1.00 40.00           O  
ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 1.0000          O  
ATOM      4  O5'   A 2   1     224.374 180.738 150.465  1.00 40.00           O

Explanation:

  • / CA / : if a line contains the token "CA", then
  • s/($X$X$X$X$X$X$X$X$X$X)....../ : replace the first ten columns and the first six characters of the 11th column with
  • \11.0000/ : what was already in the ten columns, followed by "1.0000" in the 11th.

Refinements:

  • This assumes the "CA" is not at the start of the first column; this can be fixed using /\<CA\>/ .
  • If there are tabs, replace the spaces in the above with [[:space:]] and [^ ] with [^[:space:]] .
  • The above fails if the existing 11th column has more than six non-blank characters. If you know in advance that it has, say, at most eight characters, add two extra dots to ...... and two spaces after the "1.0000".
  • Otherwise, you can first reduce the 11th column to a single non-blank character, padding with a space each time so that its width is unchanged, by running:

     X="[^ ]\+ \+"; sed "/ CA /{:a;s/\($X$X$X$X$X$X$X$X$X$X\)\([^ ]\+\)[^ ] /\1\2  /;ta}"
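
The first two refinements can be combined into one command. The following is a sketch, assuming a sed that supports -E and POSIX character classes, and lines that do not begin with whitespace; the variable names S and W are illustrative, and the sample line is taken from the question.

```shell
line='ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 40.00           O  '

# S matches one whitespace character (space or tab), W one non-blank character.
S='[[:space:]]'
W='[^[:space:]]'

# Address: CA as a whole token, also at the start or end of the line.
# Substitution: keep ten columns with their padding, then overwrite the
# next six characters with 1.0000.
result=$(printf '%s\n' "$line" |
    sed -E "/(^|$S)CA($S|\$)/s/^(($W+$S+){10})....../\\11.0000/")
printf '%s\n' "$result"
```

Inside double quotes the shell turns \\1 into \1 and \$ into $, so sed receives an ordinary back-reference and end-of-line anchor.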

If you know that the 11th column is always 16 characters wide, the following sed command:

sed '/ CA /s/[^ ]\+ \+/1.0000          /11'

will give:

ATOM      1  P     A 2   1     224.160 179.728 151.662  1.00 40.00           P  
ATOM      2  OP1   A 2   1     225.507 179.132 151.738  1.00 40.00           O  
ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 1.0000          O  
ATOM      4  O5'   A 2   1     224.374 180.738 150.465  1.00 40.00           O

Explanation: On lines with the token CA, this replaces the 11th column with 1.0000 followed by 10 spaces.

With some versions of sed, you may need to replace \+ with \{1,\} , as in:

sed '/ CA /s/[^ ]\{1,\} \{1,\}/1.0000          /11'

Alternatively, if you know that the 11th column always begins at the 62nd character and is 16 characters wide, the following will also work:

sed -i '/ CA /s/\(.\{61\}\).\{16\}/\11.0000          /' filename

Explanation:

  • On lines with the token "CA" ( / CA / ),
  • capture the first 61 characters with \(.\{61\}\) and keep them with \1 ,
  • and replace the next 16 characters, .\{16\} , with 1.0000 followed by 10 spaces.
  • The -i switch modifies the file in place.
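
The -i switch is not part of POSIX sed, and GNU and BSD sed accept it with slightly different syntax. Where that matters, the in-place edit can be approximated with a temporary file. This is a minimal sketch: the file variable stands in for the real PDB file, and a one-line sample is created here only so the example runs on its own.

```shell
# "$file" stands in for the PDB file; a one-line sample is created here.
file=$(mktemp)
printf '%s\n' 'ATOM      3  CA    A 2   1     223.640 180.497 152.816  1.00 40.00           O  ' > "$file"

# Write the edited text to a temporary file, then copy it back over the
# original only if sed succeeded. Copying with cat keeps the original
# file's permissions.
tmp=$(mktemp) &&
    sed '/ CA /s/\(.\{61\}\).\{16\}/\11.0000          /' "$file" > "$tmp" &&
    cat "$tmp" > "$file" &&
    rm -f "$tmp"
```

If sed fails, the && chain stops before the original file is overwritten.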
