Divide two columns in a file and print the result in a new column of the same file, for multiple files

Question

I have a number of files in VCF format. A line looks like this:

1   127573  rs7 G   A   79.78   .   AC=1;AF=0.500;AN=2;BaseQRankSum=1.231;ClippingRankSum=-0.358;DB;DP=5;FS=3.979;MLEAC=1;MLEAF=0.500;MQ=60.00;MQ0=0;MQRankSum=0.358;QD=15.96;ReadPosRankSum=1.231  GT:AD:DP:GQ:PL  0/1:2,3:5:27:108,0,27

From the last column I need to divide the second number of the second colon-separated field by the third field, and print the result in a new column. In the example above these are 3 and 5 (from the 10th column, 0/1:2,3:5:27:108,0,27), so the output should look like this, with 0.6 (i.e. 3/5) as the last column:

 1  127573  rs7 G   A   79.78   .   AC=1;AF=0.500;AN=2;BaseQRankSum=1.231;ClippingRankSum=-0.358;DB;DP=5;FS=3.979;MLEAC=1;MLEAF=0.500;MQ=60.00;MQ0=0;MQRankSum=0.358;QD=15.96;ReadPosRankSum=1.231  GT:AD:DP:GQ:PL  0/1:2,3:5:27:108,0,27 0.6

To achieve this I used awk on Unix, as follows:

cat result_1 |cut -f10 | sed 's/:/\t/g' >sample
cat sample | cut -f2 | sed 's/,/\t/g' | awk '$2!=0 || $3!=0{print $1"\t"$2"\t"$2/$3}' >result_1

But it complains:

awk: (FILENAME=- FNR=1) fatal: division by zero attempted

Any alternative solutions in Python or Perl would be great!

Answer 1

awk '{split($NF, a, /[,:]/); $(++NF) = a[3]/a[4]; print}' file

OK, handling division by zero:

awk '{split($NF, a, /[,:]/); $(++NF) = (a[4]==0 ? "Inf" : a[3]/a[4]); print}' file
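
Since the question also asks for Python, here is a minimal sketch of the same idea. It assumes, like the awk one-liner, that the last column follows the GT:AD:DP:GQ:PL layout from the example, so the second AD value and DP sit at fixed positions. The names append_ratio and process are hypothetical, and the "NA" placeholder for unparsable lines is an assumption, not something the awk version does. Unlike awk, which rejoins fields with spaces after NF changes, this keeps tabs as the separator.

```python
def append_ratio(line, sep="\t"):
    """Append (second AD value) / DP as a new last column of one data line."""
    fields = line.rstrip("\n").split(sep)
    sample = fields[-1].split(":")  # e.g. ['0/1', '2,3', '5', '27', '108,0,27']
    try:
        alt_reads = float(sample[1].split(",")[1])
        depth = float(sample[2])
    except (IndexError, ValueError):
        ratio = "NA"  # assumption: placeholder when the column cannot be parsed
    else:
        # "Inf" mirrors the guarded awk version; "g" mirrors awk's default output format
        ratio = "Inf" if depth == 0 else format(alt_reads / depth, "g")
    return sep.join(fields + [ratio])


def process(lines):
    """Pass VCF header lines (starting with '#') through, annotate the rest."""
    for line in lines:
        if line.startswith("#"):
            yield line.rstrip("\n")
        else:
            yield append_ratio(line)
```

Feeding an open file object to process() and printing each yielded line would reproduce the awk output for the example line, with 0.6 as the final column.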

Answer 2

Here's one Perl way of doing it:

perl -ne 'chomp;if(/\t[^, ]+,(\d+):0*([1-9]\d*)[\S ]*$/){$n=$1;$d=$2;print("$_\t",$n/$d,"\n")}else{print("$_\t\n")}' < result_1 > result_1.new

This will do it. It ensures a non-zero positive value for the denominator in the match ([1-9]\d*), and allows leading zeros via the 0* in front of it.

The chomp removes the trailing newline ("\n"), so it's added back in the print.

The trailing [\S ]*$ cannot cross a tab, so the captured values must come from the last column, between the last tab and the end of the string; spaces are allowed there.

The -n wraps the code in while (<>) { ... }.

It adds a tab even when there would have been a division by zero, but in that case it leaves the last column empty.
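
For comparison, the same regex logic can be sketched in Python, which the question also asked about. The pattern is copied unchanged from the Perl one-liner; the names LAST_COLUMN and append_ratio_regex are hypothetical, and the ".15g" format is chosen here only to approximate how Perl prints numbers.

```python
import re

# Same pattern as the Perl one-liner: <tab>...,<numerator>:<denominator>...<end>
# The denominator must contain a non-zero digit; leading zeros are skipped by 0*.
LAST_COLUMN = re.compile(r"\t[^, ]+,(\d+):0*([1-9]\d*)[\S ]*$")


def append_ratio_regex(line):
    """Append numerator/denominator, or an empty column when there is no match."""
    line = line.rstrip("\n")  # plays the role of Perl's chomp
    match = LAST_COLUMN.search(line)
    if match is None:
        # Covers the zero-denominator case: add the tab, leave the column empty
        return line + "\t"
    numerator = int(match.group(1))
    denominator = int(match.group(2))
    return line + "\t" + format(numerator / denominator, ".15g")
```

A line whose third sub-field is 0 never matches the pattern, so no division is attempted and the new column stays empty, just as in the Perl version.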

You can mv the file afterward if you want to overwrite the original, but I prefer to keep precursors as a backup.
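
Because the question involves many files, one possible way to combine the overwrite with a backup is Python's standard fileinput module, sketched below. The function name rewrite_in_place, the transform parameter, and the ".bak" suffix are illustrative assumptions; transform stands for any function that maps one line (without its newline) to the rewritten line.

```python
import fileinput


def rewrite_in_place(paths, transform, backup_suffix=".bak"):
    """Rewrite each file in place, keeping the original as <name><backup_suffix>."""
    if not paths:
        return  # with no files, fileinput would fall back to reading stdin
    with fileinput.input(files=paths, inplace=True, backup=backup_suffix) as stream:
        for line in stream:
            # With inplace=True, print() writes into the file being rewritten
            print(transform(line.rstrip("\n")))
```

Each original stays next to the rewritten file under the backup suffix, which matches the preference above for keeping precursors.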

There is probably a more succinct or readable way of doing it in Perl or another language, but this suffices.

Answer 1
3 2015-09-29 15:13:51

Answer 2
1 2015-09-29 18:22:11
