Remove columns from a file with bash tools

Question

I have a large file with about 200,000 columns and about 5,000 rows. Here is a short example of the file, in which columns 1 and 5 are duplicates.

Abf Bgj Csd Daa Abf Efg ...  
0   1   2   1   0   1.1   
2   0.1 1.2 0.3 2   1    
...

Here is an example of the result I need. Column 5 of the original file has been deleted.

Abf Bgj Csd Daa Efg ...  
0   1   2   1   1.1    
2   0.1 1.2 0.3 1      
...

Some of the columns are duplicated several times. I need to remove the duplicates from the data (keeping the first instance) using bash tools. I can't sort the data because I need to keep the order.

Answer 1

$ cat tst.awk
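NR==1 {
    # Header row: remember the index of the first column carrying each name.
    for (i=1;i<=NF;i++) {
        if (!seen[$i]++) {
            f[++nf]=i
        }
    }
}
{
    # Every row (header included): print only the remembered columns.
    for (i=1;i<=nf;i++) {
        printf "%s%s", $(f[i]), (i<nf?OFS:ORS)
    }
}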

$ awk -f tst.awk file | column -t
Abf  Bgj  Csd  Daa  Efg
0    1    2    1    1.1
2    0.1  1.2  0.3  1
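
The script decides which columns to keep from the header row alone: a column is dropped when its name has already appeared, and its data is never compared. It also reads the file one row at a time, so memory use depends on the number of columns rather than the number of rows. The following is a minimal, self-contained sketch for trying this on the sample from the question. The function name dedup_columns, the temporary file, and the array name keep are made up for illustration and are not part of the answer.

```shell
# Write the sample from the question to a temporary file.
sample=$(mktemp)
printf '%s\n' \
    'Abf Bgj Csd Daa Abf Efg' \
    '0 1 2 1 0 1.1' \
    '2 0.1 1.2 0.3 2 1' > "$sample"

# Hypothetical wrapper around the same logic as tst.awk.
dedup_columns() {
    awk '
        NR == 1 {
            # Keep the index of the first column seen for each header name.
            for (i = 1; i <= NF; i++)
                if (!seen[$i]++)
                    keep[++nkeep] = i
        }
        {
            # Print only the kept columns, in their original order.
            for (i = 1; i <= nkeep; i++)
                printf "%s%s", $(keep[i]), (i < nkeep ? OFS : ORS)
        }
    ' "$1"
}

dedup_columns "$sample"
```

If two columns share a name but hold different data, this approach still drops the later one, so it only fits files where an identical header implies an identical column.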

Answer 2

You can use the datamash program:

datamash -W transpose < input.txt | datamash rmdup 1 | datamash transpose

GNU datamash is a command-line program that performs basic numeric, textual, and statistical operations on input text files.

Explanation:

datamash -W transpose < input.txt
- transpose - swap rows and columns, so each original column becomes a row.
- -W - use whitespace (one or more spaces and/or tabs) as the field delimiter.
datamash rmdup 1 - remove lines whose first-field value duplicates that of an earlier line, keeping the first occurrence.
datamash transpose - swap rows and columns back.

Input

Abf Bgj Csd Daa Abf Efg
0   1   2   1   0   1.1   
2   0.1 1.2 0.3 2   1

Output

Abf Bgj Csd Daa Efg
0   1   2   1   1.1
2   0.1 1.2 0.3 1

Answer 1
5 2017-08-18 11:37:38

Answer 2
0 2017-08-18 12:51:24