bash method to remove last 4 columns from csv file

Question

Is there a way to use bash to remove the last four columns for some input CSV file? The last four columns can have fields that vary in length from line to line so it is not sufficient to just delete a certain number of characters from the end of each row.

Answer 1

cut can do this if all lines have the same number of fields; use awk if they don't.

cut -d, -f1-6 # assuming 10 fields

This prints the first 6 fields. To control the output separator, use --output-delimiter=string (a GNU cut option).

awk -F, '{ for (i=1; i<=NF-4; i++) printf "%s%s", $i, (i<NF-4 ? "," : ""); printf "\n" }'

This loops over the fields up to the number of fields minus 4 and prints them, separated by commas.
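When every row really does have the same width, the field range passed to cut does not have to be counted by hand. The following is a minimal sketch that derives it from the first line; the function name drop_last4_cut and the sample file are made up for illustration, and it assumes that no field contains a quoted comma.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): drop the last 4 columns with cut.
# Assumes every line has as many fields as the first one and that no field
# contains a quoted comma.
drop_last4_cut() {
    local file=$1 nf
    # Count the comma-separated fields on the first line only.
    nf=$(head -n 1 "$file" | awk -F, '{ print NF }')
    nf=${nf:-0}                      # an empty file yields no count
    if [ "$nf" -le 4 ]; then
        echo "need more than 4 columns, found $nf" >&2
        return 1
    fi
    cut -d, -f"1-$((nf - 4))" "$file"
}

# Example with a throwaway file:
sample=$(mktemp)
printf 'a,b,c,d,e,f\n1,2,3,4,5,6\n' > "$sample"
drop_last4_cut "$sample"    # keeps the first 2 of the 6 columns
```

If the rows can differ in width, this sketch silently keeps the wrong columns on the odd rows, which is why the awk loop is the safer choice there.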

Answer 2

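rev data.csv | cut -d, -f5- | rev

rev reverses each line, so it does not matter whether all lines have the same number of columns: cutting from the fifth field onward of the reversed line always removes the last 4 columns. This only works if the last 4 columns themselves do not contain any commas.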
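To see why the reversal trick copes with rows of different widths, here is a small sketch. Removing the last four columns means keeping reversed fields 5 onward. The function name drop_last4_rev and the sample rows are illustrative only, and rev is assumed to be installed (it ships with util-linux rather than being a POSIX utility).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: reverse each line, keep fields 5 onward, reverse back.
drop_last4_rev() {
    rev | cut -d, -f5- | rev
}

# Two rows with 6 and 7 columns; each loses exactly its last 4.
printf '%s\n' 'a,b,c,d,e,f' '1,2,3,4,5,6,7' | drop_last4_rev
```

The first row comes back as a,b and the second as 1,2,3. Because rev reverses characters rather than fields, the second reversal is needed to restore the text of each kept field.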

Answer 3

You can use cut for this if you know the number of columns. For example, if your file has 9 columns, and comma is your delimiter:

cut -d',' -f -5

However, this assumes the data in your csv file does not contain any commas. cut will interpret commas inside of quotes as delimiters also.
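The quoting problem is easy to reproduce. The sketch below uses a made-up row whose second field contains a comma inside quotes, and shows that both awk -F, and cut count that comma as a separator.

```shell
#!/usr/bin/env bash
line='1,"Smith, John",3,4,5,6'   # six CSV fields; the second contains a comma

# Splitting on every comma yields seven pieces, not six.
printf '%s\n' "$line" | awk -F, '{ print NF }'

# So asking for "the first two fields" cuts the quoted name in half.
printf '%s\n' "$line" | cut -d, -f-2
```

The second command prints 1,"Smith, which shows that any column count based on plain comma splitting is off by one for this row.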

Answer 4

awk -F, '{NF-=4; OFS=","; print}' file.csv

or alternatively

awk -F, -vOFS=, '{NF-=4;print}' file.csv

will drop the last 4 columns from each line.
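One case the one-liners above do not cover is a row with four or fewer fields: subtracting 4 would make NF zero or negative, and gawk, for example, treats a negative NF as a fatal error. A guarded variant might look like the sketch below; the function name drop_last4_awk is hypothetical, and printing an empty line for short rows is a design choice, not something the answer specifies.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the NF-=4 idiom with a guard for short rows.
drop_last4_awk() {
    awk -F, -v OFS=, '
        NF > 4 { NF -= 4; print; next }   # shrink the record, then print it
               { print "" }               # 4 or fewer fields: nothing is left
    ' "$@"
}

printf '%s\n' 'a,b,c,d,e,f' '1,2,3' 'p,q,r,s,t' | drop_last4_awk
```

This prints a,b, then an empty line for the three-field row, then p. Passing "$@" through lets the function read either named files or standard input.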

Answer 5

awk one-liner:

awk -F, '{for(i=0;++i<=NF-5;)printf "%s,", $i;print $(NF-4)}'  file.csv

The advantage of using awk over cut is that you don't have to count how many columns you have or how many you want to keep, since all you want is to remove the last 4 columns.

See the test:

kent$  seq 40|xargs -n10|sed 's/ /, /g'
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
11, 12, 13, 14, 15, 16, 17, 18, 19, 20
21, 22, 23, 24, 25, 26, 27, 28, 29, 30
31, 32, 33, 34, 35, 36, 37, 38, 39, 40

kent$  seq 40|xargs -n10|sed 's/ /, /g' |awk -F, '{for(i=0;++i<=NF-5;)printf "%s,", $i;print $(NF-4)}'
1, 2, 3, 4, 5, 6
11, 12, 13, 14, 15, 16
21, 22, 23, 24, 25, 26
31, 32, 33, 34, 35, 36

Answer 6

This might work for you (GNU sed):

sed -r 's/(,[^,]*){4}$//' file
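The expression matches four groups of "a comma followed by any non-comma characters" anchored at the end of the line, and deletes them. A small sketch of how it behaves on rows of different widths follows; it uses -E, which both GNU and BSD sed accept for extended regular expressions, and the function name drop_last4_sed is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sed expression from the answer.
drop_last4_sed() {
    # -E selects extended regular expressions in both GNU and BSD sed.
    sed -E 's/(,[^,]*){4}$//' "$@"
}

printf '%s\n' 'a,b,c,d,e,f' 'a,b,c,d,e' 'w,x,y,z' | drop_last4_sed
```

The six-column row becomes a,b and the five-column row becomes a. The four-column row is left unchanged, because it contains only three commas and the pattern needs four to match.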

Answer 7

This awk solution does it in a hacky way:

awk -F, -v OFS=, '{for(i=NF; i>NF-4; --i) {$i=""}; sub(/,,,,$/,""); print}' temp.txt

Answer 8

None of the methods mentioned above work properly on CSV files that have quoted fields containing a <comma> character, so simply using the <comma> character as a field separator is not enough.

The following two posts are now very handy:

What's the most robust way to efficiently parse CSV using awk?
[U&L] How to delete the last column of a file in Linux (Note: this is only for GNU awk)

If you work with GNU awk, you can do the following:

$ awk -v FPAT='[^,]*|"[^"]+"' -v OFS="," 'NF{NF-=4}1'

Or with any awk, you could do:

$ awk 'BEGIN{ere="([^,]*|\042[^\042]+\042)"
             ere=","ere","ere","ere","ere"$"
       }
       {sub(ere,"")}1'
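To check how the portable variant treats quoted commas, it can be wrapped in a function and fed a couple of sample rows. This is only a sketch: the function name drop_last4_quoted, the variable names, and the sample rows are made up. It shares the limits of the original expression: escaped quotes inside quoted fields are not handled, and rows with four or fewer fields may be cut in the wrong place.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the "any awk" approach from the answer.
drop_last4_quoted() {
    awk 'BEGIN {
             # One field: either a run of non-commas or a double-quoted string
             # (\042 is the octal escape for the double quote).
             field = "([^,]*|\042[^\042]+\042)"
             # Four such fields, each preceded by a comma, anchored at line end.
             tail = "," field "," field "," field "," field "$"
         }
         { sub(tail, ""); print }' "$@"
}

printf '%s\n' '1,"Smith, John",3,4,5,6' | drop_last4_quoted
printf '%s\n' '1,2,3,"a, b",5,6'        | drop_last4_quoted
```

The first row keeps its quoted field intact and comes back as 1,"Smith, John"; the second, where the quoted comma sits inside the removed columns, comes back as 1,2.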

solution1
16 ACCEPTED 2013-01-19 20:46:59

solution2
12 2013-01-19 21:50:59

solution3
6 2013-01-19 20:34:29

solution4
4 2015-06-10 20:58:20

solution5
1 2013-01-19 21:17:44

solution6
1 2013-01-19 21:46:54

solution7
1 2013-01-20 05:14:37

solution8
1 2019-07-22 13:54:02
