Remove special character from csv file using bash

I have a CSV file in which the first column contains a special character ([) and the second-to-last column contains another (]). An example is given below:

col1      col2      col3      ..... coln-1   coln  
[number   number    number    ..... number]  number

I want to remove [ from the first column and ] from the second-to-last column using a bash script.

With sed 's/]//g' file I can remove ]. However, I get an error for [ with the same statement.

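Your approach with sed is sound. You just need to know that [ and ] are special characters in regular expressions: [ opens a bracket expression, so a lone [ has to be escaped as \[. To match either of the two characters, put both inside a bracket expression […]. POSIX sed does not treat a backslash as an escape inside a bracket expression, so instead of escaping, place ] first, where it is taken literally:

sed 's/[][]//g' test.csv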
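As a quick check of how sed treats the two brackets, here is a minimal sketch on a made-up sample line (the `line` value and the variable names are hypothetical, not from the question). Outside a bracket expression a literal [ is written \[, while inside one a leading ] is taken literally.

```shell
# Hypothetical one-line sample mirroring the question's layout.
line='[1 2 3 4] 5'

# Outside a bracket expression, a literal [ is written \[ .
only_open=$(printf '%s\n' "$line" | sed 's/\[//g')

# Inside a bracket expression, ] is literal when it comes first.
both=$(printf '%s\n' "$line" | sed 's/[][]//g')

printf '%s\n' "$only_open" "$both"
```

The first command strips only the opening bracket, the second strips both.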

This, however, can be done more simply with tr, which deletes the given characters:

tr -d '[]' < test.csv > test2.csv
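Note that tr works on characters, not columns, so it deletes every bracket wherever it appears. A small sketch with a hypothetical sample line (the extra [x] in the middle is an assumption added to show this):

```shell
# Hypothetical sample; the inner [x] shows that tr is not column-aware.
line='[1 2 [x] 4] 5'

# tr -d deletes every occurrence of the listed characters.
stripped=$(printf '%s\n' "$line" | tr -d '[]')

printf '%s\n' "$stripped"
```

If brackets can legitimately occur in other columns, a column-aware tool is the safer choice.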

Try this (with GNU sed, which understands \? and -i):

sed -i -e 's/^\[\(.*\)\] \(-\?[0-9.]*\)$/\1 \2/g' "$file"
           ^ ^ ^ ^^    ^  ^   ^        ^  ^  ^
           | | | ||    |  |   |        |  |  +-- the second match (the number)
           | | | ||    |  |   |        |  +----- the first match (the n-1 first fields)
           | | | ||    |  |   |        +-------- end of line
           | | | ||    |  |   +----------------- a number
           | | | ||    |  +--------------------- save in memory (\2)
           | | | ||    +------------------------ your closing bracket
           | | | |+----------------------------- the n-1 first fields
           | | | +------------------------------ save in memory (\1)
           | | +-------------------------------- your opening bracket
           | +---------------------------------- beginning of line
           +------------------------------------ substitution mode

What it means, in English, is: "perform a substitution: replace lines that begin with a [, contain a bunch of things (remembered as \1), then a ], a single space, and a number (remembered as \2) with the first bunch of things, a space, and the number."

The -e option means "the next argument is a sed script (expression) to run", and -i means "edit the file in place", that is, overwrite the input file with the output of the command.
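The same idea can be sketched with extended regular expressions (-E), where groups and ? need no backslashes. This variant is an assumption rather than the exact command above: it accepts one or more spaces before the last field, requires at least one digit or dot in it, and reads a hypothetical two-row sample from a pipe instead of editing a file in place.

```shell
# Hypothetical sample rows in the question's layout.
sample='[1 2 3 4] 5
[-6 7.5 8 9] -10'

# -E enables extended regexes; without -i the result goes to stdout.
cleaned=$(printf '%s\n' "$sample" |
  sed -E 's/^\[(.*)\] +(-?[0-9.]+)$/\1 \2/')

printf '%s\n' "$cleaned"
```

Running it without -i first is a cheap way to confirm the pattern matches before overwriting the real file.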

You can use awk:

awk  '{gsub(/[][]/,"",$1); gsub(/[][]/,"",$(NF-1))} 1' file
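One side effect worth knowing: when awk modifies a field, it rebuilds the whole record using the output field separator (a single space by default), so runs of spaces between columns collapse. A minimal sketch on a hypothetical row with uneven spacing:

```shell
# Hypothetical row with uneven spacing, as in the question.
row='[1   2    3   4]  5'

# Editing $1 or $(NF-1) makes awk rebuild the record with single spaces.
fixed=$(printf '%s\n' "$row" |
  awk '{gsub(/[][]/,"",$1); gsub(/[][]/,"",$(NF-1))} 1')

printf '%s\n' "$fixed"
```

If the original column alignment matters, apply gsub to the whole record instead, at the cost of no longer limiting the change to two columns.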

Or sed, but that will not be limited to the first or second-to-last column:

sed -e 's/[][]//g' file

The key is the regex [][]: when the closing ] comes immediately after the opening [, it is treated as a member of the character class rather than as the metacharacter that ends it. The class therefore contains the two characters ] and [.

awk '{gsub(/[\[\]]/,"")}1' file

col1      col2      col3      ..... coln-1   coln  
number   number    number    ..... number  number
