Removing characters from a CSV file

I have a CSV file that contains data exported from a MySQL table. One of the fields contains a newline character, which splits the field across 2 lines. I'm trying to remove this newline character, but can't seem to do it.

Also, the same field may contain double quotes and commas, which caused trouble when I exported the table with the fields enclosed by " and terminated by , . So I terminated the fields with | instead, and didn't enclose them with anything.

When I cat the file on a Linux machine, the field looks like this:

13"\
58,20,"3

What the field is supposed to look like is

13"58,20,"3

When I used the vi "hex editor" ( :%!xxd ) to check the hex values of the line, I get

31 33 22 5c 0a 35 38 2c 32 30 2c 22 33

I tried using sed

sed -e 's/\\\n//'

and

sed -e 's/\x5c\x0a//'

to remove the newline, but they didn't work. How can I format the field to what it's supposed to look like?

Try:

$ sed '/\\$/{N; s/\\\n//}' file
13"58,20,"3

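sed reads its input one line at a time, so the newline is never part of the text being matched unless the next line is appended first. That is why the original attempts did not work.

/\\$/ selects lines that end with \ . For those lines, we read in the next line (command N ) and then we do a substitution to remove the unwanted \ and newline: s/\\\n// .

Lines that do not end with \ are passed through unchanged.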

This approach assumes that continued lines are continued just one time. If there were to be lines with two or more continuations, we would need a loop.
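If a field can be continued more than once, a loop version might look like the sketch below. It is only a sketch: the function name join_continuations is made up for illustration, and the sample input is invented to show a field split across three lines. The label :a and the branch command ba send sed back to re-check the joined line, so it keeps appending lines until the result no longer ends with \ .

```shell
# Hypothetical helper: join every line ending in a backslash with the line after it.
# Reads the named file(s), or standard input when no file is given.
join_continuations() {
  # :a       label marking the top of the loop
  # /\\$/    act only on lines that end with a backslash
  # N        append the next input line to the pattern space
  # s/\\\n// delete the backslash and the newline that follows it
  # ba       jump back to :a in case the joined line is continued again
  sed -e ':a' -e '/\\$/{' -e 'N' -e 's/\\\n//' -e 'ba' -e '}' "$@"
}

# Invented sample: one field split across three lines, then an ordinary line.
printf '13"\\\n58,20,"\\\n3\nnext|row\n' | join_continuations
```

The commands are passed as separate -e expressions so that the label and the closing brace are each terminated by the end of an expression, which is the more portable spelling. With GNU sed the one-liner sed ':a; /\\$/{N; s/\\\n//; ba}' file should behave the same way.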

One option to handle this on the MySQL side would be to use REPLACE() and remove the newline characters from the column (or columns) which contain them:

SELECT REPLACE(col, '\n', '')
FROM yourTable
INTO OUTFILE '/output.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';

I had the same problem. Using the HEX() function showed me that the field ended with 2 characters, CHAR(13) and CHAR(10) (CR and LF), so the solution is to replace both characters, i.e.

REPLACE(REPLACE(postcode,'\r',''),'\n','')
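If the file has already been exported, the same CR and LF cleanup can be approximated in the shell. The sketch below rests on two assumptions: the embedded LF shows up as a backslash followed by a newline, as in the question, and no carriage return in the file is worth keeping (tr deletes all of them, including any that belong to CRLF line endings). The function name clean_export and the sample postcode data are invented for illustration.

```shell
# Hypothetical helper: strip carriage returns, then join backslash-continued lines.
clean_export() {
  # tr -d '\r' deletes every CR byte in the stream (assumption: none are wanted)
  # the sed loop removes each backslash + newline pair, repeating via label :a
  tr -d '\r' | sed -e ':a' -e '/\\$/{' -e 'N' -e 's/\\\n//' -e 'ba' -e '}'
}

# Invented sample: a field containing CR, backslash, LF in the middle.
printf 'AB1\r\\\n2CD|x\n' | clean_export
```

To confirm which bytes are really in the file before cleaning it, od -c file (or the xxd view from the question) shows \r and \n explicitly.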
