简体   繁体   中英

Import specific columns from text-file into mysql.. is this possible?

I've just downloaded a bunch of text files from data.gov, and there are fields in the text file that I really don't need.

Is there a way to import columns [1,3] and leave the rest?

I figure I'll import using 'load data in file', but didn't see anything on the mysql page as to how to only import certain columns. http://dev.mysql.com/doc/refman/5.0/en/load-data.html

The fields are delimited by ^. Just so I'm clear, if a line in the txt file is

00111^first column entry^second column entry^this would be the 3rd column

I am trying to get my mysql table to contain

first column entry | this would be the 3rd column

You can import the specific columns with:

LOAD DATA LOCAL INFILE 'yourFile' INTO TABLE table_name
FIELDS TERMINATED BY '^' (column1, @dummy, column3, @dummy);

Put all columns which you don't need in @dummy.

You could always create a table with a dummy column(s) which you drop after loading the file (assuming you don't have to load the file very often).

Something like this:

LOAD DATA LOCAL INFILE '/path/to/file' INTO TABLE table_name
FIELDS TERMINATED BY '^' (dummy_column1, column1, dummy_column2, column2);
ALTER TABLE table_name DROP dummy_column1;
ALTER TABLE table_name DROP dummy_column2;

Assuming a Unix platform, you could filter the fields upstream.

cut -d^ -f2,4 mygovfile.dat > mytable.txt

To filter the first and third column, then import using your preferred method. For instance

mysqlimport --local -uxxx -pyyy mydb --fields-terminated-by="^" mytable.txt ....

The two most common ways of dealing with this:

  1. Import the data just as it is into a staging table, move what you need into your "real" tables, then truncate the staging table.
  2. Use a text utility to snip out just what you need.

My text utility of choice is awk. A minimal awk script--which probably won't work for you without some tweaking--would look like this.

$ awk 'BEGIN { FS="^";OFS=",";}{print $2, $4}' test.dat
first column entry,this would be the 3rd column

What kind of tweaking? It usually involves taking care of embedded commas, single quotes, and double quotes.

This part

BEGIN { FS="^";OFS=",";}{print $2, $4}

is the whole awk program.

awk rocks.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM