
Update data in MySQL if the row does not exist, using Python

Context: I have a table in a MySQL database in the format below. Each row holds one day of price and volume data for one stock.

Ticker,Date/Time,Open,High,Low,Close,Volume
AAA,7/15/2010,19.581,20.347,18.429,18.698,174100
AAA,7/16/2010,19.002,19.002,17.855,17.855,109200
BBB,7/19/2010,19.002,19.002,17.777,17.777,104900
BBB,7/19/2010,19.002,19.002,17.777,17.777,104900
CCC,7/19/2010,19.002,19.002,17.777,17.777,104900
....100000 rows

This table was created by importing data from multiple *.txt files that all share the same columns and format. Each *.txt file is named after the ticker in the Ticker column, e.g. importing AAA.txt gives me the 2 rows of AAA data.

All these *.txt files are generated automatically by a system that retrieves stock prices in my country. Every day, after the stock market closes, each .txt file gains one new row with that day's data.

Question: every day, how can I load the new row from each .txt file into the database? I do not want to reload all the data from the .txt files into the MySQL table every day because that takes a lot of time; I only want to load the new rows.

How should I write the code to do this update?

(1) Create or reuse an empty stage table, no primary ... :

 create table db.temporary_stage (
    ... same columns as your original table, but no constraints, keys, or indexes ....
 )

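(2) Bulk-load the file into the stage table. This should be really fast:

  LOAD DATA INFILE 'data.txt' INTO TABLE db.temporary_stage
    FIELDS TERMINATED BY ',' IGNORE 1 LINES;  -- the sample data is comma-separated with a header row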
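If the load is driven from Python instead, the rows first have to be parsed into typed values. The following is a minimal sketch that assumes each file starts with the header line shown in the question and that dates look like `7/15/2010`; the function name `parse_price_rows` is hypothetical and not part of the original answer.

```python
import csv
import io
from datetime import datetime


def parse_price_rows(text):
    """Parse comma-separated price rows into typed tuples.

    Assumes the first line is the header
    Ticker,Date/Time,Open,High,Low,Close,Volume.
    """
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append((
            rec["Ticker"],
            # Convert month/day/year text into a date object.
            datetime.strptime(rec["Date/Time"], "%m/%d/%Y").date(),
            float(rec["Open"]),
            float(rec["High"]),
            float(rec["Low"]),
            float(rec["Close"]),
            int(rec["Volume"]),
        ))
    return rows


sample = """Ticker,Date/Time,Open,High,Low,Close,Volume
AAA,7/15/2010,19.581,20.347,18.429,18.698,174100
AAA,7/16/2010,19.002,19.002,17.855,17.855,109200
"""
rows = parse_price_rows(sample)
```

The resulting tuples can be passed to a database driver's bulk insert call for the stage table; which driver to use is left open here.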

(3) Join on id, then use a hash function to skip every row that hasn't changed. The following can be improved, but overall, bulk loads are much faster than row-by-row writes when you have many rows, mostly because of how the database moves data around internally: it can do its upkeep far more efficiently all at once than a little at a time.

   -- MySQL multi-table UPDATE; hash() is a placeholder, e.g. MD5() in MySQL
   UPDATE mytable
     JOIN temporary_stage
       ON mytable.id = temporary_stage.id
      SET mytable... = temporary_stage...,
          mytable.precomputed_hash = hash(concat( .... ))
    WHERE mytable.precomputed_hash != hash(concat( .... ));
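The hash comparison in step (3) can also be illustrated outside the database. This is a rough sketch, assuming rows are keyed by an id and that MD5 over the joined column values is an acceptable fingerprint; `row_hash` and `changed_ids` are hypothetical helper names, not something the original answer defines.

```python
import hashlib


def row_hash(row):
    """Fingerprint a row by hashing its column values joined with '|'."""
    joined = "|".join(str(value) for value in row)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def changed_ids(stored_hashes, staged_rows):
    """Return ids present in both mappings whose staged row hashes differently.

    stored_hashes: {id: hash already saved in the main table}
    staged_rows:   {id: row tuple freshly loaded into the stage table}
    """
    return [
        row_id
        for row_id, row in staged_rows.items()
        if row_id in stored_hashes and stored_hashes[row_id] != row_hash(row)
    ]


main_rows = {
    1: ("AAA", "2010-07-15", 18.698, 174100),
    2: ("AAA", "2010-07-16", 17.855, 109200),
}
stored = {row_id: row_hash(row) for row_id, row in main_rows.items()}

# Row 2 arrives with a corrected volume; row 1 is unchanged.
staged = {
    1: ("AAA", "2010-07-15", 18.698, 174100),
    2: ("AAA", "2010-07-16", 17.855, 109300),
}
to_update = changed_ids(stored, staged)
```

Only the ids in `to_update` would need an UPDATE; unchanged rows are skipped without comparing every column individually.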

# clean up

DELETE FROM temporary_stage;
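The steps above only update rows that already exist, while the question asks about loading rows that are new. One possible way to do that with the same stage table is an anti-join insert, sketched below. To keep the example runnable without a server it uses Python's built-in `sqlite3` as a stand-in for MySQL; the table name `prices`, the column names, and the choice of (ticker, trade_date) as the row identity are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
cols = ("ticker TEXT, trade_date TEXT, open REAL, high REAL, "
        "low REAL, close REAL, volume INTEGER")
conn.execute(f"CREATE TABLE prices ({cols})")
conn.execute(f"CREATE TABLE temporary_stage ({cols})")


def load_new_rows(conn, rows):
    """Stage all rows, insert only those missing from prices, then clean up."""
    before = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
    conn.executemany(
        "INSERT INTO temporary_stage VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    # Anti-join: keep staged rows with no match on (ticker, trade_date).
    # DISTINCT drops exact duplicates inside the staged file itself.
    conn.execute("""
        INSERT INTO prices
        SELECT DISTINCT s.*
        FROM temporary_stage AS s
        LEFT JOIN prices AS p
               ON p.ticker = s.ticker AND p.trade_date = s.trade_date
        WHERE p.ticker IS NULL
    """)
    conn.execute("DELETE FROM temporary_stage")
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
    return after - before


day1 = [
    ("AAA", "2010-07-15", 19.581, 20.347, 18.429, 18.698, 174100),
    ("AAA", "2010-07-16", 19.002, 19.002, 17.855, 17.855, 109200),
]
# The next day the file contains the old rows plus one new row.
day2 = day1 + [("AAA", "2010-07-19", 19.002, 19.002, 17.777, 17.777, 104900)]

first = load_new_rows(conn, day1)
second = load_new_rows(conn, day2)
total = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
```

Against MySQL, the staging step would be the `LOAD DATA INFILE` from step (2) rather than `executemany`, and an index on the main table's (ticker, date) columns would likely be needed for the join to stay fast.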

