
Need a MySQL query with good performance to update and import data, moving rows that do NOT match the criteria to another table

I have a database consisting of:

  • A loaddata_temp Table (my main table);
  • An error_data Table;
  • A company_category Table;
  • A company_industry Table;
  • A company_level Table;
  • A company_type Table;

Here is my main table (loaddata_temp):

company_id  company_name                         company_parent_id  company_type_id  company_category_id  company_industry_id  company_level_id  
----------  -----------------------------------  -----------------  ---------------  -------------------  -------------------  ----------------  
         2  A Plus Lawn Care                                     0          Partner                   PT                  ATL       Head office  
         3  A. L. Price                                          0          Partner                   CV                  ATL       Head office  
         4  A.J. August Fashion Wear                             0          Partner                   UD                  ATL       Head office  
         5  A+ Electronics                                       0          Partner             KOPERASI                  LAT       Head office  
         6  A+ Investments                                       0         Customer               Warung                  AAA       CITY OFFICE  
         7  Aaronson Furniture                                   0            OTHER                   PT                  ATL       Head office  
         8  ABC Markets                                          0             Test                   CV                  ATL       Head office  

The main table has 8 primary key columns: company_parent_id, company_category_id, and so on (every column whose name contains _id is a primary key).

Here is one of my reference tables:

company_type_id  company_type_description  
---------------  ------------------------  
              1  Costumer                  
              2  Partner                   
              3  Other                     
             18  Competitor   

Background information on what I need in order to import data from CSV into MySQL:

I am following this code to import thousands of rows from a CSV file:

http://www.softwareprojects.com/resources/programming/t-how-to-use-mysql-fast-load-data-for-updates-1753.html

The link above really helped me improve my query (thanks to Dawn Rossi).

Before inserting into the real table (company), I need to verify the key values in loaddata_temp and convert each one into an id by looking it up in another reference table, such as company_category, company_industry, and so on.

So I wrote the following code:

$sql_updates[]="UPDATE company 
        LEFT JOIN   loaddata_temp 
        ON      company.company_name = loaddata_temp.company_name
        SET     loaddata_temp.company_name = COALESCE( concat('Error Found Duplicate ',loaddata_temp.company_name),loaddata_temp.company_name)";

$sql_updates[]="UPDATE loaddata_temp 
        LEFT JOIN   demography_country 
        ON  demography_country.demography_country_name = loaddata_temp.demography_country_id
        SET     loaddata_temp.demography_country_id = COALESCE(demography_country.demography_country_id, concat('Error ',loaddata_temp.demography_country_id,' Your Country Not In the List'))";

$sql_updates[]="UPDATE loaddata_temp 
        LEFT JOIN   demography_city 
        ON  demography_city.demography_city_name = loaddata_temp.demography_city_id
        SET     loaddata_temp.demography_city_id = COALESCE(demography_city.demography_city_id,concat('Error ',loaddata_temp.demography_city_id,' : Your city Not In the List'))";

$sql_updates[]="UPDATE loaddata_temp 
        LEFT JOIN   demography_provinces 
        ON  demography_provinces.demography_province_name = loaddata_temp.demography_province_id
        SET     loaddata_temp.demography_province_id = COALESCE(demography_provinces.demography_province_id, concat('Error ',loaddata_temp.demography_province_id,' : Your Province Not In the List'))";

$sql_updates[]="UPDATE loaddata_temp 
        LEFT JOIN   company_type 
        ON  company_type.company_type_description = loaddata_temp.company_type_id
        SET     loaddata_temp.company_type_id = COALESCE(company_type.company_type_id, concat('Error ',loaddata_temp.company_type_id,' : Your company Type Not In the List'))";

$sql_updates[]="UPDATE  loaddata_temp lt
        LEFT JOIN   company_category cc
        ON  cc.company_category_description = lt.company_category_id
        SET     lt.company_category_id = COALESCE(cc.company_category_id, concat('Error ',lt.company_category_id,' : Your Company Category Not In the List'))";

$sql_updates[]="UPDATE loaddata_temp 
        LEFT JOIN   company_level 
        ON  company_level.company_level_description = loaddata_temp.company_level_id
        SET     loaddata_temp.company_level_id = COALESCE(company_level.company_level_id,  concat('Error ',loaddata_temp.company_level_id,' : Your Company Level Not In the List'))";

$sql_updates[]="UPDATE loaddata_temp 
        LEFT JOIN   company_industry 
        ON  company_industry.company_industry_short_description = loaddata_temp.company_industry_id
        SET     loaddata_temp.company_industry_id = COALESCE(company_industry.company_industry_id, concat('Error ',loaddata_temp.company_industry_id,' : Your Company Industry Not In the List'))";

The result of the code above is:

company_id  company_name                         company_parent_id  company_type_id  company_category_id  company_industry_id  company_level_id  
----------  -----------------------------------  -----------------  ---------------  -------------------  -------------------  ----------------  
         2  A Plus Lawn Care                                     0                2                    3                    1                 1  
         3  A. L. Price                                          0                1                    4                    5                 1  
         4  A.J. August Fashion Wear                             0                2                    5                    7                 1  
         5  A+ Electronics                                       0                2                   23                Error                 1  
         6  A+ Investments                                       0                1                Error                Error            Errror  
         7  Aaronson Furniture                                   0                3                    3                    1                 1  
         8  ABC Markets                                          0            ERROR                    4                    1                 1  

From the result above, I need to separate (move) the rows that contain an error into the error_data table, using the following code:

INSERT INTO error_data 
    SELECT * FROM loaddata_temp 
    WHERE SUBSTRING_INDEX(company_name,' ',1)='Error' or
          company_category_id REGEXP '^[A-Za-z \:]+$' OR 
          company_type_id REGEXP '^[A-Za-z \:]+$' OR
          company_industry_id REGEXP '^[A-Za-z \:]+$' OR
          company_level_id REGEXP '^[A-Za-z \:]+$' OR
          demography_city_id REGEXP '^[A-Za-z \:]+$' OR
          demography_country_id REGEXP '^[A-Za-z \:]+$' OR
          demography_province_id REGEXP '^[A-Za-z \:]+$';

DELETE FROM loaddata_temp  
WHERE SUBSTRING_INDEX(company_name,' ',1)='Error' or
              company_category_id REGEXP '^[A-Za-z \:]+$' OR 
              company_type_id REGEXP '^[A-Za-z \:]+$' OR
              company_industry_id REGEXP '^[A-Za-z \:]+$' OR
              company_level_id REGEXP '^[A-Za-z \:]+$' OR
              demography_city_id REGEXP '^[A-Za-z \:]+$' OR
              demography_country_id REGEXP '^[A-Za-z \:]+$' OR
              demography_province_id REGEXP '^[A-Za-z \:]+$';

Now the loaddata_temp table contains no error rows, because they have already been moved into the error_data table (see the table below):

company_id  company_name                         company_parent_id  company_type_id  company_category_id  company_industry_id  company_level_id  
----------  -----------------------------------  -----------------  ---------------  -------------------  -------------------  ----------------  
         5  A+ Electronics                                       0                2                   23                Error                 1  
         6  A+ Investments                                       0                1                Error                Error            Errror  
         8  ABC Markets                                          0            ERROR                    4                    1                 1  

The Problem

  • I need a query to roll back the error_data table to the reference values, i.e. the original data without ids (see below).

    company_id  company_name    company_parent_id  company_type_id  company_category_id  company_industry_id  company_level_id
    ----------  --------------  -----------------  ---------------  -------------------  -------------------  ----------------
             5  A+ Electronics                  0          partner                   23                Error                 1
             6  A+ Investments                  0         costumer                Error                Error            Errror
             8  ABC Markets                     0            ERROR                    4                    1                 1

  • I need suggestions about all my code above to make it cleaner and give it good performance.

  • If there are any reference links related to my problem, please let me know.

    Your help is greatly appreciated!

Let me see if I can rephrase the question... You have a CSV file with strings (such as 'Head office') and you want to normalize those values according to a Normalization table, turning each string into a number (such as '3')?

Further, the CSV file may have new strings, so you need to INSERT new string-number pairs in the Normalization table?

Then you want to build (or add to) a table that has only the ids (such as '3') not the strings ('Head office').

I discuss a very efficient 2-SQL solution for that task in my blog. The first SQL statement discovers and inserts any new rows (INSERT ... SELECT ... LEFT JOIN ...). The second discovers all the ids (UPDATE ... JOIN ... SET ...).
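
As a rough illustration of that two-statement pattern, here is a minimal sketch for one of the lookups, company_type. It is not the code from the blog. It assumes that loaddata_temp keeps the raw CSV string in a separate, hypothetical column company_type_name, that loaddata_temp.company_type_id is a nullable integer column that starts out NULL, and that company_type.company_type_id is AUTO_INCREMENT. None of those details are confirmed by the question.

```sql
-- Step 1: discover strings that are not yet in the normalization table
-- and insert them. Assumes company_type.company_type_id is AUTO_INCREMENT,
-- so only the description has to be supplied.
-- company_type_name is a hypothetical staging column holding the raw CSV string.
INSERT INTO company_type (company_type_description)
SELECT DISTINCT lt.company_type_name
FROM loaddata_temp AS lt
LEFT JOIN company_type AS ct
       ON ct.company_type_description = lt.company_type_name
WHERE ct.company_type_id IS NULL          -- no matching row was found
  AND lt.company_type_name IS NOT NULL;   -- skip rows with no value at all

-- Step 2: copy the matching id into every staging row.
-- The raw string stays untouched in company_type_name.
UPDATE loaddata_temp AS lt
JOIN company_type AS ct
  ON ct.company_type_description = lt.company_type_name
SET lt.company_type_id = ct.company_type_id;
```

If unknown strings should be rejected rather than added, as in the question, step 1 could be skipped; any row whose company_type_id is still NULL after step 2 then had no match. Because the original string is kept in its own column, nothing has to be rolled back for such rows. An index on company_type_description would likely help both joins.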

(If this is not the thrust of your question, then I have to complain about the lack of simplicity/clarity.)

