I have the following 2 tables, api_analytics_data, and telecordia.
CREATE TABLE `api_analytics_data` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`upload_file_id` bigint(20) NOT NULL,
`partNumber` varchar(100) DEFAULT NULL,
`clei` varchar(45) DEFAULT NULL,
`description` varchar(150) DEFAULT NULL,
`processed` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `idx_aad_clei` (`clei`),
KEY `idx_aad_pn` (`partNumber`),
KEY `id_aad_processed` (`processed`),
KEY `idx_combo1` (`partNumber`,`clei`,`upload_file_id`)
) ENGINE=InnoDB CHARSET=latin1;
CREATE TABLE `telecordia` (
`tid` int(11) NOT NULL AUTO_INCREMENT,
`ProdID` varchar(50) DEFAULT NULL,
`Mfg` varchar(20) DEFAULT NULL,
`Pn` varchar(50) DEFAULT NULL,
`Clei` varchar(50) DEFAULT NULL,
`Series` varchar(50) DEFAULT NULL,
`Dsc` varchar(50) DEFAULT NULL,
`Eci` varchar(50) DEFAULT NULL,
`AddDate` date DEFAULT NULL,
`ChangeDate` date DEFAULT NULL,
`Cost` float DEFAULT NULL,
PRIMARY KEY (`tid`),
KEY `telecordia.ProdID` (`ProdID`) USING BTREE,
KEY `telecordia.clei` (`Clei`),
KEY `telecordia.pn` (`Pn`),
KEY `telcordia.eci` (`Eci`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Users upload data via a web interface using Excel/CSV files into api_analytics_data. The data contains EITHER the partNumbers or CLEIs. I then update the api_analytics_data table by joining the telecordia table. The telecordia table is the master list of partNumber and Cleis.
So if a user uploads a file of CLEIs, the update/join I use is:
update api_analytics_data aad
inner join telecordia t on aad.clei = t.Clei
set aad.partNumber = t.Pn
where aad.partNumber is null
and aad.upload_file_id = 5;
It works quickly, but not very thoroughly. The problem I have is that the CLEI uploaded may only be a substring of the CLEI in the telecordia table.
For example, the uploaded CLEI may be " 5SC1DX0 ". In the telcordia table, the correct matching row is:
tid: 184324
ProdID: 472467
Mfg: PLSE
Pn: AUA58-2-REV-E
Clei: 5SC1DX04AA
Series: null
Dsc: DL SGL-PTY POTS CU RT
Eci: 205756
AddDate: 1994-03-18
ChangeDate: 1998-04-13
Cost: null
So obviously my update doesn't work in this case, even though 5SC1DX0 and 5SC1DX04AA are the same part.
What I need is a wildcard search. However, when I try this, it is crazy slow. With about 4500 rows uploaded into the api_analytics_data table, it runs for about 10 minutes, and then loses the connection with the server.
update api_analytics_data aad
inner join telecordia t on aad.clei like concat(t.Clei,'%')
set aad.partNumber = t.Pn
where aad.partNumber is null
and aad.upload_file_id = 5;
Is there a way to optimize this so that it runs quickly?
The correct answer is "no". The better course of action is to create a new column in telecordia
with the correct Clei
value in it, one that can be used for joining the tables. In the most recent versions of MySQL, this can even be a computed column and be indexed.
That said, you might be able to do something if the matching portion is always the same length. If so, try this:
update api_analytics_data aad inner join
telecordia t
on t.Clei = left(aad.clei, 7)
set aad.partNumber = t.Pn
where aad.partNumber is null and aad.upload_file_id = 5;
For this query, you want an index on api_analytics_data(upload_fiel_id, partNumber, clei)
and telecordia(clei, pn)
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.