This time I am asking about a query that works in general but fails when the data set is large.
My case is the same as the one in this post.
I used:
SELECT `ID`, `PDBID`, `Chain`, `UniProtID`, `PDBASequence`, `pI`, `experiment`, `resolution`
FROM protein p
WHERE `resolution`= (SELECT MAX(`resolution`)
FROM protein
GROUP BY `PDBASequence`
HAVING `PDBASequence` = p.`PDBASequence`)
I also tried:
SELECT `ID`, `PDBID`, `Chain`, `UniProtID`, `PDBASequence`, `pI`, `experiment`, `resolution`
FROM protein p
WHERE `resolution`= (SELECT MAX(`resolution`)
FROM protein
WHERE `PDBASequence` = p.`PDBASequence`)
I need to group the rows by PDBASequence, and the representative selected for each group must be the row with the maximum resolution value.
I tried this code on a small data set and it worked without problems. However, when I ran it on the real table, which has 80980 rows, execution takes almost forever. In addition, my other computer gives a "MySQL server has gone away" error because of the execution time and packet size. I fixed the settings in my.ini and ran the code again, but nothing changed: still no result. I also added an index on resolution in the protein table, but it did not change anything. What should I do? Thanks.
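See if this version is any better. It computes the maximum resolution for each PDBASequence once in a derived table and joins back to it, instead of using a correlated subquery, which MySQL may re-evaluate for every outer row.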
SELECT p.`ID`, p.`PDBID`, p.`Chain`, p.`UniProtID`, p.`PDBASequence`, p.`pI`, p.`experiment`, p.`resolution`
FROM protein p
INNER JOIN (SELECT PDBASequence, MAX(`resolution`) AS MaxResolution
FROM protein
GROUP BY `PDBASequence`) q
ON p.PDBASequence = q.PDBASequence
AND p.resolution = q.MaxResolution
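If the join is still slow, an index that covers the grouping column may help more than the index on resolution alone, since both the GROUP BY and the join look rows up by PDBASequence. The following is only a sketch under assumptions: the index name idx_protein_seq_res is made up, and the prefix length of 100 assumes PDBASequence is a long VARCHAR or TEXT column, which the question does not state. EXPLAIN can then show whether MySQL actually picks the index.

```sql
-- Hypothetical index name; the prefix length (100) is an arbitrary illustrative
-- value and assumes PDBASequence is a long VARCHAR or TEXT column.
-- Remove "(100)" if the column is short enough to be indexed in full.
CREATE INDEX idx_protein_seq_res ON protein (PDBASequence(100), resolution);

-- Inspect the execution plan of the join version to see which indexes are used.
EXPLAIN
SELECT p.`ID`, p.`PDBID`, p.`Chain`, p.`UniProtID`, p.`PDBASequence`, p.`pI`, p.`experiment`, p.`resolution`
FROM protein p
INNER JOIN (SELECT `PDBASequence`, MAX(`resolution`) AS MaxResolution
            FROM protein
            GROUP BY `PDBASequence`) q
ON p.`PDBASequence` = q.`PDBASequence`
AND p.`resolution` = q.MaxResolution;
```

Note that a prefix index only stores the first characters of each sequence, so how much it helps depends on how early the sequences differ from one another.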