
PHP/MySQL - Performance considerations querying (select only) 48k rows

I am currently attempting to build a web application that relies quite heavily on postcode data (supplied by OS CodePoint Open). The postcode database has 120 tables, split by the initial postcode prefix (i.e. SE, WS, B). Each of these tables contains between 11k and 48k rows with 3 fields (Postcode, Lat, Lng).

What I need is for a user to come online and enter their postcode, e.g. SE1 1LD, which then selects the SE table and converts the postcode into a lat/lng pair.
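
In other words, roughly this (the per-prefix table name is just how I've named things):

SELECT Lat, Lng FROM SE WHERE Postcode = 'SE1 1LD';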

I am fine with doing this at the PHP level. My concern is... well, the huge number of rows that will be queried, and whether that is going to grind my website to a halt.

If there are any techniques I should know about, please do let me know. I've never worked with tables this big before!

Thanks :)

48K is not a big number; 48 million is. :) If your tables are properly indexed (put indexes on the fields you use in the WHERE clause), it won't be a problem at all.

Avoid LIKE, and use INNER JOINs instead of LEFT JOINs if possible.
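
For example, with one table per prefix, an index on the lookup column is all it takes (the table name se is illustrative):

ALTER TABLE se ADD INDEX idx_postcode (Postcode);

-- the point lookup then uses the index instead of scanning all ~48k rows:
SELECT Lat, Lng FROM se WHERE Postcode = 'SE11LD';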

Selecting from 48k rows in MySQL is not big; in fact it's rather small. Index it properly and you are fine.

If I understand correctly, there is an SE table, a WS one, a B one, etc. In all, 120 tables with the same structure (Postcode, Lat, Lng).

I strongly suggest you normalize the tables.

You can have either one table:

postcode( prefix, postcode, lat, lng)

or two:

postcode( prefixid, postcode, lat, lng )

prefix( prefixid, prefix ) 

The postcode table will of course be much bigger than any single 11K-48K-row table, about 30K x 120 = 3.6M rows in total, but it will save you from writing a different query for every prefix, and from quite complex ones if, for example, you want to search by latitude and longitude (imagine a query that has to search 120 tables).

If you are not convinced, try adding a person table so you can store data about your users. How would that table relate to the postcode table(s)?
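
A minimal sketch of the two-table version, with a person table hanging off it (all names, types and the engine choice are my assumptions, not from the original data):

CREATE TABLE prefix (
    prefixid SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    prefix   VARCHAR(4) NOT NULL,
    PRIMARY KEY (prefixid)
) ENGINE=InnoDB;

CREATE TABLE postcode (
    prefixid SMALLINT UNSIGNED NOT NULL,
    postcode VARCHAR(8) NOT NULL,
    lat      DECIMAL(9,6) NOT NULL,
    lng      DECIMAL(9,6) NOT NULL,
    PRIMARY KEY (postcode),
    FOREIGN KEY (prefixid) REFERENCES prefix (prefixid)
) ENGINE=InnoDB;

-- one person table can now reference one postcode table instead of 120:
CREATE TABLE person (
    personid INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name     VARCHAR(100) NOT NULL,
    postcode VARCHAR(8) NOT NULL,
    PRIMARY KEY (personid),
    FOREIGN KEY (postcode) REFERENCES postcode (postcode)
) ENGINE=InnoDB;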


EDIT

Since the prefix is just the first characters of the postcode, which is also the primary key, there is no need for an extra field or a second table. I would simply combine the 120 tables into one:

postcode( postcode, lat, lng )

Then queries like:

SELECT * 
FROM postcode
WHERE postcode = 'SE11LD'

or

SELECT * 
FROM postcode
WHERE postcode LIKE 'SE%'

will be fast, as they will be using the primary key index.
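
A possible definition for the combined table (the column types are my guess; DECIMAL keeps the coordinates exact):

CREATE TABLE postcode (
    postcode VARCHAR(8) NOT NULL,
    lat      DECIMAL(9,6) NOT NULL,
    lng      DECIMAL(9,6) NOT NULL,
    PRIMARY KEY (postcode)
) ENGINE=InnoDB;

Note that LIKE 'SE%' can use the primary key only because the wildcard is at the end; a pattern such as LIKE '%1LD' would force a full scan.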

As long as you have indexes on the appropriate columns, there should be no problem. One of my customers has the postcode database stored in a table like:

CREATE TABLE `postcode_geodata` (
`postcode` varchar(8) NOT NULL DEFAULT '',
`x_coord` float NOT NULL DEFAULT '0',
`y_coord` float NOT NULL DEFAULT '0',
UNIQUE KEY `postcode_idx` (`postcode`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

And we have no problems (from a performance point of view) in querying that.
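
For instance, a point lookup like this one (the postcode value is illustrative) hits the UNIQUE KEY and reads a single row:

SELECT x_coord, y_coord
FROM postcode_geodata
WHERE postcode = 'SE11LD';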

If your table did become really large, then you could always look at using MySQL's partitioning support - see http://dev.mysql.com/doc/refman/5.1/en/partitioning.html - but I wouldn't look at that until you've done the easier things first (see below).

If you think performance is an issue, turn on MySQL's slow_query_log (see /etc/mysql/my.cnf) and see what it says (you may also find the command 'mysqldumpslow' useful at this point for analysing the slow query log).
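
On MySQL 5.1 and later you can also switch the log on at runtime instead of editing my.cnf (the threshold value is illustrative):

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1; -- log anything slower than 1 second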

Also try using the 'explain' syntax on the MySQL cli - eg

EXPLAIN SELECT a,b,c FROM `table` WHERE d = 'foo' AND e = 'bar'

These steps will help you optimise the database - by identifying which indexes are (or aren't) being used for a query.
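
Applied to the combined postcode table sketched above, you would want to see the primary key being chosen:

EXPLAIN SELECT lat, lng FROM postcode WHERE postcode = 'SE11LD';
-- good sign: key = PRIMARY, type = const, rows = 1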

Finally, there's the mysqltuner.pl script (see http://mysqltuner.pl ) which helps you optimise the MySQL server's settings (e.g. query cache, memory usage, etc., which affect I/O and therefore performance/speed).
