PHP/MySQL - Performance considerations querying (select only) 48k rows

I am currently building a web application that relies quite heavily on postcode data (supplied by OS CodePoint Open). The postcode database has 120 tables, one per initial postcode prefix (e.g. SE, WS, B). Each table holds between 11k and 48k rows with 3 fields (Postcode, Lat, Lng).

What I need is for a user to come online and enter their postcode, e.g. SE1 1LD; the application then selects the SE table and converts the postcode into a lat/lng.

I am fine with doing this on a PHP level. My concern is the huge number of rows that will be queried, and whether it is going to grind my website to a halt.

If there are any techniques that I should know about, please do let me know. I've never worked with tables this big!

Thanks :)

48K is not a big number. 48 million is. :) If your tables are properly indexed (put indexes on the fields you use in the WHERE clause), it won't be a problem at all.
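As a minimal sketch of what "properly indexed" could look like for one of the per-prefix tables: the table name SE and the columns Postcode, Lat and Lng come from the question, while the index name and the assumption that Postcode is a VARCHAR column holding unique values are mine.

```sql
-- Assumed: a per-prefix table named SE with columns Postcode, Lat, Lng,
-- where Postcode is a VARCHAR column and each postcode appears once.
-- The index name idx_se_postcode is illustrative.
CREATE UNIQUE INDEX idx_se_postcode ON SE (Postcode);

-- The lookup filters on the indexed column, so MySQL can seek
-- to the matching row instead of scanning every row in the table.
SELECT Lat, Lng
FROM SE
WHERE Postcode = 'SE1 1LD';
```

If Postcode is already declared as the table's primary key, no extra index is needed, since the primary key is itself an index.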

Avoid LIKE, and use INNER JOINs instead of LEFT JOINs if possible.

Selecting from 48k rows in MySQL is not big; in fact it's rather small. Index it properly and you are fine.

If I understand correctly, there is an SE table, a WS one, a B one, etc. In all, 120 tables with the same structure (Postcode, Lat, Lng).

I strongly propose you normalize the tables.

You can have either one table:

postcode( prefix, postcode, lat, lng)

or two:

postcode( prefixid , postcode, lat, lng )

prefix( prefixid, prefix ) 

The postcode table will be bigger than 11K-48K rows, about 30K x 120 = 3.6M rows, but it will save you writing a different query for every prefix, and quite complex ones if, for example, you want to search by latitude and longitude (imagine a query that searches in 120 tables).
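To make the latitude/longitude case concrete, here is a rough sketch of the one-table layout with a bounding-box search. The column types, index names and coordinate values are assumptions chosen for illustration; only the column list (prefix, postcode, lat, lng) comes from the layout above.

```sql
-- Hypothetical single-table layout; types and index names are assumptions.
CREATE TABLE postcode (
    prefix   VARCHAR(2)    NOT NULL,
    postcode VARCHAR(8)    NOT NULL,
    lat      DECIMAL(9, 6) NOT NULL,
    lng      DECIMAL(9, 6) NOT NULL,
    PRIMARY KEY (postcode),
    KEY idx_prefix (prefix),
    KEY idx_lat_lng (lat, lng)
);

-- One query covers every prefix: all postcodes inside a bounding box.
-- The coordinates are illustrative values only.
SELECT postcode, lat, lng
FROM postcode
WHERE lat BETWEEN 51.49 AND 51.51
  AND lng BETWEEN -0.10 AND -0.08;
```

With 120 separate tables, the same search would have to be written as 120 SELECT statements glued together with UNION ALL, or issued as 120 separate queries from PHP.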

If you are not convinced, try adding a person table so you can store data for your users. How will this table be related to the postcode table(s)?
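A sketch of how that relationship might look once there is a single postcode table. The person table, its columns and the constraint name are hypothetical, and foreign keys assume both tables use the InnoDB engine with matching column types.

```sql
-- Minimal stand-in for the combined postcode table (types are assumptions).
CREATE TABLE IF NOT EXISTS postcode (
    postcode VARCHAR(8)    NOT NULL,
    lat      DECIMAL(9, 6) NOT NULL,
    lng      DECIMAL(9, 6) NOT NULL,
    PRIMARY KEY (postcode)
) ENGINE=InnoDB;

-- Hypothetical person table; names and types are illustrative.
CREATE TABLE person (
    person_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name      VARCHAR(100) NOT NULL,
    postcode  VARCHAR(8)   NOT NULL,
    PRIMARY KEY (person_id),
    KEY idx_person_postcode (postcode),
    CONSTRAINT fk_person_postcode
        FOREIGN KEY (postcode) REFERENCES postcode (postcode)
) ENGINE=InnoDB;

-- Fetch a user's coordinates with a single join.
SELECT p.name, pc.lat, pc.lng
FROM person AS p
INNER JOIN postcode AS pc ON pc.postcode = p.postcode
WHERE p.person_id = 1;
```

A foreign key can reference only one parent table, so with 120 postcode tables there is no single table for person.postcode to point at, and the join target would have to be chosen in application code.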


EDIT

Since the prefix is just the first characters of the postcode, which is also the primary key, there is no need for an extra field or a second table. I would simply combine the 120 tables into one:

postcode( postcode, lat, lng )

Then queries like:

SELECT *
FROM postcode
WHERE postcode = 'SE11LD'

or

SELECT *
FROM postcode
WHERE postcode LIKE 'SE%'

will be fast, as they will be using the primary key index.
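Combining the tables could be done along these lines. This is only a sketch: the column types are assumptions, SE stands in for each of the 120 source tables, and it presumes no postcode appears in more than one source table.

```sql
-- Combined table as described above; column types are assumptions.
CREATE TABLE postcode (
    postcode VARCHAR(8)    NOT NULL,
    lat      DECIMAL(9, 6) NOT NULL,
    lng      DECIMAL(9, 6) NOT NULL,
    PRIMARY KEY (postcode)
);

-- Copy the rows of one per-prefix table into the combined table.
-- Repeat once for each source table (SE, WS, B, ...).
INSERT INTO postcode (postcode, lat, lng)
SELECT Postcode, Lat, Lng
FROM SE;
```

One caveat for the LIKE form: a single-letter prefix such as B needs more care, because LIKE 'B%' also matches other areas that start with the same letter (BA, BS and so on).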

As long as you have indexes on the appropriate columns, there should be no problem. One of my customers has the postcode database stored in a table like:

CREATE TABLE `postcode_geodata` (
  `postcode` varchar(8) NOT NULL DEFAULT '',
  `x_coord` float NOT NULL DEFAULT '0',
  `y_coord` float NOT NULL DEFAULT '0',
  UNIQUE KEY `postcode_idx` (`postcode`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

And we have no problems (from a performance point of view) in querying that.

If your table did become really large, then you could always look at using MySQL's partitioning support (see http://dev.mysql.com/doc/refman/5.1/en/partitioning.html), but I wouldn't look at that until you've done the easier things first (see below).

If you think performance is an issue, turn on MySQL's slow_query_log (see /etc/mysql/my.cnf) and see what it says (you may also find the command 'mysqldumpslow' useful at this point for analysing the slow query log).
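As an alternative to editing my.cnf, the log can usually be switched on at runtime from the MySQL CLI. This is a sketch, not a recommended configuration: the one-second threshold is an arbitrary starting point, and the account needs sufficient privileges to set global variables.

```sql
-- Enable the slow query log for the running server.
-- Settings changed this way are lost when the server restarts.
SET GLOBAL slow_query_log = 'ON';

-- Log statements that take longer than 1 second (arbitrary threshold).
-- The new global value applies to connections opened after this point.
SET GLOBAL long_query_time = 1;

-- Optionally also log statements that do not use an index.
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Confirm the setting and find out where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log%';
```

To keep the settings across restarts, the same variables would go in the [mysqld] section of my.cnf.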

Also try using the EXPLAIN syntax on the MySQL CLI, e.g.

EXPLAIN SELECT a, b, c FROM some_table WHERE d = 'foo' AND e = 'bar';

These steps will help you optimise the database by identifying which indexes are (or aren't) being used for a query.
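Applied to the postcode lookup, the check might look like this. The table and column names assume a single postcode table with lat and lng columns and a primary or unique key on postcode; adjust them to the real schema.

```sql
-- Ask MySQL how it plans to execute the lookup, without running it.
-- Assumed schema: postcode(postcode, lat, lng), keyed on postcode.
EXPLAIN
SELECT lat, lng
FROM postcode
WHERE postcode = 'SE11LD';
```

The columns worth reading first are `type`, `key` and `rows`. If `key` names the index on postcode and `rows` is close to 1, the index is being used. If `key` is NULL and `type` is ALL, MySQL is scanning the whole table and an index is probably missing.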

Finally, there's the mysqltuner.pl script (see http://mysqltuner.pl ), which helps you optimise the MySQL server's settings (e.g. query cache, memory usage, etc., which will affect I/O and therefore performance/speed).
