A more efficient way to execute SQL queries

Question

I've been using the PHP and SQL below to load schedule information and real-time information for passenger trains in the UK. Essentially, you have to find the relevant schedules and then load the real-time information for each schedule, which is in a different table relating to today's trains.

The query is taking a little longer than is really ideal and using a lot of CPU, which again isn't ideal. I'm pretty weak when it comes to SQL programming, so any pointers as to what is inefficient would be great.

This is for an Android app, so I've tried to do it all with one call over HTTP. The print("*") and print('>') calls are there so the string can be split at the other end.

Here is the code:

<?

//Connect to the database
 mysql_connect("localhost","XXXX","XXXX")
or die ("No connection could be made to the OpenRail Database");
mysql_select_db("autotrain");
//Set today's date from the system and get the HTTP parameters for the station, the time to find trains, and today's locations table.
$date = date('Y-m-d');
$test = $_GET['station'];
$time = $_GET['time'];
$table = $_GET['table'];

//Find the tiploc associated with the station being searched.
$tiplocQuery = "SELECT tiploc_code FROM allstations WHERE c LIKE '$test';";
$tiplocResult =mysql_query($tiplocQuery);
$tiplocRow = mysql_fetch_assoc($tiplocResult);

$tiploc=$tiplocRow['tiploc_code'];
//Now find the timetabled trains for the station where there exists no departure information. Goes back two hours to account for any late running.
$timeTableQuery = "SELECT tiplocs.tps_description AS 'C', locations$table.public_departure, locations$table.id,schedules.stp_indicator
,schedules.train_uid
FROM locations$table, tiplocs, schedules_cache, schedules,activations
WHERE locations$table.id = schedules_cache.id
AND schedules_cache.id = schedules.id
AND schedules.id =activations.id
AND '$date'
BETWEEN schedules.date_from
AND schedules.date_to
AND locations$table.tiploc_code = '$tiploc'
AND locations$table.real_departure LIKE '0'
AND locations$table.public_departure NOT LIKE '0'
AND locations$table.public_departure >='$time'-300
AND locations$table.public_departure <='$time'+300
AND schedules.runs_th LIKE '1'
AND schedules_cache.destination = tiplocs.tiploc
ORDER BY locations$table.public_departure ASC
LIMIT 0,30;";

$timeTableResult=mysql_query($timeTableQuery);


while($timeTablerow = mysql_fetch_assoc($timeTableResult)){
$output[] = $timeTablerow;

}

//Now for each id returned in the timetable, get the locations and departure times so the app may calculate expected arrival times.
foreach ($output as $value) {
$id = $value['id'];
$realTimeQuery ="SELECT locations$table.id,locations$table.location_order,locations$table.arrival,locations$table.public_arrival,
locations$table.real_arrival,locations$table.pass,locations$table.departure,locations$table.public_departure,locations$table.real_departure,locations$table.location_cancelled,
tiplocs.tps_description FROM locations$table,tiplocs WHERE id =$id AND locations$table.tiploc_code=tiplocs.tiploc;";

$realTimeResult =mysql_query($realTimeQuery);
while($row3 = mysql_fetch_assoc($realTimeResult)){
    $output3[] = $row3;
}
print json_encode($output3);
print("*");
unset($output3);
unset($id);
}


print('>');
print json_encode($output);

?>

Many thanks, Matt

Answer 1

A few things I noticed.

First, you are joining tables in the where clause, like this:

from table1, table2
where table1.something = table2.something

Joining in the from clause is faster:

from table1 join table2 on table1.something = table2.something

Next, I'm not a PHP programmer, but it looks like you are running similar queries inside a loop. If that's true, look for a way to run just one query.

Edit starts here

This is in response to gazarsgo's request that I back up my claim about joins in the from clause being faster. He is right, I was wrong. This is what I did. The programming language is ColdFusion:

<cfsetting showdebugoutput="no">
<cfscript>
fromtimes = ArrayNew(1);
wheretimes = ArrayNew(1);
</cfscript>

<cfloop from="1" to="1000" index="idx">
<cfquery datasource="burns" name="fromclause" result="fromresult">
select count(distinct hscnumber)
from burns_patient p join burns_case c on p.patientid = c.patientid
</cfquery>
<cfset ArrayAppend(fromtimes, fromresult.executiontime)>

<cfquery datasource="burns" name="whereclause" result="whereresult">
select count(distinct hscnumber)
from burns_patient p, burns_case c 
where p.patientid = c.patientid
</cfquery>
<cfset ArrayAppend(wheretimes, whereresult.executiontime)>
</cfloop>
<cfdump var="#ArrayAvg(fromtimes)#" metainfo="no" label="from">
 <cfdump var="#ArrayAvg(wheretimes)#" metainfo="no" label="where">

I ran it 5 times. The results, in milliseconds, follow.

9.563 9.611
9.498 9.584
9.625 9.548
9.831 9.769
9.792 9.813

The first number represents joining in the from clause, the second joining in the where clause. The first number is lower only 60% of the time. Had it been lower 100% of the time, it would have shown that joining in the from clause is faster, but that's not the case.
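
As a quick check on those figures, the per-column averages can be computed directly. This is only a sketch in MySQL-style SQL; the column names from_ms and where_ms are made up for illustration, and the values are the five runs listed above.

```sql
-- Average the five timings for each join style.
-- from_ms / where_ms are illustrative names, not part of any real schema.
SELECT AVG(from_ms)  AS avg_from_ms,
       AVG(where_ms) AS avg_where_ms
FROM (
    SELECT 9.563 AS from_ms, 9.611 AS where_ms
    UNION ALL SELECT 9.498, 9.584
    UNION ALL SELECT 9.625, 9.548
    UNION ALL SELECT 9.831, 9.769
    UNION ALL SELECT 9.792, 9.813
) AS runs;
```

The two averages come out at roughly 9.662 ms and 9.665 ms, a gap of about 0.003 ms, which is far smaller than the variation from one run to the next.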

Answer 2

The biggest issue with your setup is this foreach loop: it is unnecessary and results in n round trips to the database, each one executing a query and then fetching and processing the results.

foreach ($output as $value) {

Rewrite the initial query to include all of the fields you will need for your later calculations.

Something like this would work.

SELECT tl.tps_description AS 'C', lc.public_departure, lc.id, s.stp_indicator, s.train_uid,
lc.location_order, lc.arrival, lc.public_arrival, lc.real_arrival, lc.pass, lc.departure, lc.real_departure, lc.location_cancelled
FROM locations$table lc INNER JOIN schedules_cache sc ON lc.id = sc.id
  INNER JOIN schedules s ON s.id = sc.id
  INNER JOIN activations a ON s.id = a.id
  INNER JOIN tiplocs tl ON sc.destination = tl.tiploc
WHERE '$date' BETWEEN s.date_from AND s.date_to
  AND lc.tiploc_code = '$tiploc'
  AND lc.real_departure LIKE '0'
  AND lc.public_departure NOT LIKE '0'
  AND lc.public_departure >='$time'-300
  AND lc.public_departure <='$time'+300
  AND s.runs_th LIKE '1'
ORDER BY lc.public_departure ASC
LIMIT 0,30;

Eliminating n query executions from your page load should dramatically reduce response time.
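
If the app still needs every calling point for each train (which is what the original loop fetched), one possible way to keep that while avoiding n queries is to collect the ids from the timetable query and fetch all of their locations in a single statement. The following is a rough sketch, not a tested drop-in: locations_today stands in for whatever locations$table expands to, and the ids 101, 102, 103 are placeholders for the values returned by the first query.

```sql
-- One query for the calling points of every train found by the timetable query.
-- locations_today and the id values are placeholders (assumptions for illustration).
SELECT lc.id, lc.location_order, lc.arrival, lc.public_arrival, lc.real_arrival,
       lc.pass, lc.departure, lc.public_departure, lc.real_departure,
       lc.location_cancelled, tl.tps_description
FROM locations_today lc
  INNER JOIN tiplocs tl ON lc.tiploc_code = tl.tiploc
WHERE lc.id IN (101, 102, 103)
ORDER BY lc.id, lc.location_order;
```

Because the rows come back ordered by id, the PHP side can group them per train in a single pass instead of issuing one query per id.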

Answer 3

Ignoring the problems with the code, in order to speed up your query, use the EXPLAIN command to evaluate where you need to add indexes to your tables.

At a guess, you will probably want to create an index on whatever locations$table.public_departure evaluates to.

http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
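
As a rough illustration of that workflow, EXPLAIN can be put in front of the filtering part of the timetable query, and an index added if the plan shows a full table scan. Everything specific here is an assumption: locations_today stands in for locations$table, 'EXMPL' is a made-up tiploc code, the time bounds are arbitrary, and the index name and column choice are only one plausible option.

```sql
-- Inspect how MySQL would execute the station/time filter.
EXPLAIN
SELECT lc.id, lc.public_departure
FROM locations_today lc
WHERE lc.tiploc_code = 'EXMPL'
  AND lc.public_departure >= 1000
  AND lc.public_departure <= 1600;

-- If the plan reports a full scan (type = ALL), a composite index on the
-- filtered columns is one candidate; re-run EXPLAIN afterwards to compare.
CREATE INDEX idx_tiploc_departure
    ON locations_today (tiploc_code, public_departure);
```

Comparing the EXPLAIN output before and after creating the index shows whether MySQL actually uses it for this query.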

3 answers

Answer 1
0 2013-02-19 01:36:33

Answer 2
0 Accepted 2013-02-19 03:13:34

Answer 3
0 2013-02-19 03:26:58
