How to speed up PHP to MySQL query with large query data

Background: the nested for loop below has to run at least 143,792,640,000 times, and the resulting table should contain at least 563,760 rows with no duplicates. I want to know how to speed this up, or whether some form of parallel computing such as Hadoop could accelerate the work between PHP and MySQL.

Code below:

MySQL connection:

$link=mysql_connect($servername,$username,$password);
mysql_select_db($dbname);
$sql= "INSERT INTO EM (source,target) VALUES ";

The for loop reads the data into MySQL. The check function tests whether the pair already exists: if it does, no new row is inserted and the existing row is updated with count = count + 1.

for ($i = 0; $i < $combine_arr_size; $i++) {
    for ($j = 0; $j < $combine_arr_size; $j++) {
        // Check for a duplicate; the pairs a,b and b,a are treated as the same thing.
        if (check($combine_words_array[$i], $combine_words_array[$j])) {
            $update_query = "UPDATE EM SET count = count+1 where (source='$combine_words_array[$i]' AND target='$combine_words_array[$j]') OR (source='$combine_words_array[$j]' AND target='$combine_words_array[$i]');";
            mysql_query($update_query);
        } else {
            if (!$link) {
                die("Connection failed: " . mysql_error());
            }
            // Otherwise build an INSERT INTO ... VALUES statement by concatenating the string.
            $sql .= "('$combine_words_array[$i]','$combine_words_array[$j]'),";
            mysql_query(substr($sql, 0, -1));
            $sql = "INSERT INTO EM (source,target) VALUES ";
        }
    }
}

The loop pairs every element of $combine_words_array with every element of $combine_words_array.

Below is the check function, which returns 1 if the pair is already in the table and 0 otherwise:

function check($src, $trg) {
    $query = mysql_query("SELECT * FROM EM WHERE (source='$src' AND target='$trg') OR (source='$trg' AND target='$src');");
    if (mysql_num_rows($query) > 0) {
        return 1;
    } else {
        return 0;
    }
}
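
One way to make the a,b / b,a equivalence cheaper to test is to store every pair in a fixed order, so that a single equality lookup can replace the OR condition. The following is only a minimal sketch and not part of the original code; canonical_pair is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return the pair in a fixed (sorted) order, so that
// ('b', 'a') and ('a', 'b') both map to the same (source, target) values.
function canonical_pair($src, $trg)
{
    // strcmp() compares bytes. The collation of the MySQL columns may compare
    // differently (for example case-insensitively); that is an assumption to verify.
    return strcmp($src, $trg) <= 0 ? array($src, $trg) : array($trg, $src);
}

list($source, $target) = canonical_pair('b', 'a');
echo $source, ',', $target, "\n"; // prints "a,b"
```

With pairs stored this way, the lookup only needs source = ? AND target = ?, which an index on (source, target) can serve directly.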

Table:

+--------+--------------+------+-----+---------+-------+
| Field  | Type         | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+-------+
| source | varchar(255) | YES  |     | NULL    |       |
| target | varchar(255) | YES  |     | NULL    |       |
| count  | int(11)      | NO   |     | 0       |       |
| prob   | double       | NO   |     | 0       |       |
+--------+--------------+------+-----+---------+-------+

At the moment the PHP code only touches the source, target and count columns.

Put a better processor on your server and increase the RAM, then go to your php.ini settings and raise the maximum allocated memory in the various memory- and processor-related settings.

This gives the server more headroom and improves running efficiency.

If you cannot find your php.ini file, create a new PHP file with the following contents and open it in the browser:

<?php phpinfo(); ?>

Make sure you delete this file once you have found out where php.ini is: an unwanted user (hacker) could find it, and it would give them detailed information about your server configuration that could expose vulnerabilities.

Once you have found php.ini, look up online any settings that are not obvious, and increase the memory allocations in the relevant areas.
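
As an alternative to a phpinfo() page, the location of php.ini and a couple of current limits can be printed directly. This is a small sketch using built-in functions only; which settings actually matter for this workload is an assumption.

```php
<?php
// Path of the loaded php.ini, or false when no ini file was loaded.
$iniPath = php_ini_loaded_file();
echo 'php.ini: ', ($iniPath === false ? '(none loaded)' : $iniPath), "\n";

// Current values of two settings that commonly affect long-running scripts.
echo 'memory_limit: ', ini_get('memory_limit'), "\n";
echo 'max_execution_time: ', ini_get('max_execution_time'), "\n";
```

Note that the command-line interpreter and the web server can load different php.ini files, so the script should be run in the same environment as the import.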

It is difficult to know exactly what you want to do with duplicate combinations. For example, you are generating every combination of the array, which will produce lots of duplicates that you will then count twice.

However, I would be tempted to load the words into a table (possibly a temporary table), cross join that table against itself to get every combination, and use the result in an INSERT with an ON DUPLICATE KEY UPDATE clause.

Very crudely, something like this:

<?php

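$link = mysql_connect($servername, $username, $password);
mysql_select_db($dbname);

// Create the temporary table of words before inserting into it.
$sql = "CREATE TEMPORARY TABLE words
        (
            word varchar(255),
            PRIMARY KEY (`word`)
        )";
mysql_query($sql);

// IGNORE skips words that are already in the table instead of failing the batch.
$sql = "INSERT IGNORE INTO words (word) VALUES ";
$sql_parm = array();

foreach ($combine_words_array as $combine_word)
{
    $sql_parm[] = "('".mysql_real_escape_string($combine_word)."')";
    // Flush the batch once it grows past 500 rows.
    if (count($sql_parm) > 500)
    {
        mysql_query($sql.implode(',', $sql_parm));
        $sql_parm = array();
    }
}

// Insert whatever is left in the final, partial batch.
if (count($sql_parm) > 0)
{
    mysql_query($sql.implode(',', $sql_parm));
    $sql_parm = array();
}

// Cross join the words against themselves to generate every combination.
$sql = "INSERT INTO EM(source, target)
        SELECT w1.word, w2.word
        FROM words w1
        CROSS JOIN words w2
        ON DUPLICATE KEY UPDATE `count` = `count` + 1
        ";

mysql_query($sql);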
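
The mysql_* functions used here were removed in PHP 7.0, so on a current PHP version the same batching would have to go through mysqli or PDO. Below is a rough sketch of the batching step with placeholders instead of escaped strings; build_word_batches is a hypothetical helper, and the mysqli usage in the trailing comment assumes an open connection in $db.

```php
<?php
// Hypothetical helper: de-duplicate the words, split them into batches and
// build one parameterised INSERT per batch.
function build_word_batches(array $words, $batchSize = 500)
{
    $batches = array();
    // array_unique() is case-sensitive; a case-insensitive column collation
    // could still see collisions (an assumption to verify).
    $unique = array_values(array_unique($words));
    foreach (array_chunk($unique, $batchSize) as $chunk) {
        $placeholders = implode(',', array_fill(0, count($chunk), '(?)'));
        $batches[] = array(
            'sql'    => "INSERT INTO words (word) VALUES $placeholders",
            'params' => $chunk,
        );
    }
    return $batches;
}

// Possible usage with mysqli (not executed here):
// foreach (build_word_batches($combine_words_array) as $batch) {
//     $stmt = $db->prepare($batch['sql']);
//     $stmt->bind_param(str_repeat('s', count($batch['params'])), ...$batch['params']);
//     $stmt->execute();
// }
```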

This relies on there being a unique key covering both the source and target columns.
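
The table shown in the question has no keys at all, so such a key would have to be added first. As a sketch only: the index name uq_source_target is made up for illustration, and the statement assumes EM holds no rows that already collide.

```php
<?php
// Assumption: EM currently contains no duplicate (source, target) rows.
// uq_source_target is a hypothetical index name.
$sql = "ALTER TABLE EM
        ADD UNIQUE KEY uq_source_target (source, target)";
echo $sql, "\n";
// Run once against the database, for example with mysql_query($sql).
```

Depending on the character set and storage engine settings, an index over two varchar(255) columns may exceed the maximum index key length, in which case shorter columns or prefix lengths would be needed.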

But whether this is an option depends on the details of the records. For example, with your current code, if there were 2 words (say A and B) you would find the combination A / B and the combination B / A, and both combinations would update the same record, whereas a unique key on (source, target) would treat them as two different rows.
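
If the count is meant to be the number of (i, j) index pairs that produce a given unordered pair, it may not be necessary to visit every index pair at all: the figure can be derived from how often each word occurs. The loop count in the question, 143,792,640,000, equals 379,200 × 379,200, which suggests an array of 379,200 entries with far fewer distinct words. The sketch below rests on that reading of the question; pair_counts is a hypothetical helper.

```php
<?php
// Hypothetical helper: derive pair counts from word frequencies instead of
// looping over every (i, j) index pair of the input array.
function pair_counts(array $words)
{
    $freq = array_count_values($words); // word => number of occurrences
    $distinct = array_keys($freq);      // numeric-looking words become int keys
    sort($distinct, SORT_STRING);
    $n = count($distinct);
    $rows = array();
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i; $j < $n; $j++) {
            $a = $distinct[$i];
            $b = $distinct[$j];
            // Number of ordered (i, j) index pairs that hit the unordered pair {a, b}.
            $hits = ($i === $j)
                ? $freq[$a] * $freq[$a]
                : 2 * $freq[$a] * $freq[$b];
            $rows[] = array($a, $b, $hits);
        }
    }
    return $rows;
}

print_r(pair_counts(array('a', 'b', 'a')));
// Rows produced: (a, a, 4), (a, b, 4), (b, b, 1)
```

In the question's code the first occurrence of a pair is inserted with the default count of 0, so the stored count would be one less than the number of hits computed here. The resulting rows can then be written with batched multi-row INSERT statements.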
