
MySQL: selecting millions of records to generate URLs

I am currently fetching about 2 million records from different tables to generate the URLs for a sitemap. The script eats too many resources and uses 100% of the server's capacity.

Query:

    SELECT CONCAT("/url/profile/id/", u.id, "/", nickname) AS url FROM users AS u
    UNION ALL
    SELECT CONCAT("url/city/", c.id, "/paramId/", p.id, "/", REPLACE(p.title, " ", "+"), "/", r.region_Name, "/", c.city_Name) AS url
    FROM city c
    JOIN region r ON r.id = c.id_region
    JOIN country country ON country.id = c.id_country
    CROSS JOIN param p
    WHERE country.used = 1
    AND p.active = 1

I store the result in an array, $url_list, and then process it to create the sitemap, but this takes a long time and too many resources.

I tried fetching the data in batches using LIMIT 0,50000, but getting the total row count ($maxrow) for paging takes time. The code also doesn't look good, because I have to run two queries over a large data set.

$url_list = array();

// The UNION ALL query from above, kept in one string so that the count and
// every batch run the same statement. $this->db is the same wrapper used in
// the snippet further down.
$sql = 'SELECT CONCAT("/url/profile/id/", u.id, "/", nickname) AS url FROM users AS u
        UNION ALL
        SELECT CONCAT("url/city/", c.id, "/paramId/", p.id, "/", REPLACE(p.title, " ", "+"), "/", r.region_Name, "/", c.city_Name) AS url
        FROM city c
        JOIN region r ON r.id = c.id_region
        JOIN country country ON country.id = c.id_country
        CROSS JOIN param p
        WHERE country.used = 1
        AND p.active = 1';

// Count all rows to know how many batches are needed (this is the slow part).
$this->db->query('SELECT COUNT(*) AS max FROM (' . $sql . ') AS tmp');
$count = $this->db->recordsArray(MYSQL_ASSOC);
$maxrow = (int) $count[0]['max'];

$limit = 50000;
$bybatch = ceil($maxrow / $limit);
$start = 0;
for ($i = 0; $i < $bybatch; $i++) {
    // Run the query for this batch and store the rows in $result.
    $this->db->query($sql . ' LIMIT ' . $start . ',' . $limit);
    $result = $this->db->recordsArray(MYSQL_ASSOC);

    $start += $limit;
    // Append this batch to $url_list.
    $url_list = array_merge($url_list, $result);
}

When that finishes, I use this to create the sitemap:

$linkCount = 1;
$fileNomb = 1;
foreach ($url_list as $ul) {
    if ($linkCount == 1) {
        // Start a fresh document for each sitemap file.
        $doc = new DOMDocument('1.0', 'utf-8');
        $doc->formatOutput = true;
        $root = $doc->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
        $doc->appendChild($root);
    }

    $url = $doc->createElement("url");
    $loc = $doc->createElement("loc", $ul['url']);
    $url->appendChild($loc);
    $priority = $doc->createElement("priority", 1);
    $url->appendChild($priority);

    $root->appendChild($url);

    $linkCount += 1;

    if ($linkCount == 49999) {
        // The file is full: write it out and start the next one.
        $f = fopen($this->siteMapMulti . $fileNomb . '.xml', "w");
        fwrite($f, $doc->saveXML());
        fclose($f);

        $linkCount = 1;
        $fileNomb += 1;
    }
}

// Write the last, partially filled file, which the loop above never reaches.
if ($linkCount > 1) {
    $f = fopen($this->siteMapMulti . $fileNomb . '.xml', "w");
    fwrite($f, $doc->saveXML());
    fclose($f);
}

Is there a better way to do this, or a way to speed it up?

Added:

Why is the following faster than the SQL query, even though it still consumes 100% of the server's resources?

$this->db->query('SELECT c.id, c.city_name, r.region_name, cr.country_name FROM city AS c, region AS r, country AS cr WHERE r.id = c.id_region AND cr.id = c.id_country AND cr.id IN (SELECT id FROM country WHERE used = 1)');

$arrayCity = $this->db->recordsArray(MYSQL_ASSOC);

$this->db->query('SELECT id, title FROM param WHERE active = 1');

$arrayParam = $this->db->recordsArray(MYSQL_ASSOC);

// Build the city x param combinations in PHP instead of with a CROSS JOIN.
foreach ($arrayCity as $city) {
    foreach ($arrayParam as $param) {
        $paramTitle = str_replace(' ', '+', $param['title']);
        $url = 'url/city/' . $city['id'] . '/paramId/' . $param['id'] . '/' . $paramTitle . '/' . $city['region_name'] . '/' . $city['city_name'];
        $this->addChild($url);
    }
}

I suggest you drop the UNION and issue two separate queries instead. That alone will speed up the query. Also, as you mentioned above, it's a good idea to fetch the data in batches.

And finally, don't collect all the data in memory. Write it to the file immediately, inside your loop.

Just open the file at the beginning, write each URL entry in the loop, and close the file at the end.

- Open the file for writing.

- Run a count query on the users table.

- Do several selects with LIMIT in a loop (as you already do).

- Right there in the loop, while ($row = mysql_fetch_array()), write each row to the file.

Then repeat the same algorithm for the other table. It would be useful to implement a function for writing data to the file, so that you can call that function and adhere to the DRY principle.
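
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: batchedRows(), writeSitemaps() and closeSitemap() are hypothetical helper names, the $fetchBatch callback stands in for running one "SELECT ... LIMIT offset, limit" query with whatever database layer you use, and the sketch swaps DOMDocument for PHP's XMLWriter so that entries go to disk as they are produced. It also stops when a batch comes back short rather than running a count query first, which is a deliberate variation on the list above.

```php
<?php
// Sketch only: these helpers are hypothetical, not part of the original code.

// Yields rows one batch at a time. $fetchBatch($offset, $limit) stands in for
// one "SELECT ... LIMIT $offset, $limit" query and must return an array of rows.
function batchedRows(callable $fetchBatch, int $limit): Generator
{
    $offset = 0;
    do {
        $rows = $fetchBatch($offset, $limit);
        foreach ($rows as $row) {
            yield $row;
        }
        $offset += $limit;
        // A short (or empty) batch means there is nothing left to fetch.
    } while (count($rows) === $limit);
}

// Closes the <urlset> element and pushes the remaining XML to disk.
function closeSitemap(XMLWriter $writer): void
{
    $writer->endElement();
    $writer->endDocument();
    $writer->flush();
}

// Streams rows into numbered files ($prefix . "1.xml", ...), at most
// $maxPerFile URLs per file. Returns the number of files written.
function writeSitemaps(iterable $rows, string $prefix, int $maxPerFile = 50000): int
{
    $writer = null;
    $fileCount = 0;
    $inFile = 0;

    foreach ($rows as $row) {
        if ($writer === null) {
            // Open the next file only when there is a URL to put in it.
            $fileCount++;
            $writer = new XMLWriter();
            $writer->openUri($prefix . $fileCount . '.xml');
            $writer->setIndent(true);
            $writer->startDocument('1.0', 'UTF-8');
            $writer->startElement('urlset');
            $writer->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
        }

        $writer->startElement('url');
        $writer->writeElement('loc', $row['url']); // special characters are escaped
        $writer->writeElement('priority', '1');
        $writer->endElement(); // </url>
        $inFile++;

        if ($inFile % 1000 === 0) {
            $writer->flush(); // keep the in-memory buffer small
        }
        if ($inFile === $maxPerFile) {
            closeSitemap($writer);
            $writer = null;
            $inFile = 0;
        }
    }

    if ($writer !== null) {
        closeSitemap($writer); // last, partially filled file
    }
    return $fileCount;
}

// Usage with in-memory stand-ins for the two separate queries.
$users = [
    ['url' => '/url/profile/id/1/ann'],
    ['url' => '/url/profile/id/2/bob'],
    ['url' => '/url/profile/id/3/cy'],
];
$cities = [
    ['url' => 'url/city/1/paramId/1/a+b/North/Oslo'],
    ['url' => 'url/city/2/paramId/1/a+b/West/Bergen'],
];
$fromArray = function (array $table) {
    return function (int $offset, int $limit) use ($table) {
        return array_slice($table, $offset, $limit);
    };
};
$allRows = (function () use ($users, $cities, $fromArray) {
    yield from batchedRows($fromArray($users), 2);   // first table
    yield from batchedRows($fromArray($cities), 2);  // then the other one
})();

$prefix = sys_get_temp_dir() . '/sitemap_demo_';
$files = writeSitemaps($allRows, $prefix, 2);
```

In real use, each callback would run its own LIMIT query and return the fetched rows, and the batch size and per-file limit would both be much larger than 2. Because the rows come from a generator, only one batch is held in memory at a time, and the same writing function serves both tables.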


 