How to speed up MySQL database / query?

I have two tables in my MySQL database, users and tweets, as follows:

CREATE TABLE users (
  uid int(7) NOT NULL AUTO_INCREMENT,
  twitter_uid int(10) NOT NULL,
  screen_name varchar(255) NOT NULL,
  `name` varchar(255) NOT NULL,
  tweets int(6) NOT NULL,
  followers_count int(7) NOT NULL,
  statuses_count int(7) NOT NULL,
  created_at int(10) NOT NULL,
  PRIMARY KEY (uid)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

CREATE TABLE tweets (
  tweet_id int(11) NOT NULL AUTO_INCREMENT,
  `query` varchar(5) NOT NULL,
  id_str varchar(18) NOT NULL,
  created_at int(10) NOT NULL,
  from_user_id int(11) NOT NULL,
  from_user varchar(256) NOT NULL,
  `text` text NOT NULL,
  PRIMARY KEY (tweet_id),
  KEY id_str (id_str)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

The tweets table contains over 2 million records. I have put the unique users (taken from tweets.from_user) in the users table. It now contains 94,100 users. I now want to count the number of tweets each user made, as follows (in PHP):

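// Fetch a batch of users whose tweet count has not been filled in yet.
$result = db_query('SELECT uid, screen_name FROM users WHERE tweets = 0 LIMIT 150');
while ($user = db_fetch_object($result)) {
  // Count this user's tweets (one query per user).
  $result2 = db_query(
    "SELECT COUNT(tweet_id) FROM tweets WHERE from_user = '%s'",
    $user->screen_name
  );
  $cnt = db_result($result2);
  // Store the count back on the user row.
  db_query("UPDATE users SET tweets = %d WHERE uid = %d", $cnt, $user->uid);
}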

This code, however, is EXTREMELY slow. It takes about 5 minutes to count the tweets of 150 users. At this rate, it will take about 3 days to complete the task for all users.

My question is: I MUST be missing something here. Is there a more efficient query, or should I change something in the database structure? Any help would be greatly appreciated :)

I think the worst problem here is running multiple queries. That is most likely worse than any issue with indexes. You should try to do it in a single query.

UPDATE users
SET users.tweets = (SELECT COUNT(tweet_id)
                    FROM tweets
                    WHERE tweets.from_user = users.screen_name
                   )
WHERE users.tweets = 0

Have you indexed all the relevant columns? from_user especially should have an index!
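
For reference, adding that index could look roughly like the following. The index name idx_from_user and the placeholder screen name are made up for this example; the table and column names come from the question's schema.

```sql
-- Hypothetical index name; table and column are from the question's schema.
ALTER TABLE tweets ADD INDEX idx_from_user (from_user);

-- Check how the per-user count is executed ('some_user' is a placeholder).
EXPLAIN SELECT COUNT(tweet_id) FROM tweets WHERE from_user = 'some_user';
```

If the index is being used, the key column of the EXPLAIN output should show idx_from_user rather than NULL.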

I'd start by condensing all of that into a single UPDATE statement:

UPDATE users
   SET tweets =
        ( SELECT COUNT(1)
            FROM tweets
           WHERE tweets.from_user = users.screen_name
        )
 WHERE users.tweets = 0
 LIMIT 150
;

and then I'd look at indices. In particular, make sure there's an index on tweets.from_user. (See http://dev.mysql.com/doc/refman/5.0/en/create-index.html for how to create an index on a table column.)
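
A possible variant of the single-statement idea, not part of the answer as written: aggregate all the counts in one pass with GROUP BY and join the result back to users, so the count is not computed once per user. This is only a sketch using MySQL's multi-table UPDATE syntax and has not been run against the asker's data; the aliases u, t and cnt are arbitrary.

```sql
-- Compute every user's tweet count in a single pass over tweets,
-- then join the counts to users by screen name.
UPDATE users u
  JOIN ( SELECT from_user, COUNT(*) AS cnt
           FROM tweets
          GROUP BY from_user
       ) t
    ON t.from_user = u.screen_name
   SET u.tweets = t.cnt;
```

Users with no rows in tweets are not matched by the join and keep their current value. Since multi-table UPDATE does not accept LIMIT, this form updates all matching users at once rather than in batches of 150.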

While you could significantly speed up the updating of users.tweets by "condensing" these SQL statements into one (as suggested by other answers), what will you do when a user makes a new tweet? How will you know that users.tweets needs to be updated again?

  • One way would be to make a trigger that updates users.tweets whenever a row is deleted from or inserted into the tweets table, or when tweets.from_user is modified.
  • You could also remove users.tweets altogether and just count the tweets dynamically on an as-needed basis.
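
As a rough illustration of the trigger option, the insert and delete cases might be sketched as below. The trigger names are made up, and the case where tweets.from_user is modified is left out because it would need a multi-statement body (BEGIN ... END with a changed DELIMITER).

```sql
-- Hypothetical trigger names; tables and columns are from the question's schema.
-- Keep users.tweets in step when a tweet is inserted.
CREATE TRIGGER tweets_after_insert AFTER INSERT ON tweets
FOR EACH ROW
  UPDATE users SET tweets = tweets + 1 WHERE screen_name = NEW.from_user;

-- Keep users.tweets in step when a tweet is deleted.
CREATE TRIGGER tweets_after_delete AFTER DELETE ON tweets
FOR EACH ROW
  UPDATE users SET tweets = tweets - 1 WHERE screen_name = OLD.from_user;
```

Each trigger looks the user up by screen_name, so an index on users.screen_name would probably be worth adding as well if tweets are inserted frequently.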

In any case, to speed up the SELECT COUNT(tweet_id) FROM tweets WHERE from_user = '%s' query, you'll need to create an index on {from_user}. Since tweet_id is NOT NULL, COUNT(tweet_id) is equivalent to COUNT(*); otherwise a composite index on {from_user, tweet_id} would be needed.
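
With an index on from_user in place, counting on demand (the second option in the list above) might look something like this. The placeholder screen name and the alias tweet_count are assumptions made for the example.

```sql
-- Count for a single user ('some_user' is a placeholder screen name).
SELECT COUNT(*) FROM tweets WHERE from_user = 'some_user';

-- Counts for all users at once; the LEFT JOIN keeps users with no tweets,
-- for whom COUNT(t.tweet_id) yields 0.
SELECT u.uid, u.screen_name, COUNT(t.tweet_id) AS tweet_count
  FROM users u
  LEFT JOIN tweets t ON t.from_user = u.screen_name
 GROUP BY u.uid, u.screen_name;
```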

The first step is to add indexes to the columns that are frequently used as search conditions.
