Storing MySQL values as integers

Question

I have two database tables that I am using to create a Twitter-style following system.

sh_subscriptions
    => id
    => user_id
    => feed_id

sh_feeds
    => id
    => item
    => shop_name
    => feed_id

The problem with storing feed_id rather than shop_name in sh_subscriptions is that it requires a lot of table joining:

$id = $_POST['id'];
$user_id = $id['id'];
$shop_name = mysqli_escape_string($con, $_POST['shop_name']);

$query = "SELECT * FROM sh_subscriptions s INNER JOIN sh_feeds f ON s.feed_id = f.feed_id WHERE s.user_id = $user_id AND f.shop_name = '$shop_name'";
$result = mysqli_query($con, $query) or die(mysqli_error($con));

if (mysqli_num_rows($result) > 0)
{
    $query2 = "DELETE FROM sh_subscriptions s INNER JOIN sh_feeds f ON s.feed_id = f.feed_id WHERE s.user_id = $user_id AND f.shop_name = '$shop_name'";
    $result2 = mysqli_query($con, $query2) or die(mysqli_error($con));
}

else
{
    // insert the row instead
}

(I know there's an error somewhere in the if statement, but I'll worry about that later.)

If I were to replace feed_id with shop_name, I would be able to replace line 5 with this:

$query = "SELECT * FROM sh_subscriptions WHERE user_id = $user_id AND shop_name = '$shop_name'";

My question is: is it always preferable to store MySQL values as integers where possible, or in a situation like this, would it be faster to have sh_subscriptions contain shop_name rather than feed_id?

Answer 1

Your sh_subscriptions table is actually a many-to-many join table that relates users to feeds. This is considered a fine way to design database schemas.

Your basic concept is this: you have a collection of users and a collection of feeds. Each user can subscribe to zero or more feeds, and each feed can have zero or more subscribers.

To enter a subscription you create a row in the sh_subscriptions table. To cancel it you delete the row.
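
As a rough sketch of those two operations against the tables from the question: the user id 42 and the shop name are placeholder values, and the INSERT assumes sh_subscriptions.id is an AUTO_INCREMENT column.

```sql
-- Subscribe: look up the feed by shop name and add one row.
-- DISTINCT guards against sh_feeds holding several rows per feed_id.
INSERT INTO sh_subscriptions (user_id, feed_id)
SELECT DISTINCT 42, f.feed_id
  FROM sh_feeds f
 WHERE f.shop_name = 'Joe the Plumber';

-- Unsubscribe: in MySQL's multi-table DELETE, the alias of the table
-- to delete from goes between DELETE and FROM.
DELETE s
  FROM sh_subscriptions s
  JOIN sh_feeds f ON s.feed_id = f.feed_id
 WHERE s.user_id = 42
   AND f.shop_name = 'Joe the Plumber';
```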

You say there's "a lot of table joining." With respect, this is not a lot of table joining. MySQL is made for this kind of joining, and it will work well.

I have some suggestions about your sh_subscriptions table.

Get rid of the id column. Instead, make the user_id and feed_id columns into a composite primary key. That way you will automatically prevent duplicate subscriptions.
Add an active column (a short integer) to the table. When it is set to 1, the subscription is active, so you can cancel a subscription by setting active to 0.
You might also add a subscribed_date column if you care about that.
Create two compound non-unique indexes, (active, user_id, feed_id) and (active, feed_id, user_id), on the table. These will greatly accelerate queries that join tables like this.
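
Put together, the suggestions above might look something like the following. This is only a sketch: the column types, the index names, and the ids 42 and 7 are assumptions, not taken from the question.

```sql
-- Sketch only: column types and index names are assumptions.
CREATE TABLE sh_subscriptions (
    user_id         INT UNSIGNED NOT NULL,
    feed_id         INT UNSIGNED NOT NULL,
    active          TINYINT      NOT NULL DEFAULT 1,  -- 1 = subscribed, 0 = cancelled
    subscribed_date TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, feed_id),                   -- blocks duplicate subscriptions
    KEY idx_active_user_feed (active, user_id, feed_id),
    KEY idx_active_feed_user (active, feed_id, user_id)
);

-- Subscribe, or reactivate a previously cancelled subscription.
INSERT INTO sh_subscriptions (user_id, feed_id)
VALUES (42, 7)
ON DUPLICATE KEY UPDATE active = 1;

-- Cancel without deleting the row.
UPDATE sh_subscriptions
   SET active = 0
 WHERE user_id = 42
   AND feed_id = 7;
```

Because the primary key already rejects a second (user_id, feed_id) row, the INSERT falls through to the UPDATE branch when the subscription exists, so no separate "does it exist?" SELECT is needed.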

Query fragment:

   FROM sh_feeds f
   JOIN sh_subscriptions s ON (f.feed_id = s.feed_id AND s.active = 1)
   JOIN sh_users u ON (s.user_id = u.user_id)
  WHERE f.shop_name = 'Joe the Plumber'
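
Filled out into complete statements, that fragment might read as follows. The sh_users table and the selected columns are assumptions carried over from the fragment, the active column assumes the suggestion above, and 42 is a placeholder user id.

```sql
-- Who subscribes to a given shop? (feed -> users)
-- DISTINCT guards against sh_feeds holding several rows per feed_id.
SELECT DISTINCT u.user_id
  FROM sh_feeds f
  JOIN sh_subscriptions s ON (f.feed_id = s.feed_id AND s.active = 1)
  JOIN sh_users u ON (s.user_id = u.user_id)
 WHERE f.shop_name = 'Joe the Plumber';

-- Which shops does a given user follow? (user -> feeds)
SELECT DISTINCT f.shop_name
  FROM sh_subscriptions s
  JOIN sh_feeds f ON (f.feed_id = s.feed_id)
 WHERE s.user_id = 42
   AND s.active = 1;
```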

If you get to the point where you have hundreds of millions of users or feeds, you may need to consider denormalizing this table, for example by moving the shop name text into the sh_subscriptions table. But not now.

Edit: I am proposing multiple compound covering indexes. If you're joining feeds to users, for example, MySQL starts satisfying your query by determining the row in sh_feeds that matches your selection.

It then determines the feed_id, and random-accesses your compound index on feed_id. Then, it needs to look up all the user_id values for that feed_id. It can do that by scanning the index from the point where it random-accessed it, without referring back to the table. This is very fast indeed. It's called a covering index.

The other covering index deals with queries that start with a known user and proceed to look up the feeds. The order of columns in indexes matters: random access can only start with the first (leftmost) column of the index.

The trick to understand is that these indexes are both randomly accessible and sequentially scannable.
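
One way to check whether MySQL really is answering a query from the index alone is EXPLAIN. A sketch, assuming the active column and the (active, feed_id, user_id) index suggested above, with 7 as a placeholder feed_id:

```sql
-- Every column referenced here (active, feed_id, user_id) is in the index,
-- so the table rows themselves should not need to be read.
EXPLAIN
SELECT s.user_id
  FROM sh_subscriptions s
 WHERE s.active = 1
   AND s.feed_id = 7;
```

If the optimizer picks the covering index, the Extra column of the EXPLAIN output should include "Using index". Which index it actually chooses depends on your data and MySQL version.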

One other note: if you only have two columns in the join table, one of your covering indexes is also your primary key, and the other contains the columns in the reverse order from the primary key. You don't need any duplicate indexes.
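
For that two-column case, a minimal sketch of the table could be as small as this (column types and the index name are assumptions):

```sql
-- Sketch only: column types and the index name are assumptions.
CREATE TABLE sh_subscriptions (
    user_id INT UNSIGNED NOT NULL,
    feed_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, feed_id),       -- serves user -> feeds lookups
    KEY idx_feed_user (feed_id, user_id)  -- serves feed -> users lookups
);
```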

1 answer

solution1
2 ACCEPTED 2014-05-05 13:03:21