
SQL - performance in varchar vs. int

I have a table whose primary key is a varchar, and another table whose foreign key is a varchar as well.

I join the two tables on this pair of varchar columns. Although I am dealing with only a small number of rows (say, a hundred), the query takes 60 ms. When the system is finally deployed, it will have thousands of rows.

I read Performance of string comparison vs int join in SQL, and concluded that the performance of a SQL query depends on the database and on the number of rows it deals with.

But when I am dealing with a very large amount of data, would it matter much?

Should I create a new column with a numeric data type in both tables and join on that column to reduce the time taken by the SQL query?

You should use the correct data type for the data that you are representing. Any dubious theoretical performance gains are secondary to the overhead of having to deal with data conversions.

It's really impossible to say what that is based on the question, but most cases are rather obvious. The cases that are not obvious are those where a data element is represented by a set of digits but is not treated as a number, for example a phone number.

Clues that you are dealing with this situation are:

  • leading zeroes must be preserved
  • no arithmetic operations are carried out on the element
  • string operations are carried out, e.g. "take the last four characters"

If that's the case then you probably want to store your "number" as a varchar.
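The clues above can be illustrated with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in database; the table names contact_int and contact_txt and the sample phone number are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: the same phone number stored as a number and as text.
conn.execute("CREATE TABLE contact_int (phone INTEGER)")
conn.execute("CREATE TABLE contact_txt (phone VARCHAR(20))")

phone = "0049301234567"  # made-up number with leading zeroes

# An integer column can only hold the numeric value, so the zeroes are gone.
conn.execute("INSERT INTO contact_int VALUES (?)", (int(phone),))
# A varchar column keeps the digits exactly as written.
conn.execute("INSERT INTO contact_txt VALUES (?)", (phone,))

as_int = conn.execute("SELECT phone FROM contact_int").fetchone()[0]
as_txt = conn.execute("SELECT phone FROM contact_txt").fetchone()[0]
# "Take the last four characters" is a plain string operation on the varchar.
last_four = conn.execute("SELECT substr(phone, -4) FROM contact_txt").fetchone()[0]

print(as_int)
print(as_txt)
print(last_four)
```

The integer version cannot be turned back into the original string without knowing how many zeroes were dropped, which is the conversion overhead the answer warns about.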

Yes, you should give that a shot. But before you do, make a test version of your db that you populate with the level of data you expect to have in production, and run some tests on not just SELECT, but also INSERT, UPDATE, and DELETE as well. Then make a version with integer keys, and perform equivalent tests.

The numeric keys WILL be faster, for the simple reason that the keys are of smaller size, but the difference may not be noticeable. Don't blindly trust books when you can test and measure the difference yourself.
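A rough sketch of such a test is shown here, assuming an in-memory SQLite database driven from Python. The schema, the CUSTOMER-… key format, and the row counts are invented for illustration; a real test should use your own database engine, your real key values, and production-sized data. It times only the SELECT join; INSERT, UPDATE, and DELETE can be timed the same way.

```python
import sqlite3
import time


def build(conn, key_type, make_key, n_parents=1000, children_per_parent=5):
    """Create parent/child tables keyed by key_type and fill them with synthetic rows."""
    conn.execute(f"CREATE TABLE parent (id {key_type} PRIMARY KEY, name TEXT)")
    conn.execute(
        f"CREATE TABLE child (id INTEGER PRIMARY KEY, "
        f"parent_id {key_type} REFERENCES parent(id))"
    )
    conn.execute("CREATE INDEX idx_child_parent ON child(parent_id)")
    conn.executemany(
        "INSERT INTO parent VALUES (?, ?)",
        [(make_key(i), f"name{i}") for i in range(n_parents)],
    )
    conn.executemany(
        "INSERT INTO child (parent_id) VALUES (?)",
        [(make_key(i),) for i in range(n_parents) for _ in range(children_per_parent)],
    )
    conn.commit()


def time_join(conn, repeats=20):
    """Return the joined row count and the average time of one join in seconds."""
    query = "SELECT COUNT(*) FROM child c JOIN parent p ON p.id = c.parent_id"
    start = time.perf_counter()
    for _ in range(repeats):
        count = conn.execute(query).fetchone()[0]
    return count, (time.perf_counter() - start) / repeats


results = {}
for label, key_type, make_key in [
    ("int", "INTEGER", lambda i: i),
    ("varchar", "VARCHAR(32)", lambda i: f"CUSTOMER-{i:010d}"),  # made-up key format
]:
    conn = sqlite3.connect(":memory:")
    build(conn, key_type, make_key)
    results[label] = time_join(conn)
    conn.close()

for label, (count, seconds) in results.items():
    print(f"{label:8s} rows={count} avg_join={seconds * 1000:.3f} ms")
```

The absolute numbers mean little on their own; what matters is how the two variants compare on your hardware and engine at the data volume you expect.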

(One thing to remember: if there are occasions when all you need from a relation is the value you currently have as its key, your database may run significantly faster if you can skip entire table lookups by just referencing the foreign-key on the records you have.)
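To make the point in the parenthesis concrete, here is a small sketch, again assuming SQLite via Python. All table and column names (country_natural, order_surrogate, and so on) are hypothetical. With a varchar natural key, the value you need already sits on the referencing row; with an integer surrogate key, the same value requires a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Variant 1: the country code itself is the key (varchar natural key).
CREATE TABLE country_natural (code VARCHAR(2) PRIMARY KEY, name TEXT);
CREATE TABLE order_natural (
    id INTEGER PRIMARY KEY,
    country_code VARCHAR(2) REFERENCES country_natural(code)
);

-- Variant 2: an integer surrogate key, with the code as an ordinary column.
CREATE TABLE country_surrogate (id INTEGER PRIMARY KEY, code VARCHAR(2) UNIQUE, name TEXT);
CREATE TABLE order_surrogate (
    id INTEGER PRIMARY KEY,
    country_id INTEGER REFERENCES country_surrogate(id)
);

INSERT INTO country_natural VALUES ('DE', 'Germany'), ('FR', 'France');
INSERT INTO order_natural (country_code) VALUES ('DE'), ('FR'), ('DE');

INSERT INTO country_surrogate VALUES (1, 'DE', 'Germany'), (2, 'FR', 'France');
INSERT INTO order_surrogate (country_id) VALUES (1), (2), (1);
""")

# Natural key: the code is read straight from the orders table, no join needed.
codes_natural = [row[0] for row in conn.execute(
    "SELECT country_code FROM order_natural ORDER BY id")]

# Surrogate key: the same answer needs a lookup in the country table.
codes_surrogate = [row[0] for row in conn.execute(
    "SELECT c.code FROM order_surrogate o "
    "JOIN country_surrogate c ON c.id = o.country_id ORDER BY o.id")]

print(codes_natural)
print(codes_surrogate)
```

Whether the saved join outweighs the larger key depends on how often the query needs only that one value, which is another thing worth measuring.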

Should I create a new column with a numeric data type in both tables and join on that column to reduce the time taken by the SQL query?

If you're in a position where you can change the design of the database with ease then yes, your Primary Key should be an integer. Unless there is a really good reason to have an FK as a varchar, the foreign keys should be integers as well.

If you can't change the PK or FK fields, then make sure they're indexed properly. This will eventually become a bottleneck though.
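One way to check that a varchar foreign key is actually indexed is to look at the query plan before and after creating the index. The sketch below assumes SQLite via Python and uses hypothetical names (parent, child, parent_code, idx_child_parent_code). The plan wording differs between SQLite versions and between database engines, so the code only checks whether the index name appears in the plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (code VARCHAR(20) PRIMARY KEY, name TEXT);
CREATE TABLE child (
    id INTEGER PRIMARY KEY,
    parent_code VARCHAR(20) REFERENCES parent(code),
    note TEXT
);
""")


def plan(sql):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT note FROM child WHERE parent_code = 'A-001'"

plan_before = plan(lookup)  # no index on child.parent_code yet
conn.execute("CREATE INDEX idx_child_parent_code ON child(parent_code)")
plan_after = plan(lookup)   # the planner can now use the new index

print("before:", plan_before)
print("after: ", plan_after)
```

SQLite does not index the referencing column of a foreign key on its own; other engines behave differently, so check the documentation and the query plan for the one you use.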

It just does not sound right to me. A varchar key will use more space, result in more reads, and so on. Also, is the varchar the clustered index key? If so, the table is going to get very fragmented.


 