How to design my database to accommodate this data

I am developing a database for a payroll application, and one of the features I'll need is a table that stores the list of employees that work at each store, each day of the week.

Each employee has an ID, so my table looks like this:

        |   Mon   |   Tue   |   Wed   |   Thu   |   Fri   |   Sat   |   Sun
Store 1 | 3,4,5   | 3,4,5   | 3,4,5   | 4,5,7   | 4,5,7   | 4,5,6,7 | 4,5,6,7
Store 2 | 1,8,9   | 1,8,9   | 1,8,9   | 1,8,9   | 1,8,9   | 1,8,9   | 1,8,9
Store 3 | 10,12   | 10,12   | 10,12   | 10,12   | 10,12   | 10,12   | 10,12
Store 4 | 15      | 15      | 15      | 16      | 16      | 16      | 16
Store 5 | 6,11,13 | 6,11,13 | 6,11,13 | 14,18,19| 14,18,19| 14,18,19| 14,18,19

My question is: how do I represent that in my database? I came up with the following ideas:

Idea 1 : Pretty much replicate the design above, creating a table with the following columns: [Store_id | Mon | Tue ... | Sat | Sun] and then store the list of employee IDs of each day as a string, with IDs separated by commas. I know that comma-separated lists are not good database design, but sometimes they do look tempting, as in this case.

   Store_id |   Mon   |   Tue   |   Wed   |   Thu   |   Fri   |    Sat    |    Sun
   ---------+---------+---------+---------+---------+---------+-----------+-----------
       1    | '3,4,5' | '3,4,5' | '3,4,5' | '4,5,7' | '4,5,7' | '4,5,6,7' | '4,5,6,7'
       2    | '1,8,9' | '1,8,9' | '1,8,9' | '1,8,9' | '1,8,9' | '1,8,9'   | '1,8,9'

Idea 2 : Create a table with the following columns: [Store_id | Day | Employee_id]. That way each employee working at a specific store on a specific day would be one row in this table. The problem I see is that this table would grow quite fast, and it would be harder to visualize the data at the database level.

Store_id | Day | Employee_id
---------+-----+-------------
   1     | mon |     3
   1     | mon |     4
   1     | mon |     5
   1     | tue |     3
   1     | tue |     4

Does either of these ideas sound viable? Is there a better way of storing the data?

The second design is correct for a relational database. One employee_id per row, even if it results in multiple rows per store per day.

The number of rows is not likely to get larger than the RDBMS can handle, if your example is accurate. You have no more than 4 employees per store per day, 5 stores, and up to 366 days per year. So no more than 7320 rows per year, and perhaps fewer.
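As a quick check of that estimate, the arithmetic can be written out. The first figure uses only the numbers quoted above. The second is an extra assumption on my part: that the table holds a repeating weekly template keyed by weekday rather than one row per calendar date.

```python
# Upper bound from the figures above: 4 employees per store per day, 5 stores.
max_employees_per_store_per_day = 4
stores = 5

# One row per employee per store per calendar day, leap year as the worst case.
rows_per_year = max_employees_per_store_per_day * stores * 366
print(rows_per_year)  # 7320

# Assumption: if the table only stores a weekly template (Mon..Sun),
# the bound does not grow with time at all.
rows_per_week_template = max_employees_per_store_per_day * stores * 7
print(rows_per_week_template)  # 140
```

Either way the table stays tiny by RDBMS standards.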

I regularly see databases in MySQL that have hundreds of millions or even billions of rows in a given table. So you can continue to run those stores for many years before running into scalability problems.

If I were you, I would store the employee data and the store data in separate tables, but still keep the design of your main table. So do something like this:

CREATE TABLE stores (
    id INT AUTO_INCREMENT PRIMARY KEY,
    store_name VARCHAR(255)
    -- any other data for your store here
);

CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    employee_name VARCHAR(255)
    -- whatever other employee data you need to store
);

-- created last so that both foreign keys have a table to point at
CREATE TABLE schedule (
    id INT AUTO_INCREMENT PRIMARY KEY,
    store_id INT, -- FK to the stores table id
    day VARCHAR(20),
    emp_id INT, -- FK to the employees table id
    FOREIGN KEY (store_id) REFERENCES stores (id),
    FOREIGN KEY (emp_id) REFERENCES employees (id)
);

Having separate tables for stores and for employees means you can keep data that is specific to each store or each employee.
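To try the three-table layout without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in (its column types differ slightly from MySQL's), the store and employee names are placeholders, and the UNIQUE constraint is my own addition rather than part of the design above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE stores (
    id INTEGER PRIMARY KEY,
    store_name TEXT NOT NULL
);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL
);
CREATE TABLE schedule (
    id INTEGER PRIMARY KEY,
    store_id INTEGER NOT NULL REFERENCES stores (id),
    day TEXT NOT NULL,
    emp_id INTEGER NOT NULL REFERENCES employees (id),
    UNIQUE (store_id, day, emp_id)  -- added here: no duplicate assignments
);
""")

# Placeholder names; only the IDs come from the question.
conn.execute("INSERT INTO stores (id, store_name) VALUES (4, 'Store 4')")
conn.executemany(
    "INSERT INTO employees (id, employee_name) VALUES (?, ?)",
    [(15, "Employee 15"), (16, "Employee 16")],
)

# Store 4 from the question: employee 15 on Mon-Wed, employee 16 on Thu-Sun.
rows = [(4, d, 15) for d in ("mon", "tue", "wed")]
rows += [(4, d, 16) for d in ("thu", "fri", "sat", "sun")]
conn.executemany(
    "INSERT INTO schedule (store_id, day, emp_id) VALUES (?, ?, ?)", rows
)

row_count = conn.execute("SELECT COUNT(*) FROM schedule").fetchone()[0]
print(row_count)  # 7

# The foreign key rejects a schedule row for an employee that does not exist.
try:
    conn.execute(
        "INSERT INTO schedule (store_id, day, emp_id) VALUES (4, 'mon', 99)"
    )
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)  # True
```

The foreign keys are what you gain over a comma-separated string: the database itself refuses a schedule entry that points at a missing store or employee.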

BONUS:

If you want a query that shows the store name together with the employee names and their schedule, all you have to do is join the tables:

SELECT s.store_name, sh.day, e.employee_name
FROM schedule sh
JOIN stores s ON s.id = sh.store_id
JOIN employees e ON e.id = sh.emp_id

This query has a limitation, though: the day is stored as a name, so you cannot sort the results in weekday order, and the days may come back in any order. In practice you also need a days table holding data about each day, so that you can order the results from the beginning or the end of the week.
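A small sketch of that ordering problem, again using in-memory SQLite as a stand-in. Sorting on the day name gives alphabetical order, not weekday order. The lowercase three-letter day codes are the ones used in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (store_id INTEGER, day TEXT, emp_id INTEGER)")

# Employee 1 works at Store 2 every day of the week (from the question).
days = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
conn.executemany("INSERT INTO schedule VALUES (2, ?, 1)", [(d,) for d in days])

# ORDER BY on a text column sorts the names alphabetically.
ordered = [r[0] for r in conn.execute("SELECT day FROM schedule ORDER BY day")]
print(ordered)
# ['fri', 'mon', 'sat', 'sun', 'thu', 'tue', 'wed']
```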

If you do want to make a days table, it is the same idea again:

CREATE TABLE days (
    id INT PRIMARY KEY, -- numbered in week order so you can sort on it
    day_name VARCHAR(20),
    day_type VARCHAR(55)
    -- any more data you want here
);

where day_name would be Mon, Tue, ... and day_type would be Weekday or Weekend.

Then all you would have to do for your query is:

SELECT s.store_name, d.day_name, e.employee_name
FROM schedule sh
JOIN stores s ON s.id = sh.store_id
JOIN employees e ON e.id = sh.emp_id
JOIN days d ON d.id = sh.day_id
ORDER BY d.id;

Notice that the day column in the schedule table would be replaced with a day_id column linked to the days table.

Hope that's helpful!
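Here is a minimal sketch of the days-table version, with in-memory SQLite as a stand-in. It assumes the week starts on Monday (id 1) and trims the schedule table down to the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE days (id INTEGER PRIMARY KEY, day_name TEXT, day_type TEXT);
CREATE TABLE schedule (
    store_id INTEGER,
    day_id INTEGER REFERENCES days (id),
    emp_id INTEGER
);
""")

# Assumption: the week starts on Monday, so Mon gets id 1 and Sun gets id 7.
conn.executemany("INSERT INTO days VALUES (?, ?, ?)", [
    (1, "Mon", "Weekday"), (2, "Tue", "Weekday"), (3, "Wed", "Weekday"),
    (4, "Thu", "Weekday"), (5, "Fri", "Weekday"),
    (6, "Sat", "Weekend"), (7, "Sun", "Weekend"),
])

# A few Store 4 rows from the question, inserted out of order on purpose.
conn.executemany(
    "INSERT INTO schedule VALUES (4, ?, ?)",
    [(7, 16), (1, 15), (4, 16), (2, 15)],
)

result = conn.execute("""
    SELECT d.day_name, sh.emp_id
    FROM schedule sh
    JOIN days d ON d.id = sh.day_id
    ORDER BY d.id
""").fetchall()
print(result)
# [('Mon', 15), ('Tue', 15), ('Thu', 16), ('Sun', 16)]
```

Because the sort key is the numeric id, the rows come back in week order no matter how they were inserted.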

I upvoted John Ruddell's answer, which is basically your option #2 with the addition of tables to hold data about the store and the employee. I won't repeat what he said, but let me just add a couple of thoughts that are too long for a comment:

Never ever ever put comma-separated values in a database record. This makes the data way harder to work with.

Sure, either #1 or #2 makes it easy to query to find which employees are working at store 1 on Friday:

Method 1:

select Friday_employees from schedule where store_id='store 1'

Method 2:

select employee_id from schedule where store_id=1 and day='fri'

But suppose you want to know what days employee #7 is working.

With method 2, it's easy:

select day from schedule where employee_id=7
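For a concrete run of that query, here is a sketch that loads Store 1 from the question into an in-memory SQLite table (a stand-in for the real database) and asks for employee 7's days. The results are sorted in Python only to make the output deterministic.

```python
import sqlite3

# Store 1 from the question, written as day -> employee IDs.
store_1 = {
    "mon": [3, 4, 5], "tue": [3, 4, 5], "wed": [3, 4, 5],
    "thu": [4, 5, 7], "fri": [4, 5, 7],
    "sat": [4, 5, 6, 7], "sun": [4, 5, 6, 7],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedule (store_id INTEGER, day TEXT, employee_id INTEGER)"
)
# One row per employee per day, as in method 2.
conn.executemany(
    "INSERT INTO schedule VALUES (1, ?, ?)",
    [(day, emp) for day, emps in store_1.items() for emp in emps],
)

days_for_7 = sorted(
    r[0] for r in conn.execute("SELECT day FROM schedule WHERE employee_id = 7")
)
print(days_for_7)  # ['fri', 'sat', 'sun', 'thu']
```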

But how would you do that with method 1? You'd have to break the field up into its individual pieces and check each piece. At best that's a pain, and I've seen people screw it up regularly, like writing

where Friday_employees like '%7%'

Umm, except what if there's an employee number 17 or 27? You'll get them too. You could say

where Friday_employees like '%,7,%'

But then if the 7 is the first or the last on the list, it doesn't work.
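Both failure modes are easy to reproduce. In this sketch (in-memory SQLite as a stand-in) the first row is Store 1's Friday list from the question. Stores 90 and 91 are made-up rows added only to trigger each problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (store_id INTEGER, Friday_employees TEXT)")
conn.executemany("INSERT INTO schedule VALUES (?, ?)", [
    (1, "4,5,7"),    # from the question: 7 is the last ID in the list
    (90, "17,27"),   # hypothetical: no employee 7, but IDs containing a 7
    (91, "3,7,9"),   # hypothetical: 7 in the middle of the list
])

def stores_matching(pattern):
    """Return the store IDs whose Friday list matches the LIKE pattern."""
    sql = (
        "SELECT store_id FROM schedule "
        "WHERE Friday_employees LIKE ? ORDER BY store_id"
    )
    return [r[0] for r in conn.execute(sql, (pattern,))]

loose = stores_matching("%7%")
strict = stores_matching("%,7,%")
print(loose)   # [1, 90, 91]  store 90 is a false positive
print(strict)  # [91]         store 1 is missed
```

Neither pattern returns the right answer, which would be stores 1 and 91.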

What if you want the user to be able to select a day and then give them the list of employees working on that day?

With method 2, easy:

select employee_id from schedule where day=@day

Then you use a parameterized query to fill in the value.
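As a sketch of what the parameterized version could look like from application code, here it is with Python's sqlite3 module. The `:day` placeholder is sqlite3's named-parameter style and plays the role of `@day` above. The function name `employees_on` is made up for this example, and the rows are Stores 3 and 4 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedule (store_id INTEGER, day TEXT, employee_id INTEGER)"
)
# Stores 3 and 4 on Monday and Thursday, from the question.
conn.executemany("INSERT INTO schedule VALUES (?, ?, ?)", [
    (3, "mon", 10), (3, "mon", 12), (4, "mon", 15),
    (3, "thu", 10), (3, "thu", 12), (4, "thu", 16),
])

def employees_on(day):
    """Return the IDs of employees scheduled on the given day."""
    sql = (
        "SELECT employee_id FROM schedule "
        "WHERE day = :day ORDER BY employee_id"
    )
    # The value is bound by the driver, never spliced into the SQL text.
    return [r[0] for r in conn.execute(sql, {"day": day})]

print(employees_on("mon"))  # [10, 12, 15]
print(employees_on("thu"))  # [10, 12, 16]

# User input that tries to alter the query is treated as a plain string.
print(employees_on("mon' OR '1'='1"))  # []
```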

With method 1 ...

select case @day
         when 'mon' then Monday_employees
         when 'tue' then Tuesday_employees
         when 'wed' then Wednesday_employees
         when 'thu' then Thursday_employees
         when 'fri' then Friday_employees
         when 'sat' then Saturday_employees
         when 'sun' then Sunday_employees
       end as day_employees
from schedule

That's a beast, and if you do it a lot, sooner or later you're going to make a mistake and leave a day out or accidentally type "when day='thu' then Friday_employees" or some such. I've seen that happen often enough.

Even if you write those long complex queries, performance will suck. If you have a field for employee_id, you can index on it, so access by employee will be fast. If you have a comma-separated list of employees, then a query of the "like '%,7,%'" variety requires a sequential search of every record in the table.
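One way to see the difference is to ask the database for its query plan. This sketch uses SQLite's EXPLAIN QUERY PLAN as a stand-in for whatever your RDBMS offers (MySQL has its own EXPLAIN). The table name `schedule_csv` and the index name are made up for the example, and the exact plan text varies between SQLite versions, so the output is printed rather than quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Method 2: one employee per row, with an index on employee_id.
CREATE TABLE schedule (store_id INTEGER, day TEXT, employee_id INTEGER);
CREATE INDEX idx_schedule_employee ON schedule (employee_id);

-- Method 1: comma-separated list per day (only Friday shown).
CREATE TABLE schedule_csv (store_id INTEGER, Friday_employees TEXT);
""")

def plan(sql):
    """Return the human-readable steps of SQLite's plan for a query."""
    # The description is the last column of each EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

indexed_plan = plan("SELECT day FROM schedule WHERE employee_id = 7")
csv_plan = plan(
    "SELECT store_id FROM schedule_csv WHERE Friday_employees LIKE '%,7,%'"
)

print(indexed_plan)  # expect a SEARCH step that names idx_schedule_employee
print(csv_plan)      # expect a SCAN step over schedule_csv
```

A pattern that starts with a wildcard generally cannot use an ordinary index, so adding an index on the list column would not turn that scan into a lookup.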

