SQL relational database: Data manipulation and calculations
I have two tables. One is a Booking_Platform table with all hotel booking data. The second is a Customer_Country_Table, which stores the country of origin of each customer who booked a hotel through the platform.
I have to calculate which country has the highest increase in bookings from 2017 to 2018.
Some sample data for reference:
Booking_Platform_Info
Booking_Date column2 column3.... column N ....... Origin_Country_ID
20-dec-2016 .................................... 103
03-jan-2017 .................................... 101
09-feb-2017 .................................... 103
23-apr-2017 .................................... 102
06-oct-2017 .................................... 102
11-nov-2017 .................................... 103
05-jan-2018 .................................... 102
21-jan-2018 .................................... 102
26-feb-2018 .................................... 101
09-mar-2018 .................................... 101
11-may-2018 .................................... 103
10-sep-2018 .................................... 102
20-nov-2018 .................................... 101
07-dec-2018 .................................... 101
23-dec-2018 .................................... 101
31-dec-2018 .................................... 103
23-jan-2019 .................................... 103
Customer_Country_Info
Country_ID Country_Name
101 Italy
102 Spain
103 Portugal
It is a bit complicated for me. As I understand it, I have to first join the tables, then group by country, then count the total number of bookings by year (probably another GROUP BY), and then compare the results to see which country has the highest positive difference in bookings from 2017 to 2018. I welcome any help with coding this problem.
In my example, country 101 (Italy) would be the answer because the difference between bookings in 2018 and 2017 is the highest (5-1=4).
********* Edit after comments *********
I am writing two queries to get booking totals by country ID for 2017 and 2018:
SELECT CAST(booking_date AS DATE), COUNT(*) as number_of_bookings, origin_country_id FROM Booking_Platform_Info
WHERE booking_date >= '2017-01-01' AND
booking_date < '2018-01-01'
GROUP BY origin_country_id;
SELECT CAST(booking_date AS DATE), COUNT(*) as number_of_bookings, origin_country_id FROM Booking_Platform_Info
WHERE booking_date >= '2018-01-01' AND
booking_date < '2019-01-01'
GROUP BY origin_country_id;
Sorry for my lack of knowledge, but I do not know how to join these queries so that I get the country ID with the highest difference in bookings.
You need to join those two queries to compare the counts.
You also shouldn't include CAST(booking_date AS DATE) in the SELECT list. It's not needed, and because it is not part of the GROUP BY it will just be an arbitrary date from each country's rows.
SELECT t2.country_name
FROM (
    SELECT a.origin_country_id
    FROM (
        -- bookings per country in 2017
        SELECT origin_country_id, COUNT(*) AS total_2017
        FROM Booking_Platform_Info
        WHERE STR_TO_DATE(booking_date, '%d-%b-%Y') BETWEEN '2017-01-01' AND '2017-12-31'
        GROUP BY origin_country_id
    ) AS a
    JOIN (
        -- bookings per country in 2018
        SELECT origin_country_id, COUNT(*) AS total_2018
        FROM Booking_Platform_Info
        WHERE STR_TO_DATE(booking_date, '%d-%b-%Y') BETWEEN '2018-01-01' AND '2018-12-31'
        GROUP BY origin_country_id
    ) AS b ON a.origin_country_id = b.origin_country_id
    -- largest increase first
    ORDER BY b.total_2018 - a.total_2017 DESC
    LIMIT 1
) AS t1
JOIN Customer_Country_Info AS t2 ON t1.origin_country_id = t2.country_id;
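As an alternative to joining two per-year subqueries, the same comparison can be done in one pass with conditional aggregation. The sketch below is only an illustration under some assumptions: it uses Python's built-in sqlite3 module with an in-memory database rather than the MySQL-style STR_TO_DATE used above, it stores the sample dates as ISO strings so that SQLite's strftime can read them, and it keeps only the two columns shown in the sample data. The names BOOKINGS, COUNTRIES, QUERY and top_increase are made up for this example.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (an assumption made here so that SQLite's strftime can parse them).
BOOKINGS = [
    ("2016-12-20", 103), ("2017-01-03", 101), ("2017-02-09", 103),
    ("2017-04-23", 102), ("2017-10-06", 102), ("2017-11-11", 103),
    ("2018-01-05", 102), ("2018-01-21", 102), ("2018-02-26", 101),
    ("2018-03-09", 101), ("2018-05-11", 103), ("2018-09-10", 102),
    ("2018-11-20", 101), ("2018-12-07", 101), ("2018-12-23", 101),
    ("2018-12-31", 103), ("2019-01-23", 103),
]
COUNTRIES = [(101, "Italy"), (102, "Spain"), (103, "Portugal")]

# In SQLite a comparison evaluates to 0 or 1, so SUM(condition)
# counts the rows of each group that satisfy the condition.
QUERY = """
SELECT c.Country_Name,
       SUM(strftime('%Y', b.Booking_Date) = '2018')
     - SUM(strftime('%Y', b.Booking_Date) = '2017') AS increase
FROM Booking_Platform_Info AS b
JOIN Customer_Country_Info AS c ON c.Country_ID = b.Origin_Country_ID
WHERE b.Booking_Date >= '2017-01-01' AND b.Booking_Date < '2019-01-01'
GROUP BY c.Country_ID, c.Country_Name
ORDER BY increase DESC
LIMIT 1
"""


def top_increase():
    """Load the sample data into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Booking_Platform_Info "
        "(Booking_Date TEXT, Origin_Country_ID INTEGER)"
    )
    conn.execute(
        "CREATE TABLE Customer_Country_Info "
        "(Country_ID INTEGER PRIMARY KEY, Country_Name TEXT)"
    )
    conn.executemany("INSERT INTO Booking_Platform_Info VALUES (?, ?)", BOOKINGS)
    conn.executemany("INSERT INTO Customer_Country_Info VALUES (?, ?)", COUNTRIES)
    row = conn.execute(QUERY).fetchone()
    conn.close()
    return row


print(top_increase())
```

On the sample data this prints ('Italy', 4), matching the 5-1=4 worked out in the question. Unlike an inner join of two per-year subqueries, this form also keeps countries that had bookings in only one of the two years, because a missing year simply contributes a count of 0.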