I have the following 2 tables: tab1 and week_ref, the latter with 730 rows.
All I want to do is join those tables on year and week so that the first week day and last week day will display next to the columns of the first table.
Below is my query:
SELECT tab1.year
,tab1.week
,tab1.col3
,tab1.col4
,tab1.col5
,tab1.col6
,tab1.total
,tab1.col7
,week_ref.first_week_day
,week_ref.last_week_day
FROM dtsetname.tab1
JOIN spyros.week_ref ON (week_ref.year = tab1.year AND week_ref.week = tab1.week)
The query returns the 2 extra columns, but the result has 255535 rows, so it is full of duplicates. I used to understand how joins work, but I guess not anymore xd... Any help on this? The correct output should have only 37146 rows, since I just want to add 2 extra columns.
Thanks
The problem is that your week_ref table has a row for each day rather than one per week.
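One way to confirm this before changing the join is to count how many week_ref rows share each (year, week) pair. This is only a diagnostic sketch; it assumes nothing beyond the year and week columns already used in the question's join.

```sql
#standardSQL
-- Any pair listed here matches more than one week_ref row,
-- so every tab1 row for that week is repeated that many times by the join.
SELECT year
  ,week
  ,COUNT(*) AS rows_per_week
FROM spyros.week_ref
GROUP BY year, week
HAVING COUNT(*) > 1
ORDER BY rows_per_week DESC
```

If this returns no rows, week_ref is already unique per week and the duplicates come from somewhere else.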
You can select just one day. If you have a weekday number or name (which I'm guessing that you do), that can be used:
FROM dtsetname.tab1 JOIN
spyros.week_ref wr
ON wr.year = tab1.year AND
wr.week = tab1.week AND
wr.dayname = 'Monday'
If such a column is not available, then you can either extract() the information or aggregate:
FROM dtsetname.tab1 JOIN
(SELECT ANY_VALUE(wr).*
FROM spyros.week_ref wr
GROUP BY wr.year, wr.week
) wr
ON wr.year = tab1.year AND
wr.week = tab1.week
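The extract() route could look roughly like the sketch below. It assumes week_ref has a DATE column holding the calendar day; the name calendar_date is hypothetical, so substitute whatever the table actually uses. In BigQuery, EXTRACT(DAYOFWEEK FROM ...) returns 1 for Sunday through 7 for Saturday.

```sql
#standardSQL
-- calendar_date is a hypothetical column name for the per-day DATE in week_ref.
SELECT tab1.*
  ,wr.first_week_day
  ,wr.last_week_day
FROM dtsetname.tab1 tab1
JOIN spyros.week_ref wr
  ON wr.year = tab1.year
  AND wr.week = tab1.week
  AND EXTRACT(DAYOFWEEK FROM wr.calendar_date) = 2  -- 2 = Monday
```

Note that a partial week at the start or end of a year may not contain a Monday, in which case the matching tab1 rows would be dropped by the inner join.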
Below is for BigQuery Standard SQL.
Before joining, you just need to dedup the data in the week_ref table, as in the example below:
#standardSQL
SELECT tab1.year
,tab1.week
,tab1.col3
,tab1.col4
,tab1.col5
,tab1.col6
,tab1.total
,tab1.col7
,week_ref.first_week_day
,week_ref.last_week_day
FROM dtsetname.tab1 tab1
JOIN (SELECT DISTINCT year, week, first_week_day, last_week_day FROM spyros.week_ref) week_ref
ON (week_ref.year = tab1.year AND week_ref.week = tab1.week)
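To check that the deduplicated join no longer multiplies rows, the two row counts can be compared side by side. This is a sketch using only the tables and columns from the question.

```sql
#standardSQL
-- tab1_rows and joined_rows should be equal (37146 in the question).
-- joined_rows > tab1_rows: week_ref still has several rows per (year, week).
-- joined_rows < tab1_rows: some tab1 weeks have no match in week_ref.
SELECT
  (SELECT COUNT(*) FROM dtsetname.tab1) AS tab1_rows
  ,(SELECT COUNT(*)
    FROM dtsetname.tab1 tab1
    JOIN (SELECT DISTINCT year, week, first_week_day, last_week_day
          FROM spyros.week_ref) week_ref
      ON (week_ref.year = tab1.year AND week_ref.week = tab1.week)
   ) AS joined_rows
```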
First, I hope that year+week and year+day are primary keys in the corresponding tables; otherwise the problem is there.
If so, here is another hint to check: you join on year and week, but in the first table I see many 52s in the week column, while the second one has 0 as a value.
There are only 52 weeks in a year, plus a day, so is it possible you need to join by
week_ref.year = tab1.year AND week_ref.week = tab1.week+1
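Whether the +1 offset is needed can be checked by comparing the range of week numbers in the two tables. The sketch below casts week to INT64 on both sides, on the assumption that it may be stored as a string in one of them.

```sql
#standardSQL
-- If one table runs 0..52 and the other 1..53, the numbering is shifted by one.
SELECT 'tab1' AS source
  ,MIN(CAST(week AS INT64)) AS min_week
  ,MAX(CAST(week AS INT64)) AS max_week
FROM dtsetname.tab1
UNION ALL
SELECT 'week_ref' AS source
  ,MIN(CAST(week AS INT64)) AS min_week
  ,MAX(CAST(week AS INT64)) AS max_week
FROM spyros.week_ref
```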
I think the solutions mentioned by others should work if you are looking to join to your reference table to get week start/end dates.
However, if you think your tab1 table has definite values in the week and year columns (and if I understand your data correctly), you can avoid the join altogether to get your desired results:
select
year
,week
,col3
,col4
,col5
,col6
,total
,col7
-- DAYOFWEEK is 1 for Sunday through 7 for Saturday; step back to the Monday of that week
,date_sub(weekdate, interval IF(EXTRACT(DAYOFWEEK FROM weekdate) = 1, 6, EXTRACT(DAYOFWEEK FROM weekdate) - 2) day) as first_week_day
,date_add(date_sub(weekdate, interval IF(EXTRACT(DAYOFWEEK FROM weekdate) = 1, 6, EXTRACT(DAYOFWEEK FROM weekdate) - 2) day), interval 6 day) as last_week_day
from (
select
tab1.year
,tab1.week
,tab1.col3
,tab1.col4
,tab1.col5
,tab1.col6
,tab1.total
,tab1.col7
,date_add(date(cast(tab1.year as int64), 1, 1), interval cast(tab1.week as int64) week) as weekdate
from `mydataset.tab1` as tab1
)
Hope it helps :)