简体   繁体   中英

BigQuery Join on Calculated Field

I'm trying to join two tables on a calculated field:

SELECT
  CONCAT(year, mo, da) AS date,
  temp
FROM
  [bigquery-public-data:noaa_gsod.gsod2016] AS a
JOIN
  [bigquery-public-data:noaa_gsod.stations] AS b
ON
  a.stn = b.usaf
  AND a.wban = b.wban
JOIN
  [test-project:0000.ga_sessions_20161002] AS c
ON
  c.date = CONCAT(a.year, a.mo, a.da)
WHERE
  b.name = "DUBAI INTL"
ORDER BY
  a.year ASC,
  a.mo ASC,
  a.da ASC

I keep getting the following error:

ON clause must be AND of = comparisons of one field name from each table, with all field names prefixed with table name

I presume this error is a result of the second ON statement but I've prefixed all field names with their table name.

What am I doing wrong?

You need to enable standard SQL for the query instead. Once you do (uncheck "Use Legacy SQL" under "Show Options") you can run eg:

SELECT
  CONCAT(year, mo, da) AS date,
  temp
FROM
  `bigquery-public-data.noaa_gsod.gsod2016` AS a
JOIN
  `bigquery-public-data.noaa_gsod.stations` AS b
ON
  a.stn = b.usaf
  AND a.wban = b.wban
JOIN
  `test-project.0000.ga_sessions_20161002` AS c
ON
  c.date = CONCAT(a.year, a.mo, a.da)
WHERE
  b.name = "DUBAI INTL"
ORDER BY
  a.year ASC,
  a.mo ASC,
  a.da ASC;

Here's a standalone example that doesn't rely on the ga_session table, too:

WITH MyTable AS (
  SELECT '20161003' AS date UNION ALL
  SELECT '20160830'
)
SELECT
  CONCAT(year, mo, da) AS date,
  temp
FROM
  `bigquery-public-data.noaa_gsod.gsod2016` AS a
JOIN
  `bigquery-public-data.noaa_gsod.stations` AS b
ON
  a.stn = b.usaf
  AND a.wban = b.wban
JOIN
  MyTable AS c
ON
  c.date = CONCAT(a.year, a.mo, a.da)
WHERE
  b.name = "DUBAI INTL"
ORDER BY
  a.year ASC,
  a.mo ASC,
  a.da ASC;

Try using a subquery:

FROM (SELECT a.*, date(CONCAT(a.year, a.mo, a.da)) as yyyymmdd
      FROM [bigquery-public-data:noaa_gsod.gsod2016] a
     ) a JOIN
     [bigquery-public-data:noaa_gsod.stations] b
     ON a.stn = b.usaf AND a.wban = b.wban JOIN
     [test-project:0000.ga_sessions_20161002] c
     ON c.date = a.yyyymmdd

Or something like this . . . the exact comparison might depend on the type of c.date .

The JOIN you're using is an INNER JOIN by default, so the ON clause can only do simple equality on the columns.

To work around you'll need to use a CROSS JOIN with a WHERE clause:

SELECT
  CONCAT(year, mo, da) AS date,
  temp
FROM
  [bigquery-public-data:noaa_gsod.gsod2016] AS a
JOIN
  [bigquery-public-data:noaa_gsod.stations] AS b
ON
  a.stn = b.usaf
  AND a.wban = b.wban
CROSS JOIN
  [test-project:0000.ga_sessions_20161002] AS c
WHERE
  b.name = "DUBAI INTL"
AND
  c.date = CONCAT(a.year, a.mo, a.da)
ORDER BY
  a.year ASC,
  a.mo ASC,
  a.da ASC

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM