I am trying to create two arrays of structures that are independent of each other.
I am trying to get these arrays created from another table that is an import of a flat file with different record types depending on the data (Rec0015 - Earnings and Rec0025 - Deductions). Each record type has a slightly different layout, so I import the record data as a string that is "~" delimited. I have simplified my examples below to only show the basic data needed to illustrate my issues.
If I run the queries independently of each other each array is created properly; however, when I combine both I cannot get it to work.
For the first query:
SELECT
flat.EmployeeNumber,
ARRAY_AGG(STRUCT(
flat.RecordNumberEarn,
flat.EmployeeEarningsAmount
))
as EmployeeEarningsDetail
FROM
(SELECT DISTINCT
stage.EmployeeNumber,
Rec0015.RecordNumberEarn,
Rec0015.EmployeeEarningsAmount,
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging stage
INNER JOIN
(SELECT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberEarn,
SPLIT(VariableData, '~')[SAFE_OFFSET(4)] AS EmployeeEarningsAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0015'
) Rec0015
ON stage.EmployeeNumber = Rec0015.EmployeeNumber
) as flat
GROUP BY
flat.EmployeeNumber
I get the following result (which is correct):
EmployeeNumber EmployeeEarningsDetail
xxxx521 "{
""EmployeeEarningsDetail"": [{
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}]
}" ....
When I run the "full" query:
SELECT
flat2.EmployeeNumber,
ARRAY_AGG(STRUCT(
Rec0025.RecordNumberDed,
Rec0025.EmployeePayDeductionAmount
))
as EmployeeDeductionsDetail
FROM
(SELECT
flat.EmployeeNumber,
ARRAY_AGG(STRUCT(
flat.RecordNumberEarn,
flat.EmployeeEarningsAmount
))
as EmployeeEarningsDetail
FROM
(SELECT DISTINCT
stage.EmployeeNumber,
Rec0015.RecordNumberEarn,
Rec0015.EmployeeEarningsAmount,
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging stage
INNER JOIN
(SELECT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberEarn,
SPLIT(VariableData, '~')[SAFE_OFFSET(4)] AS EmployeeEarningsAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0015'
) Rec0015
ON stage.EmployeeNumber = Rec0015.EmployeeNumber
) as flat
GROUP BY
flat.EmployeeNumber
) as flat2
INNER JOIN
(SELECT DISTINCT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberDed,
SPLIT(VariableData, '~')[SAFE_OFFSET(5)] AS EmployeePayDeductionAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0025'
) Rec0025
ON flat2.EmployeeNumber = Rec0025.EmployeeNumber
GROUP BY
flat2.EmployeeNumber
I get the following result (which is correct, but doesn't include the first array structure):
EmployeeNumber EmployeeDeductionsDetail
xxxx521 "{
""EmployeeDeductionsDetail"": [{
""RecordNumberDed"": ""0001"",
""EmployeePayDeductionAmount"": "" 50.65""
}, {
""RecordNumberDed"": ""0002"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0003"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0004"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0005"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0006"",
""EmployeePayDeductionAmount"": "" 61.54""
}, {
""RecordNumberDed"": ""0007"",
""EmployeePayDeductionAmount"": "" 13.22""
}, {
""RecordNumberDed"": ""0008"",
""EmployeePayDeductionAmount"": "" 7.84""
}, {
""RecordNumberDed"": ""0009"",
""EmployeePayDeductionAmount"": "" 0.69""
}, {
""RecordNumberDed"": ""0010"",
""EmployeePayDeductionAmount"": "" 5.00""
}]
}" ...
However, in the 2nd query, I really want to do a group by on the array structure EmployeeEarningsDetail, but when I add the array to the select and group by I get the error:
"Grouping by expressions of type ARRAY is not allowed."
I tried adding a TO_JSON_STRING(EmployeeEarningsDetail) in both the Select and the Group by, but I got a column of just the string not as an array as below:
SELECT
flat2.EmployeeNumber,
TO_JSON_STRING(flat2.EmployeeEarningsDetail),
ARRAY_AGG(STRUCT(
Rec0025.RecordNumberDed,
Rec0025.EmployeePayDeductionAmount
))
as EmployeeDeductionsDetail
FROM
(SELECT
flat.EmployeeNumber,
ARRAY_AGG(STRUCT(
flat.RecordNumberEarn,
flat.EmployeeEarningsAmount
))
as EmployeeEarningsDetail
FROM
(SELECT DISTINCT
stage.EmployeeNumber,
Rec0015.RecordNumberEarn,
Rec0015.EmployeeEarningsAmount,
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging stage
INNER JOIN
(SELECT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberEarn,
SPLIT(VariableData, '~')[SAFE_OFFSET(4)] AS EmployeeEarningsAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0015'
) Rec0015
ON stage.EmployeeNumber = Rec0015.EmployeeNumber
) as flat
GROUP BY
flat.EmployeeNumber
) as flat2
INNER JOIN
(SELECT DISTINCT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberDed,
SPLIT(VariableData, '~')[SAFE_OFFSET(5)] AS EmployeePayDeductionAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0025'
) Rec0025
ON flat2.EmployeeNumber = Rec0025.EmployeeNumber
GROUP BY
flat2.EmployeeNumber,
TO_JSON_STRING(flat2.EmployeeEarningsDetail)
The results were (not correct at all) are below the JSON shows an f0_ string of all the earnings and the first Deduction followed by another row with the rest of the deductions in an array See below:
EmployeeNumber f0_ EmployeeDeductionsDetail
xxxx521 "[{""RecordNumberEarn"":""0001"",""EmployeeEarningsAmount"":"" 375.52""},{""RecordNumberEarn"":""0002"",""EmployeeEarningsAmount"":"" 387.26""}]" "{
""EmployeeDeductionsDetail"": [{
""RecordNumberDed"": ""0001"",
""EmployeePayDeductionAmount"": "" 50.65""
}, {
""RecordNumberDed"": ""0002"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0003"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0004"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0005"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0006"",
""EmployeePayDeductionAmount"": "" 61.54""
}, {
""RecordNumberDed"": ""0007"",
""EmployeePayDeductionAmount"": "" 13.22""
}, {
""RecordNumberDed"": ""0008"",
""EmployeePayDeductionAmount"": "" 7.84""
}, {
""RecordNumberDed"": ""0009"",
""EmployeePayDeductionAmount"": "" 0.69""
}, {
""RecordNumberDed"": ""0010"",
""EmployeePayDeductionAmount"": "" 5.00""
}]
}" ...
I have tried to put both ARRAY_AGGs in the same SELECT my arrays are cartesian products:
The SQL is:
SELECT
flat.EmployeeNumber,
ARRAY_AGG(STRUCT(
flat.RecordNumberEarn,
flat.EmployeeEarningsAmount
))
as EmployeeEarningsDetail,
ARRAY_AGG(STRUCT(
flat.RecordNumberDed,
flat.EmployeePayDeductionAmount
))
as EmployeeDeductionsDetail
FROM
(SELECT DISTINCT
stage.EmployeeNumber,
Rec0015.RecordNumberEarn,
Rec0015.EmployeeEarningsAmount,
Rec0025.RecordNumberDed,
Rec0025.EmployeePayDeductionAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging stage
INNER JOIN
(SELECT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberEarn,
SPLIT(VariableData, '~')[SAFE_OFFSET(4)] AS EmployeeEarningsAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0015'
) Rec0015
ON stage.EmployeeNumber = Rec0015.EmployeeNumber
INNER JOIN
(SELECT DISTINCT EmployeeNumber,
SPLIT(VariableData, '~')[SAFE_OFFSET(0)] AS RecordNumberDed,
SPLIT(VariableData, '~')[SAFE_OFFSET(5)] AS EmployeePayDeductionAmount
FROM bq_hpm_ppm_dev.EmployeeGrossToNetPayStaging
WHERE RecordTypeNumber = '0025'
) Rec0025
ON stage.EmployeeNumber = Rec0025.EmployeeNumber
) as flat
GROUP BY
flat.EmployeeNumber
and the results are:
EmployeeNumber EmployeeEarningsDetail EmployeeDeductionsDetail
xxxx521 "{
""EmployeeEarningsDetail"": [{
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0001"",
""EmployeeEarningsAmount"": "" 375.52""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}, {
""RecordNumberEarn"": ""0002"",
""EmployeeEarningsAmount"": "" 387.26""
}]
}" "{
""EmployeeDeductionsDetail"": [{
""RecordNumberDed"": ""0001"",
""EmployeePayDeductionAmount"": "" 50.65""
}, {
""RecordNumberDed"": ""0002"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0003"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0004"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0005"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0006"",
""EmployeePayDeductionAmount"": "" 61.54""
}, {
""RecordNumberDed"": ""0007"",
""EmployeePayDeductionAmount"": "" 13.22""
}, {
""RecordNumberDed"": ""0008"",
""EmployeePayDeductionAmount"": "" 7.84""
}, {
""RecordNumberDed"": ""0009"",
""EmployeePayDeductionAmount"": "" 0.69""
}, {
""RecordNumberDed"": ""0010"",
""EmployeePayDeductionAmount"": "" 5.00""
}, {
""RecordNumberDed"": ""0001"",
""EmployeePayDeductionAmount"": "" 50.65""
}, {
""RecordNumberDed"": ""0002"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0003"",
""EmployeePayDeductionAmount"": "" 44.15""
}, {
""RecordNumberDed"": ""0004"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0005"",
""EmployeePayDeductionAmount"": "" 10.33""
}, {
""RecordNumberDed"": ""0006"",
""EmployeePayDeductionAmount"": "" 61.54""
}, {
""RecordNumberDed"": ""0007"",
""EmployeePayDeductionAmount"": "" 13.22""
}, {
""RecordNumberDed"": ""0008"",
""EmployeePayDeductionAmount"": "" 7.84""
}, {
""RecordNumberDed"": ""0009"",
""EmployeePayDeductionAmount"": "" 0.69""
}, {
""RecordNumberDed"": ""0010"",
""EmployeePayDeductionAmount"": "" 5.00""
}]
}" ...
Any suggestions on how I can "fix" this.
Thanks, David
A workaround for GROUP BY
when the type does not support aggregation, eg STRUCT
or GEOGRAPHY
is to turn it to a STRING
key. TO_JSON_STRING
can do it for STRUCT, ST_AsText
can do it for GEOGRAPHY, etc. Note this might not be very performant.
To get the original value, without any modification, use ANY_VALUE
aggregation function - it picks arbitrary value from its inputs, here all the values are presumably the same, so we don't care which one.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.