I was trying to use the LISTAGG
function in SQL and I am facing the following error:
Invalid operation: Result size exceeds LISTAGG limit
Details:
-----------
error: Result size exceeds LISTAGG limit
code: 8 ...
How do I get rid of this error?
Please refer to the LISTAGG function documentation at https://docs.aws.amazon.com/redshift/latest/dg/r_LISTAGG.html
The return data type is VARCHAR(MAX), i.e. a 64K VARCHAR.
The error you described is mentioned exactly in the official documentation.
You can consider using the LISTAGG() function with DISTINCT, as follows, to reduce the number of items to be concatenated:
select listagg(distinct sellerid, ', ') within group (order by sellerid) from sales
where eventid = 4337;
Here is the reason for the issue we encountered. This is the SQL query I tried to execute:
SELECT DISTINCT "year_level", LISTAGG("value", ', ') WITHIN GROUP (ORDER BY "year_level") OVER (PARTITION BY "year_level")
FROM "school_1__acara_db"."acara_data_set";
and this is the error I got.
ERROR: Result size exceeds LISTAGG limit
Detail:
-----------------------------------------------
error: Result size exceeds LISTAGG limit
code: 8001
context: LISTAGG limit: 65535
query: 4360256
location: string_ops.cpp:116
process: query1_127_4360256 [pid=1793]
-----------------------------------------------
Let's break this issue into smaller segments. As the error message says, I've exceeded the maximum limit of LISTAGG.
With the SQL query below, we can find the total byte size of the "value" column for each "year_level", and therefore which groups exceed the limit.
SELECT "year_level", SUM(OCTET_LENGTH("value")) as total_bytes
FROM "school_1__acara_db"."acara_data_set"
GROUP BY "year_level"
ORDER BY total_bytes;
This is the output.
OCTET_LENGTH returns the length of the string in bytes (octets).
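As a side note, byte length and character length can differ for multi-byte characters. A quick Python sketch of the distinction (Python's len counts characters, while encoding to UTF-8 gives the octet count, analogous to OCTET_LENGTH):

```python
# Character count vs. byte (octet) count, analogous to LENGTH vs. OCTET_LENGTH.
text = "café"                            # 'é' takes 2 bytes in UTF-8
char_count = len(text)                   # number of characters: 4
octet_count = len(text.encode("utf-8"))  # number of bytes, like OCTET_LENGTH: 5
```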
As you can see, the values related to Primary Ungraded have a total of 50329 bytes and Secondary Ungraded has 61178 bytes. Neither exceeds the VARCHAR(MAX) limit of 65535 bytes. Can I at least get the LISTAGG values for these two groups? This is the query I'm going to execute:
SELECT DISTINCT "year_level", LISTAGG("value", ', ') WITHIN GROUP (ORDER BY "year_level") OVER (PARTITION BY "year_level")
FROM "school_1__acara_db"."acara_data_set"
WHERE "year_level" IN ('Primary Ungraded', 'Secondary Ungraded');
I got the same error: Result size exceeds LISTAGG limit. As we can see in the result above, neither group exceeds the VARCHAR(MAX) limit of 65535 bytes. So why does the query fail? Let's check the row count of the "value" column for each "year_level" with the query below.
SELECT "year_level", COUNT("value") as total_counts
FROM "school_1__acara_db"."acara_data_set"
GROUP BY "year_level"
ORDER BY total_counts;
This is the output.
Before moving on to the further explanation, let's see how LISTAGG works.
In Redshift, LISTAGG can be used as an aggregate function or as a window function. It transforms data from multiple rows into a single list of values separated by a specified delimiter. In the example below, the delimiter is ', ' (a comma followed by a space).
The following image is taken from Oracle's Listagg Function - Uses and Duplicate Removal article. It relates to Oracle rather than Redshift, but it illustrates the basic concepts of the LISTAGG function.
That is how the data is merged with the delimiter.
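The merge can be sketched in a few lines of Python (hypothetical in-memory values standing in for the rows of one group/partition):

```python
# One group's rows, as LISTAGG would see them within a partition
values = ["Year 1", "Year 2", "Year 3"]

# WITHIN GROUP (ORDER BY ...) sorts first; the delimiter is then placed
# between rows only, never at the ends
merged = ", ".join(sorted(values))
# merged == "Year 1, Year 2, Year 3"
```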
Our query also used ', ' (a comma followed by a space) as the delimiter. Let's take the first record in the image below as an example.
"Primary Ungraded" has 3412 records totaling 50329 bytes. This means we are merging 3412 records into a single value. If we merged them directly, the result would be 50329 bytes. But we are not merging directly; we are merging with the delimiter, so there are 3411 delimiters in between the 3412 records:
delimiter_count = no_of_records - 1
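The arithmetic can be sketched in Python using the Primary Ungraded numbers from the output above (3412 rows, 50329 raw bytes, and the two-byte ', ' delimiter); this is just an illustration of the overhead formula, not a Redshift internal:

```python
def listagg_size(total_bytes, n_records, delim=", "):
    """Final LISTAGG result size: raw bytes plus one delimiter per gap."""
    delimiter_count = n_records - 1
    return total_bytes + delimiter_count * len(delim.encode("utf-8"))

# Primary Ungraded numbers from the output:
size = listagg_size(50329, 3412)
# 50329 raw bytes + 3411 * 2 delimiter bytes == 57151 bytes in total
```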
If this is unclear, please check the Example #1 image of how the data is merged. Each two-byte ', ' delimiter adds to the final result size on top of the raw bytes, and that overhead is what pushes the result over the limit. According to this, if we run our last failed query without a delimiter, it should work:
SELECT DISTINCT "year_level", LISTAGG("value", '') WITHIN GROUP (ORDER BY "year_level") OVER (PARTITION BY "year_level")
FROM "school_1__acara_db"."acara_data_set"
WHERE "year_level" IN ('Primary Ungraded', 'Secondary Ungraded');
Yes, it works fine. But this won't work for the other groups, because even without a delimiter their raw values already exceed the VARCHAR(MAX) limit of 65535 bytes.
A lot of people suggest using the DISTINCT keyword with the LISTAGG function. Both the aggregate-function form and the window-function form support DISTINCT as an optional keyword:
LISTAGG( [DISTINCT] aggregate_expression [, 'delimiter' ] )
[ WITHIN GROUP (ORDER BY order_list) ]
LISTAGG( [DISTINCT] expression [, 'delimiter' ] )
[ WITHIN GROUP (ORDER BY order_list) ]
OVER ( [PARTITION BY partition_expression] )
Even without duplicates, my data exceeds the VARCHAR(MAX) limit of 65535 bytes, so DISTINCT doesn't help in my case.
Can't we slice the data into smaller parts? Yes, we can, with the query below (credit to my teammate Isuru, who found this solution):
SELECT year_level, num_of_parts, listagg(value,',') AS listagg_data FROM (
SELECT year_level, value, total_bytes / 60000 AS num_of_parts FROM (
SELECT year_level, value, SUM(OCTET_LENGTH(value)) OVER (PARTITION BY year_level ORDER BY value ROWS UNBOUNDED PRECEDING) AS total_bytes
FROM "school_1__acara_db"."acara_data_set"
)
)
GROUP BY year_level, num_of_parts
ORDER BY year_level, num_of_parts;
This is the output.
Using this, you can grab all the information you want. Here we are slicing the running total_bytes into 60000-byte chunks, and the num_of_parts column shows which chunk each row falls into. 'Primary Ungraded' wasn't sliced at all, while 'Secondary Ungraded' was sliced into 2 parts, consistent with the byte totals we computed earlier; larger groups are likewise sliced into multiple 60000-byte parts.
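The slicing logic can be mimicked in Python (hypothetical in-memory rows; the running SUM(OCTET_LENGTH(...)) window in the SQL plays the role of the cumulative counter here):

```python
def slice_groups(rows, chunk_bytes=60000, delim=","):
    """Split each group's values into parts, mirroring the SQL's
    total_bytes / 60000 -> num_of_parts integer division."""
    cumulative = {}   # running byte total per group
    parts = {}        # (group, part_number) -> list of values
    for group, value in sorted(rows):
        cumulative[group] = cumulative.get(group, 0) + len(value.encode("utf-8"))
        part = cumulative[group] // chunk_bytes          # integer division, like SQL
        parts.setdefault((group, part), []).append(value)
    # join each part's values, like LISTAGG grouped by (group, num_of_parts)
    return {key: delim.join(values) for key, values in parts.items()}
```

With a small chunk size you can watch one group's values land in parts 0, 1, 2, ... as the running byte total crosses each multiple of chunk_bytes.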
The issue we encountered is a database limitation, so we have to merge the sliced LISTAGG values programmatically on the client side. At the moment I don't see any other solution, and I couldn't find a better one anywhere on the internet.
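For example, after fetching the sliced rows, a client-side merge could look like the following Python sketch (the (year_level, num_of_parts, listagg_data) tuples are hypothetical stand-ins for the rows returned by the slicing query):

```python
from collections import defaultdict

def merge_parts(rows, delim=","):
    """Re-join the sliced LISTAGG parts per year_level, in part order."""
    grouped = defaultdict(list)
    for year_level, part, data in sorted(rows, key=lambda r: (r[0], r[1])):
        grouped[year_level].append(data)
    return {yl: delim.join(chunks) for yl, chunks in grouped.items()}

rows = [("Secondary Ungraded", 1, "c,d"),   # parts may arrive out of order
        ("Secondary Ungraded", 0, "a,b"),
        ("Primary Ungraded", 0, "x,y")]
# merge_parts(rows) == {"Secondary Ungraded": "a,b,c,d", "Primary Ungraded": "x,y"}
```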