
BigQuery Collation

How can I set a collation order in BigQuery?

I want something like this

SELECT Place
FROM Locations
ORDER BY Place COLLATE "en_CA"

I can't find any documentation, other than that COLLATE is a reserved word in BigQuery.

BigQuery is sorting the following strings in [a..zA..Z] order, for example:

  • ant
  • bee
  • cat
  • Apple
  • Banana
  • Cantaloupe

Is there a way to ask BigQuery to sort in [aA..zZ] order?

  • ant
  • Apple
  • bee
  • Banana
  • cat
  • Cantaloupe

The example below is for BigQuery Standard SQL:

#standardSQL
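-- Build a per-character sort key: each character maps to a code point that
-- groups it with its lowercase form, with the lowercase letter ordered first.
create temp function collate_order(text string) as ((
  select string_agg(chr(1000 * ascii(lower(c)) - ascii(c)), '' order by offset)
  -- split(text, '') yields one element per character; split(text) alone splits on commas
  from unnest(split(text, '')) c with offset
));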
with `project.dataset.Locations` as (
  select 'ant' as Place union all
  select 'Apple' union all
  select 'bee' union all
  select 'apple' union all
  select 'cat' union all
  select 'Banana' union all
  select 'Cantaloupe' 
)
select Place
from `project.dataset.Locations`
order by collate_order(Place)

with output

[image: query output]

Forgot to mention: you can extend this approach to handle Unicode text by replacing the ascii function with the unicode function.
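One caveat when swapping in unicode: chr accepts only valid code points, so the 1000 * ... arithmetic runs out of room once a character's code point passes roughly 1100. Below is a minimal sketch of one way around that, assuming a fixed-width numeric key per character instead of chr. The function name collate_key and the sample rows are hypothetical, and the result still orders by code point rather than by a true locale collation.

```sql
#standardSQL
-- Hypothetical helper (not a BigQuery built-in): emits 14 digits per character.
create temp function collate_key(text string) as ((
  select string_agg(
    -- First 7 digits: code point of the lowercased character (groups a with A).
    -- Last 7 digits: reversed code point, so the lowercase form sorts first.
    -- 7 digits are enough for the largest code point, 1114111 (0x10FFFF).
    format('%07d%07d', unicode(lower(c)), 1114111 - unicode(c)),
    '' order by offset)
  from unnest(split(text, '')) c with offset
));
with Locations as (
  select 'ant' as Place union all
  select 'Apple' union all
  select 'élan' union all
  select 'Éclair'
)
select Place
from Locations
order by collate_key(Place)
```

Because every character contributes the same number of digits, comparing the keys as plain strings compares the names character by character.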

You can try the following query, which sorts case-insensitively and so gives the [aA..zZ] order you are after:

SELECT Place
FROM Locations
ORDER BY upper(Place)
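Note that upper(Place) maps apple and Apple to the same key, so their relative order is left to the engine. A minimal sketch of one way to pin it down, assuming the raw column is added as a descending tie-breaker (the inline sample rows are only for illustration):

```sql
#standardSQL
with Locations as (
  select 'ant' as Place union all
  select 'Apple' union all
  select 'bee' union all
  select 'apple' union all
  select 'cat' union all
  select 'Banana' union all
  select 'Cantaloupe'
)
select Place
from Locations
-- Primary key: case-insensitive comparison of the whole string.
-- Tie-breaker: lowercase letters have higher code points than uppercase,
-- so descending order puts apple ahead of Apple.
order by upper(Place), Place desc
```

This compares whole strings rather than grouping by first letter, so the sample rows come back as ant, apple, Apple, Banana, bee, Cantaloupe, cat. BigQuery has since added a COLLATE function; if it is available in your project, COLLATE(Place, 'und:ci') should be able to stand in for upper(Place) here.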
