How to return the difference in string values from the same column by doing a grouped string comparison in BigQuery SQL?
I have a table with a lot of products, for example:

product | brand |
---|---|
colgate smile 250gr | colgate |
colgate fresh breath 250gr | colgate |
colgate mint 250gr | colgate |
relx pod pro mango - 1pod | relx |
relx pod pro lychee - 1pod | relx |
soju jinro chamisul green grape 360ml | jinro |
soju jinro chamisul strawberry 360ml | jinro |
soju jinro chamisul apple grape 360ml | jinro |

into

product | brand | word |
---|---|---|
colgate smile 250gr | colgate | smile |
colgate fresh breath 250gr | colgate | fresh breath |
colgate mint 250gr | colgate | mint |
relx pod pro mango - 1pod | relx | mango |
relx pod pro lychee - 1pod | relx | lychee |
soju jinro chamisul green grape 360ml | jinro | green grape |
soju jinro chamisul strawberry 360ml | jinro | strawberry |
soju jinro chamisul apple 360ml | jinro | apple |

I want to group by brand, get the difference between the product strings, and return that as a new column. How do I do this transformation, checking for regexp_contains(str_1, str_2_split)=false and returning the value?
Consider the naïve approach below.
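As far as it can be read from the query that follows, the approach first finds the words that occur in every product of a brand and then strips them from each product name. Here is a minimal sketch of that first step only. The inline `your_table` sample and the CTE names `product_words` and `brand_words` are assumptions made for illustration.

```sql
-- Hypothetical inline sample standing in for your_table
with your_table as (
  select 'colgate smile 250gr' as product, 'colgate' as brand union all
  select 'colgate fresh breath 250gr', 'colgate' union all
  select 'colgate mint 250gr', 'colgate'
),
product_words as (
  -- per product: its distinct space-separated words
  select brand, array(
    select distinct word
    from unnest(split(product, ' ')) word
  ) as words
  from your_table
),
brand_words as (
  -- per brand: number of products and all per-product word lists concatenated
  select brand, count(*) as cnt, array_concat_agg(words) as words
  from product_words
  group by brand
)
select brand,
  array(
    select word
    from unnest(words) word
    group by word
    having count(*) = cnt  -- the word appears in every product of the brand
    order by word
  ) as common_words
from brand_words
-- for this sample: colgate -> ['250gr', 'colgate']
```

Because each product contributes a word at most once (`select distinct`), a word whose count equals the number of products must be present in all of them.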
So, the query would look like this:
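with common_words as (
  -- per brand: a regex alternation of the words found in every product
  select brand,
    r'' || array_to_string(array(
      select word
      from t.words word
      group by word
      having count(*) = cnt  -- word occurs in all cnt products of the brand
    ), '|') words
  from (
    -- per brand: product count and the concatenated per-product word lists
    select brand, count(*) cnt, array_concat_agg(words) words
    from (
      -- per product: its distinct space-separated words
      select brand, array(
        select distinct word
        from unnest(split(product, ' ')) word
      ) words
      from your_table
    )
    group by brand
  ) t
)
-- remove the common words, then trim and collapse leftover whitespace
select product, brand,
  regexp_replace(trim(regexp_replace(product, words, '')), r'\s+', ' ') as diff
from your_table
join common_words
using (brand)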
If applied to the sample data in your question, the output is:
[output image not available]
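One caveat with the query above: the common words are joined into a regular expression without escaping, so they are removed as substrings rather than as whole words, and a word containing regex metacharacters such as `+` or `(` would be read as pattern syntax. A possible variant, sketched here under assumptions and not part of the original answer, filters at the word level instead. The inline sample rows and the names `common` and `pos` are hypothetical.

```sql
-- Hypothetical inline sample standing in for your_table
with your_table as (
  select 'soju jinro chamisul green grape 360ml' as product, 'jinro' as brand union all
  select 'soju jinro chamisul strawberry 360ml', 'jinro'
),
common_words as (
  -- per brand: array of the words that occur in every product
  select brand, array(
    select word
    from unnest(t.words) word
    group by word
    having count(*) = cnt
  ) as common
  from (
    select brand, count(*) cnt, array_concat_agg(words) words
    from (
      select brand, array(
        select distinct word
        from unnest(split(product, ' ')) word
      ) words
      from your_table
    )
    group by brand
  ) t
)
select product, brand,
  array_to_string(array(
    -- keep the words that are not common, in their original order
    select word
    from unnest(split(product, ' ')) word with offset pos
    where word != '' and word not in unnest(common)
    order by pos
  ), ' ') as diff
from your_table
join common_words using (brand)
-- for this sample: 'green grape' and 'strawberry'
```

Comparing whole words with `not in unnest(...)` avoids regex escaping altogether, and the `word != ''` filter drops the empty strings that `split` produces for leading, trailing, or repeated spaces.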