Language: Python 3.x, re library
I have a string as follows:
import re
query_string = r'''SELECT "a"."name", "a"."create_date", "a"."state", SUM("b"."cost") AS "amount", SUM("b"."cost") FILTER (WHERE "a"."state" = 'UNPAID') AS "paid", SUM("b"."cost") FILTER (WHERE "a"."state" = 'PAID') AS "unpaid" FROM "maintenance"'''
I want to extract the column names, i.e. "a"."name", "a"."create_date", "a"."state", from the string above: everything that comes between "SELECT" and the first "SUM(...)".
Any help is appreciated.
I have tried the following regular expression patterns:
1) r'SELECT (.* ), [^(SUM(.* )]'
2) r'SELECT (.* ), SUM(.* )'
but neither gives the correct result.
Expected result:
"a"."name", "a"."create_date", "a"."state" (no comma at the end)
Use:
(SELECT\s*)([^()]+)(,\s*SUM.*)
and use the second group with \\2. Or replace groups \\1 and \\3 with nothing.
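As a rough sketch of how that pattern could be applied in Python (the variable names columns_from_group and columns_from_sub are only illustrative, and the query string is copied from the question), either read group 2 directly or substitute the whole match with a backreference to it:

```python
import re

# Query string from the question; triple quotes avoid clashing with the
# single quotes around 'UNPAID' and 'PAID'.
query_string = '''SELECT "a"."name", "a"."create_date", "a"."state", SUM("b"."cost") AS "amount", SUM("b"."cost") FILTER (WHERE "a"."state" = 'UNPAID') AS "paid", SUM("b"."cost") FILTER (WHERE "a"."state" = 'PAID') AS "unpaid" FROM "maintenance"'''

pattern = re.compile(r'(SELECT\s*)([^()]+)(,\s*SUM.*)')

# Option 1: read the second capturing group from the match object.
match = pattern.search(query_string)
columns_from_group = match.group(2) if match else None

# Option 2: replace the whole match with group 2, which discards groups 1 and 3.
columns_from_sub = pattern.sub(r'\2', query_string)

print(columns_from_group)
print(columns_from_sub)
```

Because [^()]+ cannot cross a parenthesis, the second group can never extend past the first "SUM(", so both options should yield the same column list for this input.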
You can use
(?:SELECT\s*)(.*?)(?:,\s*SUM.*)
to create a single capturing group.
The two (?:...) constructs are non-capturing groups.
The (.*?) is a non-greedy group, which will stop before the FIRST "SUM" instead of the last one.
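To illustrate the difference between the non-greedy and greedy forms, here is a small comparison sketch (the names lazy and greedy are just for illustration; the greedy pattern is the same expression with the ? removed, and the query string is the one from the question):

```python
import re

query_string = '''SELECT "a"."name", "a"."create_date", "a"."state", SUM("b"."cost") AS "amount", SUM("b"."cost") FILTER (WHERE "a"."state" = 'UNPAID') AS "paid", SUM("b"."cost") FILTER (WHERE "a"."state" = 'PAID') AS "unpaid" FROM "maintenance"'''

# Non-greedy: (.*?) grows one character at a time and stops at the first ", SUM".
lazy = re.search(r'(?:SELECT\s*)(.*?)(?:,\s*SUM.*)', query_string)

# Greedy: (.*) grabs everything, then backtracks only as far as the last ", SUM".
greedy = re.search(r'(?:SELECT\s*)(.*)(?:,\s*SUM.*)', query_string)

print(lazy.group(1))    # only the plain column names
print(greedy.group(1))  # also includes the earlier SUM(...) expressions
```

Since only one group captures, group(1) is the value of interest in both cases; the non-capturing groups merely anchor the match.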
You can use:
import re
sql = '''SELECT "a"."name", "a"."create_date", "a"."state", SUM("b"."cost") AS "amount", SUM("b"."cost") FILTER (WHERE "a"."state" = 'UNPAID') AS "paid", SUM("b"."cost") FILTER (WHERE "a"."state" = 'PAID') AS "unpaid" FROM "maintenance"'''
res = re.search(r'SELECT (.+?), SUM', sql)
print(res.group(1))
Output:
"a"."name", "a"."create_date", "a"."state"