I need to extract an ID specified in URLs that have this structure:
https://trello.com/c/iGjJLqwr/1-test-project
In the above example, I want to extract:
iGjJLqwr
I need to use the regex in Zapier, which according to its documentation uses Python regex.
The following Python regex is a step in the right direction, but it still returns too much:
[^https://trello.com/c/][\w]+
It returns 3 matches:
Match 1
Full match 21-29 iGjJLqwr
Match 2
Full match 31-36 -test
Match 3
Full match 36-44 -project
I need to restrict the result to:
iGjJLqwr
The following regex returns an extra forward slash:
[^https://trello.com/c/]\w+/
Match 1
Full match 21-30 iGjJLqwr/
Square brackets [ ... ]
create a character set that matches any single one of the characters they contain. If a caret is added at the beginning, [^ ... ]
, the set is negated. The pattern does not treat the text within the brackets as one continuous string.
In other words, [aaabbc]
is equivalent to [abc]
(and even [cba]
).
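As a quick check of the point above, here is a minimal sketch. The sample word "cabbage" is only an illustration and is not from the question.

```python
import re

text = "cabbage"  # arbitrary sample text, chosen for illustration

# Duplicates and ordering inside the brackets do not matter:
# all three classes match the same single characters.
matches_a = re.findall(r"[aaabbc]", text)
matches_b = re.findall(r"[abc]", text)
matches_c = re.findall(r"[cba]", text)

print(matches_a)
print(matches_a == matches_b == matches_c)

# A negated class matches any single character NOT in the set.
negated = re.findall(r"[^abc]", text)
print(negated)
```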
If you just want to capture the first path element after https://trello.com/c/
in a group, you can use this pattern:
https://trello\.com/c/([^/]+).*
Demo: https://regex101.com/r/99FDJS/2
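In Python, the capture-group version could be used roughly like this. The variable names are my own; the URL is the one from the question.

```python
import re

url = "https://trello.com/c/iGjJLqwr/1-test-project"
pattern = r"https://trello\.com/c/([^/]+).*"

match = re.search(pattern, url)
if match:
    # group(0) is the whole match (here the full URL);
    # group(1) is only the part captured by the parentheses.
    print(match.group(0))
    print(match.group(1))
```

Note that the ID is in group 1, not in the full match, so this only helps if the tool lets you pick a group.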
If you want the pattern to only match this substring within the URL, you can use positive lookahead and lookbehind:
(?<=https://trello\.com/c/).+?(?=/.*)
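A short sketch of the lookaround version in Python, again using the URL from the question. Here the full match itself is the ID, so no group is needed. Python's re module requires a fixed-width lookbehind, which this one is.

```python
import re

url = "https://trello.com/c/iGjJLqwr/1-test-project"

# The lookbehind and lookahead assert context without consuming it,
# so only the text between them ends up in the match.
pattern = r"(?<=https://trello\.com/c/).+?(?=/.*)"

match = re.search(pattern, url)
print(match.group(0))
print(match.span())
```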
This will match the ID without the extra forward slash:
import re
string = 'https://trello.com/c/iGjJLqwr/1-test-project'
match = re.search(r'[^https://trello.com/c/]\w*(?=/)', string)
print(match.group(0))
iGjJLqwr
The (?=/)
asserts that the next character is a forward slash.
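One caveat, as far as I can tell: the bracketed part is still a character class, so this works for the example only because the ID does not begin with one of the characters listed in the class. The sketch below uses a made-up ID starting with "t" (my assumption, not a real card) to show how the first character can be dropped.

```python
import re

pattern = r'[^https://trello.com/c/]\w*(?=/)'

# The URL from the question: the ID starts with "i", which is not in the class.
ok = re.search(pattern, 'https://trello.com/c/iGjJLqwr/1-test-project')
print(ok.group(0))

# Hypothetical ID starting with "t", which IS in the class,
# so the match can only begin at the next character.
truncated = re.search(pattern, 'https://trello.com/c/tXyz1234/1-test-project')
print(truncated.group(0))
```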
In your pattern you use a character class which matches only one out of several characters. Starting with a ^
will make it a negated character class which matches any character that is not in the character class.
Since the character class is not followed by a quantifier, this [^https://trello.com/c/]
will match a single i
or -
and then \w+
will match a word character 1+ times.
That will give you the matches iGjJLqwr
, -test
and -project
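To see those three matches in Python, a minimal sketch with the pattern and URL from the question:

```python
import re

url = "https://trello.com/c/iGjJLqwr/1-test-project"
pattern = r"[^https://trello.com/c/][\w]+"

# Collect every non-overlapping match with its position.
matches = [(m.span(), m.group(0)) for m in re.finditer(pattern, url)]
for span, text in matches:
    print(span, text)
```

The "1" after the second slash is not in the class either, but it is followed by "-", which is not a word character, so no match starts there.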
I think you meant to match the ID in a capturing group:
^https://trello\.com/c/(\w+)
About the pattern:
^
Assert the start of the string
https://trello\.com/c/
Match https://trello.com/c/ literally
(\w+)
Capture in group 1, matching a word character 1+ times
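Wrapped in a small helper, that could look like the sketch below. The function name extract_card_id and the non-matching example URL are hypothetical, and how Zapier exposes group 1 is not covered here.

```python
import re

# Anchored pattern from above; the ID is captured in group 1.
CARD_URL = re.compile(r"^https://trello\.com/c/(\w+)")


def extract_card_id(url):
    """Return the ID from a Trello card URL, or None if the URL does not match."""
    match = CARD_URL.match(url)
    return match.group(1) if match else None


print(extract_card_id("https://trello.com/c/iGjJLqwr/1-test-project"))
print(extract_card_id("https://example.com/c/iGjJLqwr"))
```

Returning None for a non-matching URL avoids the AttributeError you would get from calling group() on a failed match.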