I'm trying to parse a partial URL path into 3 groups.
eg
I've tried the following (and other slight variations of the same) but just can't get one pattern to suit all the possible inputs.
^/?(Jobs|Docs)/(.*)/(.+\..+)?$
- Works for 1, not 2 or 3
^/?(Jobs|Docs)/(.*)/?(.+\..+)?$
- Works for 2, not 1 or 3
For info:
Regex : \/?(Jobs|Docs)(?:\/(.+)(?=\/|$))?(?:\/?([^\.]+\.[a-z]+))?
Output :
Match 1
Full match 0-48 `/Jobs/STU0001/Folder1/Sub Folder A/File Name.txt`
Group 1. 1-5 `Jobs`
Group 2. 6-34 `STU0001/Folder1/Sub Folder A`
Group 3. 35-48 `File Name.txt`
Match 2
Full match 49-71 `/Docs/Another File.doc`
Group 1. 50-54 `Docs`
Group 3. 55-71 `Another File.doc`
Match 3
Full match 72-86 `/Docs/Folder 2`
Group 1. 73-77 `Docs`
Group 2. 78-86 `Folder 2`
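To try this outside a regex tester, here is a minimal Python sketch that scans the same three paths as one newline-separated text, which is how the output above appears to have been produced. The names `PATTERN`, `SAMPLE_TEXT` and `find_groups` are made up for illustration, and the pattern is assumed to use single backslashes and `[a-z]` for the extension.

```python
import re

# Pattern from the answer above (assumed: single backslashes, [a-z] extension).
PATTERN = re.compile(r"\/?(Jobs|Docs)(?:\/(.+)(?=\/|$))?(?:\/?([^\.]+\.[a-z]+))?")

# The three sample paths, joined into one text like the tester input.
SAMPLE_TEXT = "\n".join([
    "/Jobs/STU0001/Folder1/Sub Folder A/File Name.txt",
    "/Docs/Another File.doc",
    "/Docs/Folder 2",
])


def find_groups(text):
    """Return the (root, folders, filename) tuple of every match in text."""
    return [m.groups() for m in PATTERN.finditer(text)]


for groups in find_groups(SAMPLE_TEXT):
    print(groups)
```

One thing worth checking before relying on it: the lookahead `(?=\/|$)` only separates the file name because `$` does not match in the middle of the text here. If a path is matched on its own, `$` succeeds right after the greedy `.+`, so for the first path the whole remainder ends up in group 2 and group 3 stays empty.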
Another one:
^/(Docs|Jobs)(?:/([^.\n]*))?(?:/([^/\n]+\.[^/\n]+))?$
Broken apart:
^
Start of line
/
Initial slash
(Docs|Jobs)
Captures the first directory
(?:/([^.\n]*))?
Matches a slash and captures the folder part.
(?:/([^/\n]+\.[^/\n]+))?
Matches a slash and captures the filename part.
$
End of string
The directory part can contain anything except periods and line feeds. Slashes are allowed, so nested folders are captured together as one group.
The filename part must contain three parts - 1) a filename not containing slashes or line feeds, 2) a period, and 3) an extension not containing slashes or line feeds.
Both are optional.
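A quick way to check this pattern is to run it against one path at a time. The following is only a sketch in Python; `PATH_RE` and `split_path` are hypothetical names, and the pattern is the one from this answer, unchanged.

```python
import re

# Pattern from this answer: root folder, optional folder part, optional file name.
PATH_RE = re.compile(r"^/(Docs|Jobs)(?:/([^.\n]*))?(?:/([^/\n]+\.[^/\n]+))?$")


def split_path(path):
    """Return (root, folders, filename) for a single path, or None if it does not match."""
    m = PATH_RE.match(path)
    return m.groups() if m else None


for sample in (
    "/Jobs/STU0001/Folder1/Sub Folder A/File Name.txt",
    "/Docs/Another File.doc",
    "/Docs/Folder 2",
):
    print(sample, "->", split_path(sample))
```

Groups that do not take part in the match come back as `None`. Because the folder part excludes periods, a path such as `/Jobs/v1.2/File.txt` is not matched at all, so `split_path` returns `None` for it.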