
Parse directories from a string

First, I have spent three hours trying to solve this. Also, please don't suggest not using regex. I appreciate other comments and could easily use other methods, but I am practicing regex as much as possible.

I am using VB.Net

Example string:

"Hello world this is a string C:\Example\Test E:\AnotherExample"

Pattern:

"[A-Z]{1}:.+?[^ ]*"

This works fine. However, what if the directory name contains whitespace? I have tried to match every substring that starts with one uppercase letter followed by a colon and then anything else. The match needs to run up to the next occurrence of whitespace, one uppercase letter, and a colon, and then the same sequence should be matched again.

I hope that makes sense.
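
The behaviour described in the question can be reproduced with a short sketch. It uses Python's `re` module purely as an assumption for illustration (the question targets VB.Net, whose `Regex` class should accept the same pattern text), and the function name and the second sample string are made up for this example.

```python
import re

# Pattern exactly as posted in the question.
ORIGINAL_PATTERN = r"[A-Z]{1}:.+?[^ ]*"

def find_paths_original(text):
    """Return every substring matched by the question's pattern."""
    return re.findall(ORIGINAL_PATTERN, text)

# The sample string from the question, and a variant with spaces in the names.
no_spaces = r"Hello world this is a string C:\Example\Test E:\AnotherExample"
with_spaces = r"Hello world this is a string C:\My Folder\Test E:\Another Example"

print(find_paths_original(no_spaces))    # both paths are matched in full
print(find_paths_original(with_spaces))  # each match is cut off at the first space
```

With the second string each match ends at the first space (`C:\My` and `E:\Another`), which is the problem being asked about.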

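How about "[A-Z]{1}:((?![A-Z]{1}:).)*", which should stop before the next drive letter and colon? Note that the match can include the whitespace that precedes the next drive letter, so trim each result.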

That "(?!...)" is a negative lookahead (a "zero-width negative lookahead assertion"), which, according to "Regular expression to match a line that doesn't contain a word?", is the way to get around the lack of inverse matching in regexes.
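
As a minimal sketch of how the lookahead version behaves, here it is run through Python's `re` module (an assumption for illustration; the same pattern text should work with .NET's `Regex`). The function name and sample string are hypothetical.

```python
import re

# Drive letter and colon, then any characters as long as the current
# position is not the start of another "drive letter + colon".
LOOKAHEAD_PATTERN = r"[A-Z]{1}:((?![A-Z]{1}:).)*"

def find_paths_lookahead(text):
    """Return each full match, with surrounding whitespace trimmed."""
    # group(0) is the whole match; the inner group only holds the last
    # character consumed by the repetition, so it is not useful here.
    return [m.group(0).strip() for m in re.finditer(LOOKAHEAD_PATTERN, text)]

sample = r"Hello world this is a string C:\My Folder\Test E:\Another Example"
print(find_paths_lookahead(sample))
```

The first raw match ends with the space that precedes `E:`, which is why the sketch strips each result. The last match runs to the end of the string, so any text after the final path would be captured as well.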

Not to be too picky, but most filesystems disallow a small number of characters in names (like <>/\:?"), so a more correct pattern for a file path would be something like [A-Z]:\\((?![A-Z]{1}:)[^<>/:?"])* .
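
A hedged sketch of the stricter pattern follows, again using Python's `re` module as an assumption for illustration. The function name and the sample strings are invented for this example; the excluded character set is the one given above, not a complete list for any particular filesystem.

```python
import re

# Drive letter, colon, one backslash, then characters that are neither
# in the excluded set nor the start of another "drive letter + colon".
STRICT_PATTERN = r'[A-Z]:\\((?![A-Z]{1}:)[^<>/:?"])*'

def find_paths_strict(text):
    """Return each full match, with surrounding whitespace trimmed."""
    return [m.group(0).strip() for m in re.finditer(STRICT_PATTERN, text)]

print(find_paths_strict(r"Hello world this is a string C:\My Folder\Test E:\Another Example"))
print(find_paths_strict(r"see <C:\Temp\log.txt> for details"))
print(find_paths_strict("C:relative"))  # no backslash after the colon, so no match
```

In the second call the match stops at `>` because that character is in the excluded set, so only `C:\Temp\log.txt` is returned.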

The other important point that has been raised is how you expect to parse input like "hello path is c:\folder\file.extension this is not part of the path:P". This is a problem you commonly run into when you start parsing without first specifying the allowed range of inputs, or the grammar that the parser accepts. This particular problem seems pretty ad hoc, so I don't really expect you to come up with a grammar or to define how particular messages are encoded. But the next time you approach a parsing problem, see if you can first define which messages are allowed and what they mean (syntax and semantics). I think you'll find that once you've defined the structure of allowed messages, parsing can be almost trivial.
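
To make the ambiguity concrete, the sketch below first shows the lookahead pattern swallowing trailing prose, then shows how one purely hypothetical input rule (every path must be wrapped in double quotes) removes the guesswork. The quoting rule, the function names, and the sample strings are all assumptions for illustration, and Python's `re` module stands in for whichever regex engine is actually in use.

```python
import re

# Free-form input: nothing marks where the path ends.
LOOKAHEAD_PATTERN = r"[A-Z]:((?![A-Z]:).)*"

# Hypothetical grammar: a path is a double-quoted string that starts
# with a drive letter, a colon, and a backslash.
QUOTED_PATTERN = r'"([A-Z]:\\[^"<>/:?]*)"'

def find_paths_free_form(text):
    """Match from each drive letter up to the next one or the end of the text."""
    return [m.group(0).strip() for m in re.finditer(LOOKAHEAD_PATTERN, text)]

def find_paths_quoted(text):
    """Return the contents of each quoted path (the single capture group)."""
    return re.findall(QUOTED_PATTERN, text)

free_form = r"hello path is C:\folder\file.extension this is not part of the path"
quoted = r'hello path is "C:\folder\file.extension" this is not part of the path'

print(find_paths_free_form(free_form))  # the trailing prose is captured too
print(find_paths_quoted(quoted))        # only the quoted path is returned
```

Once the input format says where a path ends, the pattern no longer has to guess, which is the sense in which parsing becomes almost trivial.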

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 