Parse directories from a string

First, I have spent three hours trying to solve this. Also, please don't suggest not using regex. I appreciate other comments and could easily use other methods, but I am practicing regex as much as possible.

I am using VB.Net.

Example string:

"Hello world this is a string C:\Example\Test E:\AnotherExample"

Pattern:

"[A-Z]{1}:.+?[^ ]*"

Works fine. However, what if the directory name contains white space? I have tried to match all strings that start with one uppercase letter followed by a colon and then anything else. The match needs to run up until a whitespace, one uppercase letter, and a colon, and then the same sequence should be matched again.

Hope I have made sense.

How about "[A-Z]{1}:((?![A-Z]{1}:).)*", which should stop before the next drive letter and colon?

That "?!" is a "negative lookaround", or more precisely a "zero-width negative lookahead", which, according to the question "Regular expression to match a line that doesn't contain a word?", is the way to get around the lack of inverse matching in regexes.
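As a rough sketch of how the lookahead behaves, the pattern can be tried on a variant of the example string with a space inside a directory name. The snippet uses Python's re module instead of VB.Net purely for illustration; the (?!...) syntax is also supported by .NET's Regex class. The helper name find_paths and the sample string are assumptions made for this example, not part of the original question.

```python
import re

# Pattern from the answer: a drive letter and a colon, then any characters
# up to (but not including) the next "drive letter + colon" pair.
DRIVE_PATH = re.compile(r"[A-Z]:((?![A-Z]:).)*")


def find_paths(text):
    """Return every drive-letter path found in text (hypothetical helper)."""
    # The pattern also consumes the whitespace that separates two paths,
    # so strip() is used to trim it from each match.
    return [m.group(0).strip() for m in DRIVE_PATH.finditer(text)]


# Assumed sample: the question's string with a space added to one directory.
sample = r"Hello world this is a string C:\Example\My Test E:\AnotherExample"
print(find_paths(sample))
```

Note that the raw match for the first path ends with the separating space, because the lookahead only stops the match at the next drive letter itself. Trimming the result is one simple way to deal with that.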

Not to be too picky, but most filesystems disallow a small number of characters (like <>/\:?"), so a correct pattern for a file path would be more like [A-Z]:\\((?![A-Z]{1}:)[^<>/:?"])*.
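A minimal sketch of the stricter pattern, again in Python for illustration only and assuming the pattern is meant to require a backslash right after the drive letter and colon. The names STRICT_PATH and find_strict_paths and the sample text are hypothetical.

```python
import re

# Stricter variant: the colon must be followed by a backslash, and the
# characters < > / : ? " are not allowed inside the path.
STRICT_PATH = re.compile(r'[A-Z]:\\((?![A-Z]:)[^<>/:?"])*')


def find_strict_paths(text):
    """Return paths matched by the stricter pattern (hypothetical helper)."""
    return [m.group(0).strip() for m in STRICT_PATH.finditer(text)]


# Assumed sample: "A:" is not followed by a backslash, so it is skipped,
# and the second path ends at the disallowed "<" character.
sample_strict = r"Status A: ok, paths C:\Program Files\App D:\Backups<old>"
print(find_strict_paths(sample_strict))
```

Requiring the backslash keeps a stray "letter + colon" in ordinary prose from being treated as a drive, which the looser pattern would accept.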

The other important point that has been raised is how you expect to parse input like "hello path is c:\folder\file.extension this is not part of the path:P". This is a problem you commonly run into when you start trying to parse without specifying the allowed range of inputs, or the grammar that the parser accepts. This particular problem seems pretty ad hoc, so I don't really expect you to come up with a grammar or to define how particular messages are encoded. But the next time you approach a parsing problem, see if you can first define what messages are allowed and what they mean (syntax and semantics). I think you'll find that once you've defined the structure of allowed messages, parsing can be almost trivial.
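To make the ambiguity concrete, here is a small sketch that runs the first lookahead pattern over that input, once as written and once case-insensitively. It is in Python for illustration only; the helper name extract and its ignore_case parameter are assumptions for this example.

```python
import re


def extract(text, ignore_case=False):
    """Run the lookahead pattern over text (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(r"[A-Z]:((?![A-Z]:).)*", flags)
    return [m.group(0) for m in pattern.finditer(text)]


tricky = r"hello path is c:\folder\file.extension this is not part of the path:P"

# Case-sensitive: the lowercase drive letter is not recognised at all.
print(extract(tricky))

# Case-insensitive: the trailing prose is swallowed into the path, and the
# "h:P" at the end of "path:P" is reported as a second "path".
print(extract(tricky, ignore_case=True))
```

Neither result is what a reader would call the path, which is the point being made: without a rule for where a path ends, no pattern can decide it reliably.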
