
Non-repeating Regex Pattern - negative lookahead

I am attempting to parse a string of dimensions with regex in Java and return only the required parts of it.

The ideal String would be: number x number.

Anything not in this format can be ignored and should return null.

Some of the input Strings include the following, though.

  • 123x 132 sqft
  • 200 sq.ft. x 310 sq.ft.
  • 404X931X1007X1140
  • .772 Acres
  • 680 and 3209.05
  • 0.772 AC
  • approx 255 by 640
  • 111'X301'
  • approx.2 acre

My current regex solution is this:

"(\\d+(?:\\.\\d*)?)[^\\dxX]*(?:[xX]| and |by|\\*)[^\\dxX]*(\\d+(?:\\.\\d*)?)"

and I return match.group(1) + "x" + match.group(2).

The problem I am left with is repeating inputs like "404X931X1007X1140". This should also return null, since it's an irregular shape, but instead it returns 404x931.
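To make the problem easy to reproduce, here is a minimal sketch that wraps the pattern above in a helper. The class name CurrentRegexDemo and the method parse are hypothetical names used only for illustration; the pattern itself is copied unchanged from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CurrentRegexDemo {
    // The pattern from the question, unchanged.
    static final Pattern CURRENT = Pattern.compile(
        "(\\d+(?:\\.\\d*)?)[^\\dxX]*(?:[xX]| and |by|\\*)[^\\dxX]*(\\d+(?:\\.\\d*)?)");

    // Returns "AxB" built from the first match, or null when nothing matches.
    static String parse(String input) {
        Matcher m = CURRENT.matcher(input);
        return m.find() ? m.group(1) + "x" + m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(parse("123x 132 sqft"));      // 123x132
        System.out.println(parse(".772 Acres"));         // null
        System.out.println(parse("404X931X1007X1140"));  // 404x931 (unwanted)
    }
}
```

Because find() stops at the first pair it can match, the leading "404X931" is accepted and the rest of the string is never inspected.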

My question is: how would I make sure not to include these? My thought was to append a negative lookahead, but it fails to meet my expectations and returns 404x93 for some reason.

first expression + "\\D*(?!([xX]| and |by|\\*)\\d+)"
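One likely explanation for a truncated result such as 404x93 is backtracking: when the lookahead rejects the match ending at 931, the engine shortens the second group to 93, where the next character is a digit rather than a separator, so the negative lookahead passes. The sketch below assumes a variant with \D* placed inside the lookahead; that placement is an assumption, chosen because it reproduces the 404x93 output, and may not be the exact expression that was run. The class name BacktrackingDemo is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BacktrackingDemo {
    // Question's pattern plus a trailing negative lookahead.
    // ASSUMPTION: \D* sits inside the lookahead in this variant.
    static final Pattern WITH_LOOKAHEAD = Pattern.compile(
        "(\\d+(?:\\.\\d*)?)[^\\dxX]*(?:[xX]| and |by|\\*)[^\\dxX]*(\\d+(?:\\.\\d*)?)"
        + "(?!\\D*([xX]| and |by|\\*)\\d+)");

    // Returns "AxB" built from the first match, or null when nothing matches.
    static String parse(String input) {
        Matcher m = WITH_LOOKAHEAD.matcher(input);
        return m.find() ? m.group(1) + "x" + m.group(2) : null;
    }

    public static void main(String[] args) {
        // After "931" the lookahead sees "X1007" and rejects the match.
        // The engine then retries with "93"; the text ahead is now "1X1007",
        // which does not start with a separator, so the lookahead passes.
        System.out.println(parse("404X931X1007X1140")); // 404x93
    }
}
```

A lookahead placed only at the end constrains where the match may stop, not where it may start or how many digits the last group keeps, so the engine is free to find a shorter match that satisfies it.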

In case anyone else is looking for this: I ended up figuring out a solution that worked. I would have just used \b at the end, but it didn't work for * characters. The {0,30} in the lookbehind is there because Java won't let me use unbounded quantifiers in a lookbehind. It's kind of a mess to look at, though.

(?<!\\d(?:[xX]| and |by|\\*).{0,30})\\b(\\d+(?:,\\d+)*(?:\\.\\d+)?)[^\\dxX]*(?:[xX]| and |by|\\*)[^\\dxX]*(\\d+(?:,\\d+)*(?:\\.\\d+)?)(?!.*(?:[xX]| and |by|\\*)\\D*\\d+)
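As a rough check, the expression above can be run against the sample inputs from the question. This is a minimal sketch: the class name DimensionParser and the method parse are hypothetical, and the pattern is the one above, split across lines only for readability.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DimensionParser {
    static final String SEP = "(?:[xX]| and |by|\\*)";
    static final String NUM = "(\\d+(?:,\\d+)*(?:\\.\\d+)?)";

    static final Pattern DIMENSIONS = Pattern.compile(
        // Reject a start that follows "digit + separator" within 30 chars.
        "(?<!\\d" + SEP + ".{0,30})\\b"
        // First number, separator, second number.
        + NUM + "[^\\dxX]*" + SEP + "[^\\dxX]*" + NUM
        // Reject a match that is followed by another separator and number.
        + "(?!.*" + SEP + "\\D*\\d+)");

    // Returns "AxB" for a single pair of dimensions, otherwise null.
    static String parse(String input) {
        Matcher m = DIMENSIONS.matcher(input);
        return m.find() ? m.group(1) + "x" + m.group(2) : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "123x 132 sqft", "200 sq.ft. x 310 sq.ft.", "404X931X1007X1140",
            ".772 Acres", "680 and 3209.05", "0.772 AC",
            "approx 255 by 640", "111'X301'", "approx.2 acre"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + parse(s));
        }
    }
}
```

The lookbehind stops a match from starting in the middle of a chain, and the lookahead stops it from ending before the chain does, so "404X931X1007X1140" yields null while single pairs such as "111'X301'" still produce 111x301.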

