Regular Expression to match many coordinate formats
I am working on a regex that will match many different types of location coordinates. So far it matches about 90% of the formats:
([SNsn][\\s]*)?((?:[\\+-]?[0-9]*[\\.,][0-9]+)|(?:[\\+-]?[0-9]+))(?:(?:[^ms'′""″,\\.\\dNEWnew]?)|(?:[^ms'′""″,\\.\\dNEWnew]+((?:[\\+-]?[0-9]*[\\.,][0-9]+)|(?:[\\+-]?[0-9]+))(?:(?:[^ds°""″,\\.\\dNEWnew]?)|(?:[^ds°""″,\\.\\dNEWnew]+((?:[\\+-]?[0-9]*[\\.,][0-9]+)|(?:[\\+-]?[0-9]+))[^dm°'′,\\.\\dNEWnew]*))))([SNsn]?)[^\\dSNsnEWew]+([EWew][\\s]*)?((?:[\\+-]?[0-9]*[\\.,][0-9]+)|(?:[\\+-]?[0-9]+))(?:(?:[^ms'′""″,\\.\\dNEWnew]?)|(?:[^ms'′""″,\\.\\dNEWnew]+((?:[\\+-]?[0-9]*[\\.,][0-9]+)|(?:[\\+-]?[0-9]+))(?:(?:[^ds°""″,\\.\\dNEWnew]?)|(?:[^ds°""″,\\.\\dNEWnew]+((?:[\\+-]?[0-9]*[\\.,][0-9]+)|(?:[\\+-]?[0-9]+))[^dm°'′,\\.\\dNEWnew]*))))([EWew]?)
Testing the formats:
N 45° 55.732 W 122° 29.882
N 047° 38.938', W 122° 20.887'
40.123, -74.123
40.123° N 74.123° W
40° 7´ 22.8" N 74° 7´ 22.8" W
40° 7.38' , -74° 7.38'
N40°7'22.8, W74°7'22.8"
40°7'22.8"N, 74°7'22.8"W
40 7 22.8, -74 7 22.8
40.123 -74.123
40.123°,-74.123°
144442800, -266842800
40.123N74.123W
4007.38N7407.38W
40°7'22.8"N, 74°7'22.8"W
400722.8N740722.8W
N 40 7.38 W 74 7.38
40:7:23N,74:7:23W
40:7:22.8N 74:7:22.8W
40°7'23"N 74°7'23"W
40°7'23" -74°7'23"
40d 7' 23" N 74d 7' 23" W
40.123N 74.123W
40° 7.38, -74° 7.38
Testing if it works: https://regexr.com/3ivu2
As you can see, there are issues with the spaces and commas that cause the regex to miss some of these formats.
I am trying to match the coordinate strings so that they can be highlighted in my iOS app and the user can tap them.
What can I do to update the regex and fix the matching issues?
I'm sure there are many ways to go about this. Since you haven't specified a regex engine or programming language, I'll post one pattern that works in PCRE and one that should work in most engines. The PCRE regex is much easier to understand than the non-PCRE regex, but both use the exact same logic.
The patterns defined below match each string you've presented in your question and properly separate each part of the coordinate (x, y).
This method uses the DEFINE construct to pre-define patterns. The beauty of this construct is that you can define reusable parts of your regex in one location, so you can edit most of the regex just by editing these subpatterns.
See regex in use here
(?(DEFINE)
(?<ns>[ns])
(?<ew>[ew])
(?<d>[°´’'"d:])
(?<n>[+-]?\d+(?:\.\d+)?)
)
(
(?&ns)?
(?:\ ?(?&n)(?&d)?){1,3}
\ ?(?&ns)?
)
\ ?,?\ ?
(
(?&ew)?
(?:\ ?(?&n)(?&d)?){1,3}
\ ?(?&ew)?
)
Flags: gix
See regex in use here
(
[ns]?
(?:\ ?[+-]?\d+(?:\.\d+)?[°´’'"d:]?){1,3}
\ ?[ns]?
)
\ ?,?\ ?
(
[ew]?
(?:\ ?[+-]?\d+(?:\.\d+)?[°´’'"d:]?){1,3}
\ ?[ew]?
)
Flags: gix
Some engines don't have the x flag. For those engines you can use the following one-liner (as seen here):
([ns]?(?: ?[+-]?\d+(?:\.\d+)?[°´’'"d:]?){1,3} ?[ns]?) ?,? ?([ew]?(?: ?[+-]?\d+(?:\.\d+)?[°´’'"d:]?){1,3} ?[ew]?)
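To try the one-liner outside an online tester, here is a minimal sketch in Python. It assumes the i flag is supplied as re.IGNORECASE, and the names COORD_RE and find_coordinates are made up for this illustration. It returns the character span of each match (useful for highlighting) together with the two captured parts.

```python
import re

# The one-liner from above, split across two string literals for readability.
# re.IGNORECASE stands in for the "i" flag.
COORD_RE = re.compile(
    r"""([ns]?(?: ?[+-]?\d+(?:\.\d+)?[°´’'"d:]?){1,3} ?[ns]?) ?,? ?"""
    r"""([ew]?(?: ?[+-]?\d+(?:\.\d+)?[°´’'"d:]?){1,3} ?[ew]?)""",
    re.IGNORECASE,
)


def find_coordinates(text):
    """Return (start, end, first_part, second_part) for each match in text.

    The capture groups can include a trailing space (the optional space
    before the hemisphere letter sits inside the group), so it is stripped.
    """
    return [
        (m.start(), m.end(), m.group(1).strip(), m.group(2).strip())
        for m in COORD_RE.finditer(text)
    ]


if __name__ == "__main__":
    for sample in ("40.123, -74.123", "N 45° 55.732 W 122° 29.882", "40.123N74.123W"):
        print(sample, "->", find_coordinates(sample))
```

Because the pattern accepts bare numbers, it will also match any two adjacent numbers in ordinary text, so some extra validation of the captured values is probably needed before treating a match as a coordinate.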
Since both patterns are essentially the same (the non-PCRE version is just an expanded version of the PCRE one), I'll explain the PCRE regex pattern since it's easier to grasp.
Note that the patterns that use x have escaped spaces since they would otherwise be ignored (x ignores whitespace within the pattern). The i flag allows us to match text regardless of case (i makes our pattern case-insensitive).
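The effect of the two flags can be checked with a small sketch. This one uses Python, where re.VERBOSE corresponds to x and re.IGNORECASE to i; the variable names are illustrative only.

```python
import re

FLAGS = re.VERBOSE | re.IGNORECASE

# In verbose mode an unescaped space in the pattern is dropped,
# so this pattern is effectively "N40".
unescaped = re.compile(r"N 40", FLAGS)

# Escaping the space keeps it as a literal character to match.
escaped = re.compile(r"N\ 40", FLAGS)

print(bool(unescaped.fullmatch("N 40")))  # False: the pattern has no space
print(bool(unescaped.fullmatch("n40")))   # True: case-insensitive "N40"
print(bool(escaped.fullmatch("n 40")))    # True: the escaped space is matched
```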
(?(DEFINE)...) The DEFINE group is completely ignored during matching. It gets treated like a variable definition (var name=value), and you can recall a specific pattern for use via its name.
  (?<ns>[ns]) The group ns matches any character in the set nsNS
  (?<ew>[ew]) The group ew matches any character in the set ewEW
  (?<d>[°´’'"d:]) The group d matches any character in the set °´’'"d:
  (?<n>[+-]?\d+(?:\.\d+)?) The group n matches any number with the following structure
    [+-]? Optionally match any character in the set +-
    \d+ Match one or more digits
    (?:\.\d+)? Optionally match a decimal point followed by one or more digits
The pattern is composed of 3 larger parts. The first and last are capture groups (the coordinates themselves) and the second is what separates the two.
First part:
  (?&ns)? Optionally match the group ns
  (?:\ ?(?&n)(?&d)?){1,3} Match [an optional space, followed by the group n, then optionally the group d] between one and three times
  \ ?(?&ns)? Optionally match a space, then optionally match the group ns
Second part:
  \ ?,?\ ? Match an optional space, comma and space (this separates each coordinate part)
Third part:
  Same as the first part, but with the group ns replaced by the group ew
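In an engine without DEFINE, a similar "define once, reuse everywhere" effect can be approximated by assembling the pattern from string fragments. The following is only a sketch in Python, whose built-in re module supports neither (?(DEFINE)...) nor (?&name) calls. The names NS, EW, D, N, part and PATTERN are hypothetical and chosen to mirror the groups described above.

```python
import re

# Reusable fragments, mirroring the ns, ew, d and n groups.
NS = r"[ns]"
EW = r"[ew]"
D = r"""[°´’'"d:]"""
N = r"[+-]?\d+(?:\.\d+)?"


def part(hemisphere):
    """Build one capture group: optional hemisphere letter, one to three
    numbers (each optionally followed by a delimiter), optional letter."""
    return rf"""
        (
          {hemisphere}?
          (?:\ ?{N}{D}?){{1,3}}
          \ ?{hemisphere}?
        )
    """


# First part, separator, third part. VERBOSE plays the role of the x flag
# and IGNORECASE the role of the i flag.
PATTERN = re.compile(
    part(NS) + r"\ ?,?\ ?" + part(EW),
    re.VERBOSE | re.IGNORECASE,
)

if __name__ == "__main__":
    for sample in ("40° 7.38, -74° 7.38", "40:7:23N,74:7:23W"):
        m = PATTERN.fullmatch(sample)
        print(sample, "->", m.groups() if m else None)
```

Editing a fragment such as D then changes both halves of the pattern at once, which is the same benefit the DEFINE construct gives in PCRE.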
This simplified regex literally matches all the patterns you've given:
^((?:[NW]? ?(?:[-\d.d]+[NW:°´’'",]?[ NW]?)+[, ]*)+[NW]?)$
I'm not an expert on coordinates, but you can modify it easily if I didn't take some specifics into account.