
RegEx search and replace in pattern

I need to search for a pattern that can change from document to document but always follows the same structure. The pattern will always be 9 digits followed by 3 letters. It will sometimes have a space between them and sometimes not. Here is an example of text to search through:

  1. 009244828 FLE
  2. MID021087275
  3. 006386476JJK
  4. 002973303 JJK
  5. MNS 000110924
  6. MNS000110924
  7. 009244828PSC
  8. 001915657SCR

My current regex looks like this: .+?(?=(JJK|FLE|PSC|SCR)) . This returns lines 1, 3, 4, 7 and 8 like this:

  1. 009244828\s
  2. 006386476
  3. 002973303\s
  4. 009244828
  5. 001915657

as it should, but it does not return the letters. I need to return these lines with the letters and with the space removed if it is there. My returned result should look like this:

  1. 009244828FLE
  2. 006386476JJK
  3. 002973303JJK
  4. 009244828PSC
  5. 001915657SCR

Let's compose the regex you're looking for step by step.

What you need to match is this:

  • 9 decimal digits \d{9}
  • An optional whitespace character \s?
  • 3 uppercase letters [A-Z]{3} (use [a-zA-Z]{3} if the letters may be lowercase)

Putting it all together, this regex does almost what you want:

\d{9}\s?[A-Z]{3}
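As a rough illustration, this is how the pattern could be tried out in Python. The language, the `lines` list and the variable names are assumptions made for this sketch; the question does not say which tool is being used. Each line is searched on its own because \s also matches a line break, so on the whole text the pattern could pair digits at the end of one line with letters at the start of the next.

```python
import re

# Sample lines from the question (assumed to be available as a list of strings).
lines = [
    "009244828 FLE",
    "MID021087275",
    "006386476JJK",
    "002973303 JJK",
    "MNS 000110924",
    "MNS000110924",
    "009244828PSC",
    "001915657SCR",
]

# 9 digits, an optional whitespace character, 3 uppercase letters.
pattern = re.compile(r"\d{9}\s?[A-Z]{3}")

# Search each line separately and keep the matched text.
matches = []
for line in lines:
    m = pattern.search(line)
    if m:
        matches.append(m.group(0))

print(matches)
```

The matches still contain the space wherever the input had one, which is the "almost" part.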

I said "almost" because it doesn't let you get rid of the space between the digits and the letters. To do that, you need to put the digits and the letters into capturing groups. After that, you can simply concatenate the captured substrings (with a replacement expression along the lines of $1$2 or \1\2 ) to get exactly what you want.

(\d{9})\s?([A-Z]{3})
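A minimal sketch of the capture-and-concatenate idea, again assuming Python (where the replacement syntax is \1\2). The helper name `normalize` and the `lines` list are hypothetical and only serve the example.

```python
import re

# Group 1 captures the digits, group 2 the letters; the optional
# whitespace between them is matched but not captured.
PATTERN = re.compile(r"(\d{9})\s?([A-Z]{3})")

def normalize(line):
    """Return digits + letters without the separating space, or None if no match."""
    m = PATTERN.search(line)
    return m.group(1) + m.group(2) if m else None

lines = [
    "009244828 FLE",
    "MID021087275",
    "006386476JJK",
    "002973303 JJK",
    "MNS 000110924",
    "MNS000110924",
    "009244828PSC",
    "001915657SCR",
]

results = [r for r in (normalize(line) for line in lines) if r is not None]
print(results)

# The same concatenation written as a replacement expression.
replaced = PATTERN.sub(r"\1\2", "002973303 JJK")
print(replaced)
```

Using `search` plus `group` returns only the matched part, while `sub` rewrites the match in place and leaves the rest of the string untouched; which one fits depends on whether the surrounding text should be kept.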

If you want to make sure each digit-and-letter group is on its own line, simply wrap the entire regex in ^ and $ , then run it in a mode where these two characters match the beginning/end of a line instead of the beginning/end of the entire string (usually called multiline mode).

^(\d{9})\s?([A-Z]{3})$
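In Python, the mode in question is the `re.MULTILINE` flag. The sketch below assumes the whole document is held in one string named `text` (a name chosen for this example) and collects the cleaned-up value for every line that consists solely of the digit-and-letter group.

```python
import re

# The sample input as one multi-line string (assumed for this sketch).
text = "\n".join([
    "009244828 FLE",
    "MID021087275",
    "006386476JJK",
    "002973303 JJK",
    "MNS 000110924",
    "MNS000110924",
    "009244828PSC",
    "001915657SCR",
])

# With re.MULTILINE, ^ and $ match at the start and end of every line,
# not only at the start and end of the whole string.
anchored = re.compile(r"^(\d{9})\s?([A-Z]{3})$", re.MULTILINE)

# Join the two captured groups of every match, dropping the optional space.
cleaned = [m.group(1) + m.group(2) for m in anchored.finditer(text)]
print(cleaned)
```

Lines that start with the letters, such as MID021087275, are skipped because ^ requires the 9 digits to begin the line.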
