
PHP regex match multiple pieces

I am new to regex. I know the basics of pulling one substring out of a given string, but I am struggling to extract the multiple parts that I need. I am wondering if someone could help me with this simple example so that I can work my way on from there. Take this string:

LMJ won Neu. Zone - KEN #55 LEIGH vs LMJ #63 ONEIL

The parts in italics are the parts of the string that will change, and the parts in bold will stay the same in every string. The parts I need out are:

  1. The first team ID, which in this case is LMJ. This will always start the string and be 3 uppercase letters: ^[A-Z]{3} ?

  2. The Neu part, which could be one of 3 strings, Neu, Off or Def: [Neu|Off|Def] ?

  3. The second team part, which will always come after the word Zone - : [A-Z]{3} ?

  4. The numeric part of the string after the first #. This could be 1 or 2 digits: [0-9]{1,2} ?

  5. The third team part, same as 3 except that it will appear after vs: [A-Z]{3} ?

  6. Same as 4, the numeric part after the 2nd #: [0-9]{1,2} ?

I would like to put that all together into one regex. Is that possible?

Everything inside square brackets is a so-called character class: it matches only a single character. So [Neu|Off|Def] means: exactly one of the characters N, e, u, |, O, f or D (repetitions are ignored).

What you want is a capture group: (Neu|Off|Def)
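The difference can be checked directly with preg_match. This is only a small sketch; the variable names are made up for illustration, and the patterns are anchored with ^ and $ so that each one has to account for the whole test string.

```php
<?php
// Character class: matches ONE character out of N, e, u, |, O, f, D.
$classPattern = '/^[Neu|Off|Def]$/';
// Capture group with alternation: matches one of the three whole words.
$groupPattern = '/^(Neu|Off|Def)$/';

// A lone "|" is a member of the character class, so this matches.
$classMatchesPipe = preg_match($classPattern, '|');
// "Neu" is three characters long, but the class consumes only one.
$classMatchesWord = preg_match($classPattern, 'Neu');
// The group matches the whole word and stores it in $m[1].
$groupMatchesWord = preg_match($groupPattern, 'Off', $m);
// The group does not treat "|" as a literal character.
$groupMatchesPipe = preg_match($groupPattern, '|');

var_dump($classMatchesPipe, $classMatchesWord, $groupMatchesWord, $groupMatchesPipe);
var_dump($m[1]);
```

preg_match returns 1 on a match and 0 otherwise, so the two class checks and the two group checks should come out opposite to each other.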

Putting it together:

^([A-Z]{3}) won (Neu|Off|Def)\. Zone - ([A-Z]{3}) #([0-9]{1,2}) [A-Z]+ vs ([A-Z]{3}) #([0-9]{1,2}) [A-Z]+$

(This assumes you're not interested in the "LEIGH" and "ONEIL" parts, and that these are always in upper case letters.)
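In PHP the pattern needs delimiters and can be passed to preg_match. Here is a minimal sketch: the function name parseZoneLine and the array keys are hypothetical, and the sample line assumes exactly the spacing the pattern expects (single spaces, no space between # and the number).

```php
<?php
// Hypothetical helper: returns the six captured parts, or null if the
// line does not have the expected shape.
function parseZoneLine(string $line): ?array
{
    $pattern = '/^([A-Z]{3}) won (Neu|Off|Def)\. Zone - ([A-Z]{3}) #([0-9]{1,2}) [A-Z]+ vs ([A-Z]{3}) #([0-9]{1,2}) [A-Z]+$/';

    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }

    // $m[0] is the whole match; $m[1] to $m[6] are the groups in order.
    return [
        'winner'  => $m[1],
        'zone'    => $m[2],
        'team1'   => $m[3],
        'number1' => $m[4],
        'team2'   => $m[5],
        'number2' => $m[6],
    ];
}

print_r(parseZoneLine('LMJ won Neu. Zone - KEN #55 LEIGH vs LMJ #63 ONEIL'));
```

The captured numbers come back as strings; cast them with (int) if integers are needed.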

The regex should be something like:

'/([A-Z]{3})\ won\ (Neu|Off|Def)\.\ Zone\ -\ ([A-Z]{3})\ (\#[0-9]{1,2}\ \w+)\ vs\ ([A-Z]{3})\ (\#[0-9]{1,2}\ \w+)/' 

() are used for capturing the different parts.

This has not been tested properly.
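A quick check of that pattern against the sample string, as a sketch (the variable names are illustrative, and the sample line assumes single spaces with no space after each #):

```php
<?php
$pattern = '/([A-Z]{3})\ won\ (Neu|Off|Def)\.\ Zone\ -\ ([A-Z]{3})\ (\#[0-9]{1,2}\ \w+)\ vs\ ([A-Z]{3})\ (\#[0-9]{1,2}\ \w+)/';
$line = 'LMJ won Neu. Zone - KEN #55 LEIGH vs LMJ #63 ONEIL';

$groups = [];
if (preg_match($pattern, $line, $m) === 1) {
    // Drop $m[0] (the whole match) and keep only the captured groups.
    $groups = array_slice($m, 1);
}

print_r($groups);
```

The pattern does match, but groups 4 and 6 capture "#55 LEIGH" and "#63 ONEIL" as a whole rather than only the digits, so the numbers would still have to be extracted afterwards. The backslashes before the spaces and before # are harmless but not needed unless the x modifier is used, and because there are no ^ and $ anchors the pattern can also match in the middle of a longer string.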
