
How to extract a portion of a string in PHP

I am using preg_match() to extract part of a string.

$str = "<aa>Let's find the stuff qwe in between <id>12345</id> these two previous brackets</h>";

$do = preg_match("/qwe(.*)12345/", $str, $matches);

This works just fine and gives the following result:

$matches[0] = "qwe in between <id>12345"
$matches[1] = " in between <id>"

But when I use the same logic to extract from the following string, it fails:

<text>
  <src><![CDATA[<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Arial" SIZE="36" COLOR="#999999" LETTERSPACING="0" KERNING="0">r1 text 1  </FONT></P></TEXTFORMAT>]]></src>
  <width>45%</width>
  <height>12%</height>
  <left>30.416666666666668%</left>
  <top>3.0416666666666665%</top>
  <begin>2s</begin>
  <dur>10s</dur>
  <transIn>fadeIn</transIn>
  <transOut>fadeOut</transOut>
  <id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>

I want to extract the string from

r1text1

to

</id>

The regular expression I currently have is:

preg_match('/r1text1(.*)</id\>/', $metadata], $matches); 

where $metadata is the above string.

For some reason, $matches does not contain anything. How do I do it? Thanks in advance.

If you want to extract the text, you will probably want to use preg_match. The following might work:

preg_match('#<P[^>]*><FONT[^>]*>(.*</id>)#s', $string, $matches);

Whatever gets matched in the parentheses can be found later in the $matches array. In this case, that is everything between a <P> tag followed by a <FONT> tag and </id>, including the latter.

The above regex is untested but might give you a general idea of how to do it. Adapt it if your needs are a bit different :)

Even though I don't know why you would match a regex against an incomplete XML fragment (starting inside a <![CDATA[ section and ending right before the closing XML tag </id>), you do have three obvious problems with your regex:

  1. As Amri said, you have to escape the / character in the closing XML tag because you use / as the pattern delimiter. By the way, you don't have to escape the > character. That gives you: '/r1text1(.*)<\\/id>/' Alternatively, you can change the pattern delimiter, for example to #: '#r1text1(.*)</id>#' (I will use the first pattern to develop the expression further).

  2. As Rich Adams already said, the text in your example data is "r1_text_1" (where _ stands for a space character), but you match against '/r1text1(.*)<\\/id>/'. You have to include the spaces in your regex or allow for an uncertain number of spaces, such as '/r1(?:\\s*)text(?:\\s*)1(.*)<\\/id>/' (the ?: is the syntax for non-capturing subpatterns).

  3. The . (dot) in your regex does not match newlines by default. You have to add the s (PCRE_DOTALL) pattern modifier to let the . (dot) match newlines as well: '/r1(?:\\s*)text(?:\\s*)1(.*)<\\/id>/s'
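Putting the three fixes together, a minimal sketch could look like the following. The $metadata value here is a shortened stand-in for the string in the question (an assumption made to keep the example small), and \s* is written without the non-capturing group, which behaves the same way.

```php
<?php
// Shortened stand-in for the $metadata string from the question (assumption).
$metadata = <<<'XML'
<text>
  <src><![CDATA[<P ALIGN="LEFT"><FONT FACE="Arial">r1 text 1  </FONT></P>]]></src>
  <width>45%</width>
  <id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
XML;

// Fix 1: \/ escapes the slash, because / is the pattern delimiter.
// Fix 2: \s* tolerates the spaces inside "r1 text 1".
// Fix 3: the trailing s modifier (PCRE_DOTALL) lets . match newlines.
$pattern = '/r1\s*text\s*1(.*)<\/id>/s';

if (preg_match($pattern, $metadata, $matches) === 1) {
    // $matches[1] holds everything after "r1 text 1" up to, but not including, </id>.
    echo $matches[1], PHP_EOL;
} else {
    echo "No match", PHP_EOL;
}
```

Dropping any one of the three fixes should make preg_match() fail on this input, which is a quick way to check each point separately.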

You probably need to parse your string/file and extract the value between the FONT tags. Then insert the value into the id tag.

Try googling for PHP parsing.
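As one possible reading of the parsing suggestion, here is a minimal sketch that uses PHP's SimpleXML extension instead of a regex. It assumes the extension is enabled and that $metadata is a complete, well-formed <text> element; the sample below is a shortened copy of the one in the question, and the variable names $label and $id are illustrative only.

```php
<?php
// Shortened copy of the <text> element from the question (assumption).
$metadata = <<<'XML'
<text>
  <src><![CDATA[<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Arial" SIZE="36">r1 text 1  </FONT></P></TEXTFORMAT>]]></src>
  <width>45%</width>
  <id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
XML;

$xml = simplexml_load_string($metadata);
if ($xml === false) {
    exit("Could not parse the XML" . PHP_EOL);
}

// Casting an element to string returns its text content,
// which includes the CDATA payload of <src>.
$markup = (string) $xml->src;

// The payload is markup itself; strip_tags() leaves only the text
// that sat inside the FONT tag, and trim() drops the trailing spaces.
$label = trim(strip_tags($markup));
$id = (string) $xml->id;

echo $label, PHP_EOL;
echo $id, PHP_EOL;
```

This avoids depending on the exact text or line breaks between the FONT content and the id element, at the cost of requiring the whole element rather than a fragment.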

Try this:

preg_match('/r1text1(.*)<\/id>/', $metadata, $matches);

You are using / as the pattern delimiter, but your pattern also contains a / in </id>. You can use \ as the escape character.

In the sample you have "r1 text 1 ", yet your regular expression has "r1text1". The regular expression doesn't match because there are spaces in the string you are trying to match it against. You should include the spaces in the regular expression.
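If the start and end markers are plain text, one way to avoid escaping mistakes is to let preg_quote() build the pattern. The sketch below is only an illustration: extractBetween() is a hypothetical helper name, $sample is made-up data, and it uses a lazy (.*?) so that it stops at the first end marker, which differs from the greedy (.*) in the question.

```php
<?php
/**
 * Hypothetical helper: returns the text between two literal markers,
 * or null when either marker is missing.
 */
function extractBetween(string $haystack, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters and, thanks to the second
    // argument, the / delimiter too, so "</id>" needs no manual escaping.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

    return preg_match($pattern, $haystack, $m) === 1 ? $m[1] : null;
}

// Made-up sample with a newline between the two markers.
$sample = "<FONT>r1 text 1  </FONT>\n<id>ABC123</id>";

var_dump(extractBetween($sample, 'r1 text 1', '</id>')); // marker with spaces: found
var_dump(extractBetween($sample, 'r1text1', '</id>'));   // marker without spaces: null
```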


 