Regex in XML using PHP: find certain value of a certain xml tag

I am trying to get the value of a certain attribute in a certain XML tag with a regex but can't get it right. Maybe someone has an idea how to do it?

The XML looks like this:

<OTA_PingRQ>
  <Errors>
    <Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">Authentication : login failed</Error>
  </Errors>
</OTA_PingRQ>

I'd like to match only the value of the ShortText attribute inside the Error tag. In the end it should give me "Authentication refused" back.

What I've tried so far is a lookbehind combined with a lookahead, but the lookbehind doesn't let me use quantifiers of non-fixed width. Something like (?<=<Error.).*?(?=>) . Can someone tell me how to match only the value of ShortText (inside the Error tag)?

Here is a solution in PHP. The pattern itself should carry over largely unchanged to other PCRE-style regex engines.

Here is the regex you're looking for:

#\<Error Code\=\"[0-9]+\" Type\=\"[0-9]+\" Status\=\"NotProcessed\" ShortText\=\"([a-z 0-9]+)\"\>#is

Concrete PHP usage:

$yourOriginalString = '
<OTA_PingRQ>
  <Errors>
    <Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">Authentication : login failed</Error>
  </Errors>
</OTA_PingRQ>' ;

// Group 1 captures the ShortText value; preg_match_all stores every match in $result.
preg_match_all('#\<Error Code\=\"[0-9]+\" Type\=\"[0-9]+\" Status\=\"NotProcessed\" ShortText\=\"([a-z 0-9]+)\"\>#is', $yourOriginalString, $result);
print_r($result);

preg_match_all fills $result, and print_r then shows an array containing:

   [0] => Array
        (
            [0] => <Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">
        )

    [1] => Array
        (
            [0] => Authentication refused
        )

[0] holds the full match. [1] lists what the first capturing group matched; each () pair in your regex gets its own index.

Some explanation of the regex:
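If only the single ShortText value is needed, a minimal sketch along the following lines should work. It assumes the same pattern as above, written without the optional backslashes, and uses preg_match instead of preg_match_all. The variable names $xml, $pattern and $shortText are illustrative and not part of the original answer.

```php
<?php
// Sample input taken from the question.
$xml = '<OTA_PingRQ>
  <Errors>
    <Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">Authentication : login failed</Error>
  </Errors>
</OTA_PingRQ>';

// Same pattern as in the answer, minus the escapes that PCRE does not require.
$pattern = '#<Error Code="[0-9]+" Type="[0-9]+" Status="NotProcessed" ShortText="([a-z 0-9]+)">#i';

// preg_match stops at the first match and returns 1 on success, 0 on no match.
$shortText = null;
if (preg_match($pattern, $xml, $m) === 1) {
    $shortText = $m[1]; // group 1 holds the attribute value
}

echo $shortText, PHP_EOL;
```

With preg_match the result array is flat, so the value sits directly in $m[1] rather than in $result[1][0].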

Type\=\"[0-9]+\"

This assumes the value of "Type" can vary but always consists of one or more digits.

 ShortText\=\"([a-z 0-9]+)\"

This captures a string made of letters, digits and spaces. If you need to allow other characters, extend the character class, for example:

[a-z 0-9\!\-]+

which also matches ! and -.
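Listing every allowed character can become brittle as the messages change. One possible alternative, sketched here as an assumption and not as the answer's own method, is to accept everything up to the closing quote with [^"]* and to skip the other attributes with [^>]*, so that attribute order and values no longer matter. The helper name extractShortText is hypothetical.

```php
<?php
// Hypothetical helper: returns the ShortText value of the first <Error> tag, or null.
function extractShortText(string $xml): ?string
{
    // <Error\b       the opening tag name (\b keeps <Errors> from matching)
    // [^>]*?         any other attributes, without leaving the tag
    // \bShortText="  the attribute we are after
    // ([^"]*)        its value: everything up to the closing quote
    $pattern = '#<Error\b[^>]*?\bShortText="([^"]*)"#i';

    return preg_match($pattern, $xml, $m) === 1 ? $m[1] : null;
}

// Attributes in a different order, and a value containing - and !.
$sample = '<Errors><Error Type="4" ShortText="Login failed - retry!" Code="101">x</Error></Errors>';
echo extractShortText($sample), PHP_EOL;
```

This is still only a regex sketch: it does not decode XML entities, and a > inside an earlier attribute value would stop [^>]* too early.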

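#is

The # is the pattern delimiter and the letters after the closing # are flags: i makes the match case-insensitive, and s lets . also match line breaks (this pattern contains no ., so s has no effect here).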
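The effect of the i flag can be checked with a short sketch; the variable names are illustrative. Without i, the class [a-z 0-9] cannot match the capital A in "Authentication", so the pattern finds nothing.

```php
<?php
$tag = '<Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">';

// Identical patterns except for the trailing i flag.
$caseSensitive   = '#ShortText="([a-z 0-9]+)"#';
$caseInsensitive = '#ShortText="([a-z 0-9]+)"#i';

// preg_match returns 1 when the pattern matches and 0 when it does not.
$withoutFlag = preg_match($caseSensitive, $tag);
$withFlag    = preg_match($caseInsensitive, $tag, $m);

var_dump($withoutFlag); // int(0): "A" is not in [a-z 0-9]
var_dump($withFlag);    // int(1)
echo $m[1], PHP_EOL;    // Authentication refused
```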

