
Only replace part of regex match

I am looking to match all strings that contain the combination _[[ or ]]_.

That part I have working: (_\[\[)|(\]\]_)
Now comes the part where I need help: how do I replace only the underscore in these matches?

In other words, the string: "_[[2, verb//substantiv//adjektiv]]_" would result in the string: "[[2, verb//substantiv//adjektiv]]"

Appreciate any help I can get.

The solution you can use here is to simply match the entire pattern and replace it with the same pattern without the enclosing underscores ( _ ).

I created the example here btw.

Example:

$str = 'My _[[string to parse]]_ with some _[[examples]]_';
$parsed = preg_replace('/_\[\[([^(\]\]_)]*?)\]\]_/', "[[$1]]", $str);
echo $parsed;

Output:

My [[string to parse]] with some [[examples]]

Regex explained:

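  • _\[\[ matches the opening sequence: an underscore followed by two opening square brackets
  • ([^(\]\]_)]*?) captures the content between the opening and closing sequences. Note that [^...] is a negated character class, so it excludes the individual characters (, ], _ and ) rather than the sequence ]]_ as a whole; content containing any of those characters will not match
  • \]\]_ matches the closing sequence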

By matching the entire pattern and capturing the contents using a capture group you can replace the pattern entirely with a new substring that includes the contents from the matched pattern.

This is done in the second argument to preg_replace which is "[[$1]]"

$1 here stands for the captured group and contains its contents, which will be interpolated between two sets of square brackets.

Because the pattern also matches the underscores (_) but the replacement string does not include them, they are dropped from the result.
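One caveat with the pattern above: [^(\]\]_)] is a negated character class, so it rejects any content containing (, ], _ or ). A minimal sketch of a variant, assuming the content between the brackets may contain underscores or parentheses, uses a lazy .*? instead. The helper name stripMarkers is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: remove the underscores that wrap [[...]] markers.
function stripMarkers(string $text): string
{
    // (.*?) lazily captures everything up to the first "]]_" that follows,
    // so underscores and parentheses inside the brackets are allowed.
    return preg_replace('/_\[\[(.*?)\]\]_/', '[[$1]]', $text);
}

echo stripMarkers('My _[[snake_case (example)]]_ and _[[another]]_'), PHP_EOL;
// prints: My [[snake_case (example)]] and [[another]]
```

The lazy quantifier stops at the first ]]_ it can reach, so two markers on one line are still replaced separately.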

You could come up with:

$regex = '~
              _\[{2}  # look for an underscore and two open square brackets
              ([^]]+) # capture anything that is not a closing bracket
              \]{2}_  # followed by two closing square brackets and an underscore
          ~x';        # free space mode for this explanation
$string = "_[[2, verb//substantiv//adjektiv]]_";

# in the match replace [[(capture Group 1)]]
$new_string = preg_replace($regex, "[[$1]]", $string);
// new_string = [[2, verb//substantiv//adjektiv]]

See a demo on regex101.com as well as on ideone.com.

If you want to "match all strings that have the combination _[[ or ]]_", you can use this regex:

^(?=.*_\[\[).+|(?=.*\]\]_).+$

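^               // start of the string (applies to the first alternative only)
(?=.*_\[\[)     // lookahead: succeeds if _[[ occurs somewhere later on the line
.+              // match the rest of the line (only reached if the lookahead succeeded)
|               // OR: if the first alternative fails, try the second one
(?=.*\]\]_)     // lookahead: succeeds if ]]_ occurs somewhere later on the line
.+              // match the rest of the line
$               // end of the string (applies to the second alternative only)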

Demo here
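As a rough sketch of how this regex could be used in PHP, preg_grep can filter a list of strings down to those that match. The sample lines and the variable names below are made up for illustration; only the regex itself comes from the answer above.

```php
<?php
// Sample input lines (made up for illustration).
$lines = [
    'plain text',
    'has _[[opening only',
    'closing only]]_ here',
    '_[[both]]_',
];

// Same regex as above, wrapped in ~ delimiters for PCRE.
$pattern = '~^(?=.*_\[\[).+|(?=.*\]\]_).+$~';

// preg_grep returns the elements that match, preserving their keys,
// so array_values is used to renumber the result from 0.
$matching = array_values(preg_grep($pattern, $lines));

print_r($matching);
```

Only 'plain text' is filtered out, since it contains neither _[[ nor ]]_.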

I'm using this pattern just as an example.
The goal here is to use capturing parentheses. If the pattern matches, you will find the captured strings at index 1 of the $matches array.

Example:

    $pattern = '#_(\[\[[0-9]+\]\])_#';
    $result  = preg_match_all($pattern, '_[[22555]]_ BLA BLA _[[999]]_', $matches);

    if (is_int($result) && $result > 0) {
        var_dump($matches[1]);
    }

OUTPUT

array(2) {
  [0]=>
  string(9) "[[22555]]"
  [1]=>
  string(7) "[[999]]"
}

Try using your pattern to capture the brackets [] and replacing the matches with what you captured, something like this:

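// Capture the brackets, but leave the underscores outside the groups.
$pattern = '/_(\[\[)|(\]\])_/';
$test = '_[[2, verb//substantiv//adjektiv]]_';
// Put back whichever group matched; the underscore is dropped.
$replace = preg_replace($pattern, '$1$2', $test);
echo $replace; // [[2, verb//substantiv//adjektiv]]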

The dollar sign $ lets you refer back to what the parentheses captured. $1 is the first capture group, in this case (\[\[), the opening pair of brackets; $2 is the second group, the closing pair. Because the pattern uses the | operator, only one of the two groups takes part in any given match, and the other one is empty.
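A possible alternative, sketched here under the assumption that PCRE lookarounds are acceptable, is to match only the underscore and merely assert the brackets around it. Nothing is captured, so the replacement can be an empty string. The extra text in the sample string is made up to show that other underscores are left alone.

```php
<?php
$test = '_[[2, verb//substantiv//adjektiv]]_ but keep_this_one';

// _(?=\[\[)   : an underscore that is followed by "[["
// (?<=\]\])_  : an underscore that is preceded by "]]"
// The brackets are only asserted, never consumed, so nothing needs re-inserting.
$result = preg_replace('/_(?=\[\[)|(?<=\]\])_/', '', $test);

echo $result, PHP_EOL;
// prints: [[2, verb//substantiv//adjektiv]] but keep_this_one
```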
