
Regex in PHP for Extracting and Reformatting

I am working on a project with Facebook Instant Articles. I would like to automate the conversion with a PHP script, but I am having trouble reformatting this code:

[caption id="attachment_15737" align="aligncenter" width="1024"]<img class="wp-image-15737 size-full" title="bathroom counter decor" src="https://roohome.com/wp-content/uploads/2017/11/ivote.jpg" alt="bathroom counter decor" width="1024" height="768" /> © ivote[/caption]

into this code

<figure><img class="wp-image-15737 size-full" title="bathroom counter decor" src="https://roohome.com/wp-content/uploads/2017/11/ivote.jpg" alt="bathroom counter decor" width="1024" height="768" /><figcaption>© ivote</figcaption></figure>

Can anyone help me out with this problem? I would really appreciate any help. Thank you.

You should probably consider using some sort of formal parser to handle this problem in the general case. That being said, if you are willing to accept the risks of using a single regex, then consider matching with the first pattern below and replacing with the second:

/\[caption [^<]*(<img[^>]*>)\s*([^[]*)\[\/caption\]/
<figure>$1<figcaption>$2</figcaption></figure>
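To make the pattern easier to read, here is a sketch of the same regex written with the x (extended) modifier, which ignores unescaped whitespace and allows # comments inside the pattern. The sample input is a shortened, made-up caption (the file name a.jpg and the caption text are assumptions for illustration only).

```php
<?php
// Same pattern as above, split over several lines with the x modifier.
// Note the escaped space after "caption": plain spaces are ignored in x mode.
$pattern = '/
    \[caption\ [^<]*   # opening shortcode tag and its attributes, up to the first "<"
    (<img[^>]*>)       # group 1: the complete img tag
    \s*                # optional whitespace between the image and the caption text
    ([^[]*)            # group 2: the caption text, up to the next "["
    \[\/caption\]      # closing shortcode tag
/x';

$replacement = '<figure>$1<figcaption>$2</figcaption></figure>';

// Shortened, hypothetical input for illustration.
$input = '[caption id="attachment_1" align="aligncenter" width="1024"]<img src="a.jpg" alt="demo" /> Some credit[/caption]';

$result = preg_replace($pattern, $replacement, $input);
echo $result, "\n";
```

Group 1 becomes $1 and group 2 becomes $2 in the replacement string, so the image tag is reused unchanged and only the wrapper around it is rewritten.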

Here is the code:

// Sample [caption] shortcode from the question (single quotes avoid escaping every double quote).
$input = '[caption id="attachment_15737" align="aligncenter" width="1024"]<img class="wp-image-15737 size-full" title="bathroom counter decor" src="https://roohome.com/wp-content/uploads/2017/11/ivote.jpg" alt="bathroom counter decor" width="1024" height="768" /> © ivote[/caption]';
// $1 is the captured <img> tag, $2 is the caption text that follows it.
$after = preg_replace('/\[caption [^<]*(<img[^>]*>)\s*([^[]*)\[\/caption\]/', '<figure>$1<figcaption>$2</figcaption></figure>', $input);
echo $after;

This outputs the following HTML (the actual output is a single line; line breaks and indentation are added here for readability):

<figure><img class="wp-image-15737 size-full" title="bathroom counter decor"
             src="https://roohome.com/wp-content/uploads/2017/11/ivote.jpg"
             alt="bathroom counter decor" width="1024" height="768" />
        <figcaption>© ivote</figcaption>
</figure>
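If a post body contains several captions, or the caption text carries stray whitespace, a callback variant gives a little more control. The following is only a sketch under assumptions: the function name convertCaptions and the sample post are hypothetical, and the regex is the same one used above, so it carries the same risks.

```php
<?php
// Hypothetical helper: converts every [caption]...[/caption] block in $html
// into <figure> markup and trims whitespace around the caption text.
function convertCaptions(string $html): string
{
    $pattern = '/\[caption [^<]*(<img[^>]*>)\s*([^[]*)\[\/caption\]/';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] is the <img> tag, $m[2] is the caption text.
            $caption = trim($m[2]);
            return '<figure>' . $m[1] . '<figcaption>' . $caption . '</figcaption></figure>';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Made-up post body with two captions.
$post = '<p>Intro</p>'
      . '[caption id="a" width="10"]<img src="one.jpg" /> First [/caption]'
      . '<p>Middle</p>'
      . '[caption id="b" width="20"]<img src="two.jpg" /> Second[/caption]';

echo convertCaptions($post), "\n";
```

Both captions are converted and the surrounding paragraphs are left as they are. Note that a caption whose image is wrapped in another tag (for example a link around the img) does not match this pattern and is left unchanged, which is one of the reasons a real parser is the safer choice in the general case.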



 