Splitting words and HTML tags of a string by preg_split

I want to split a string containing HTML tags into words. Since HTML tags are not words, they should be captured separately. I started with

$str='this is <a href="">something for test</a>';

print_r(preg_split('# (?![^<]{1,99}[>])#',$str));

Array
(
    [0] => this
    [1] => is
    [2] => <a href="">something
    [3] => for
    [4] => test</a>
)

This keeps each HTML tag intact (the space inside the opening tag is not split on), but I also need the tags separated from the adjacent words, to produce

Array
(
    [0] => this
    [1] => is
    [2] => <a href="">
    [3] => something
    [4] => for
    [5] => test
    [6] => </a>
)
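
$str = 'this is <a href="">something for test</a>';

// Split on either a whole tag (captured, so it is kept in the result) or a single space.
// PREG_SPLIT_NO_EMPTY drops the empty pieces left between adjacent delimiters.
print_r(preg_split("/(<[^>]*[^\/]>)| /", $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));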
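
One thing to watch with the pattern above: because it requires a character other than "/" immediately before the closing ">", a self-closing tag such as <br/> is not recognised as a tag and stays glued to the neighbouring word. The following is a minimal sketch of a variant that treats any <...> run as a tag and splits on runs of whitespace. The function name split_words_and_tags is hypothetical (it does not come from the original post), and the sketch assumes that no attribute value contains a literal ">". For arbitrary real-world HTML, a proper parser such as DOMDocument is likely the safer choice.

```php
<?php
// Hypothetical helper (name is illustrative, not from the original post).
// Splits a string into words and HTML tags, keeping each tag as its own element.
// Assumption: no attribute value contains a literal '>'.
function split_words_and_tags(string $str): array
{
    return preg_split(
        '/(<[^>]+>)|\s+/',   // group 1: a whole tag (kept); otherwise: whitespace (discarded)
        $str,
        -1,                  // no limit on the number of pieces
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
}

print_r(split_words_and_tags('this is <a href="">something for test</a>'));
// Same seven elements as the desired output in the question.

print_r(split_words_and_tags('line one<br/>line two'));
// Elements: line, one, <br/>, line, two
```

The flags are combined with | rather than +. Both give the same value here, but bitwise OR is the conventional way to combine flag constants and stays correct if a flag is accidentally listed twice.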
