
Regular Expression to get all links with certain extensions

I'm looking for a regular expression that will grab all of the URLs that have one of the extensions in the following array:

Array
(
    [0] => mp4
    [1] => m4v
    [2] => webm
    [3] => ogv
    [4] => wmv
    [5] => flv
)

This array is returned by the WordPress function wp_get_video_extensions() and lists the video file extensions that WordPress recognizes.

A block of content with URLs inside it would look like this:

'Yes, but I grow at a reasonable pace,' said the Dormouse: 'not in that ridiculous fashion.' And he got up very sulkily and crossed over to the other side of the court.

All this time the Queen had never left off staring at the Hatter, and, just as the Dormouse crossed the court, she said to one of the officers of the court, 'Bring me the list of the singers in the last concert!' on which the wretched Hatter trembled so, that he shook both his shoes off.

[video mp4="http://www.example.com/files/video/video1.mp4"][/video]

'Give your evidence,' the King repeated angrily, 'or I'll have you executed, whether you're nervous or not.'

http://www.example.com/files/video/video2.flv

'I'm a poor man, your Majesty,' the Hatter began, in a trembling voice, '—and I hadn't begun my tea—not above a week or so—and what with the bread-and-butter getting so thin—and the twinkling of the tea—'

I am trying to find both video URLs in there and return each entire URL in the array.

Here is what I have:

preg_match_all( '/^https?:\/\/(?:[a-z\-]+\.)+[a-z]{2,6}(?:/[^/#?]+)+\.(?:' . implode( '|', wp_get_video_extensions() ) . ')$/', $post->post_content, $matches);

And I am getting this:

Warning: preg_match_all(): Unknown modifier '['

Ideally, I would like to get this:

Array
(
    [0] => Array
           (
               [0] => http://www.example.com/files/video/video1.mp4
               [1] => http://www.example.com/files/video/video2.flv
           )
    [1] => Array
           (
               [0] => http://www.example.com/
               [1] => http://www.example.com/
           )
    [2] => Array
           (
               [0] => files/video/
               [1] => files/video/
           )
    [3] => Array
           (
               [0] => video1.mp4
               [1] => video2.flv
           )
)

But this would also be perfect, as I can use parse_url() to break the rest out later on:

Array
(
    [0] => http://www.example.com/files/video/video1.mp4
    [1] => http://www.example.com/files/video/video2.flv
)

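Your first problem is that you didn't escape every "/". The pattern uses "/" as its delimiter, so the first unescaped "/" (the one in "(?:/[^/#?]+)") ends the pattern, and PHP reads the "[" that follows as a modifier, which is exactly what the warning reports. The second problem is that "^" and "$" only allow a match when the URL is the entire string, not when it appears in the middle of the content. This should take care of both: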

preg_match_all('~https?://(?:[a-z\-]+\.)+[a-z]{2,6}(?:/[^/#?]+)+\.(?:' . implode( '|', wp_get_video_extensions() ) . ')~', $post->post_content, $matches);

Using "~" as the delimiter means you don't have to escape the "/".
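To try this outside WordPress, here is a minimal sketch of the same idea. The $extensions array is a hypothetical stand-in for wp_get_video_extensions(), and $content is a shortened copy of the sample text. Three things differ from the pattern above and are my assumptions, not part of the original answer: whitespace and quotes are excluded from path segments so a match cannot run across words, a trailing \b stops ".mp4" from matching inside a longer extension, and the "i" modifier makes matching case-insensitive.

```php
<?php
// Hypothetical stand-in for wp_get_video_extensions(), so this runs without WordPress.
$extensions = array('mp4', 'm4v', 'webm', 'ogv', 'wmv', 'flv');

// Shortened sample content from the question.
$content = <<<'TEXT'
[video mp4="http://www.example.com/files/video/video1.mp4"][/video]

'Give your evidence,' the King repeated angrily.

http://www.example.com/files/video/video2.flv
TEXT;

// preg_quote() escapes regex metacharacters and the "~" delimiter in each extension.
$alternatives = implode('|', array_map(function ($ext) {
    return preg_quote($ext, '~');
}, $extensions));

// No ^ or $ anchors, so a URL can match anywhere in the content.
// Assumption: \s and quotes are excluded from path segments, and \b ends the extension.
$pattern = '~https?://(?:[a-z\-]+\.)+[a-z]{2,6}(?:/[^/#?\s"\']+)+\.(?:' . $alternatives . ')\b~i';

preg_match_all($pattern, $content, $matches);

// With no capturing groups, $matches[0] is the flat list of full URLs.
$urls = $matches[0];
print_r($urls);

// parse_url() can split each URL into scheme, host, and path afterwards.
$parts = parse_url($urls[0]);
print_r($parts);
```

Because the pattern has no capturing groups, $matches[0] is the flat array from the second desired output. parse_url() then provides the host and path if the pieces are needed separately.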

I have a list of URLs like this:

https://example.com/icon.gif https://example.com/photo.png https://example.com/icp.css. https://example.com/free

I don't want the URLs that have an extension such as .png, .gif, .css, or .ttf.

Is there any way to do this?
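One possible approach, sketched here under a few assumptions: the list is whitespace-separated, the stray "." after "icp.css" is trailing punctuation to be tolerated, and the URLs carry no query string or fragment after the extension. preg_grep() with the PREG_GREP_INVERT flag returns the entries that do not match a pattern, so a pattern describing the unwanted extensions filters them out. The variable names are illustrative only.

```php
<?php
// Sample list, copied from the question.
$list = 'https://example.com/icon.gif https://example.com/photo.png https://example.com/icp.css. https://example.com/free';

// Extensions to reject, as listed in the question.
$blocked = array('png', 'gif', 'css', 'ttf');

// Split the list on runs of whitespace.
$urls = preg_split('/\s+/', trim($list));

// Match a blocked extension at the end of the URL, allowing one trailing "." (assumption).
$pattern = '~\.(?:' . implode('|', $blocked) . ')\.?$~i';

// PREG_GREP_INVERT keeps the entries that do NOT match; array_values() reindexes them.
$kept = array_values(preg_grep($pattern, $urls, PREG_GREP_INVERT));

print_r($kept);
```

For the sample list this leaves only https://example.com/free. If the URLs can carry query strings, extracting the path with parse_url() before testing the extension would likely be more robust.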
