
Extract shortcode from Instagram URL

I am trying to extract the shortcode from an Instagram URL.

Here is what I have already tried, but I don't know how to extract the shortcode when there is a username in the middle of the path. Thank you very much for your answers.

Instagram pattern: /p/shortcode/

https://regex101.com/r/nO4vdd/1/

https://www.instagram.com/p/BxKRx5CHn5i/
https://www.instagram.com/p/BxKRx5CHn5i/?utm_source=ig_share_sheet&igshid=znsinsart176
https://www.instagram.com/p/BxKRx5CHn5i/
https://www.instagram.com/username/p/BxKRx5CHn5i/

Expected: BxKRx5CHn5i

You could add an optional non-capturing group (?:\/\w+)? in front of the \/p\/ part, so that a username segment is allowed but not required.

Note that \w also matches _ and \d, so the capturing group could be shortened to ([\w-]+). The optional trailing slash in the non-capturing group (?:\/)? could also be written as just \/?

^(?:https?:\/\/)?(?:www\.)?(?:instagram\.com(?:\/\w+)?\/p\/)([\w-]+)(?:\/)?(\?.*)?$

Regex demo

You don't have to escape the forward slashes if you use a delimiter other than / (for example ~). Your pattern might then look like:

^(?:https?://)?(?:www\.)?(?:instagram\.com(?:/\w+)?/p/)([\w-]+)/?(\?.*)?$
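As a rough illustration of how that pattern could be applied in PHP, here is a minimal sketch. The variable names and the choice of ~ as the delimiter are assumptions made for the example; the URLs are the ones from the question.

```php
<?php
// Pattern from the answer above, written with ~ as the delimiter
// so the forward slashes need no escaping.
$pattern = '~^(?:https?://)?(?:www\.)?(?:instagram\.com(?:/\w+)?/p/)([\w-]+)/?(\?.*)?$~';

// Sample URLs taken from the question.
$urls = [
    'https://www.instagram.com/p/BxKRx5CHn5i/',
    'https://www.instagram.com/p/BxKRx5CHn5i/?utm_source=ig_share_sheet&igshid=znsinsart176',
    'https://www.instagram.com/username/p/BxKRx5CHn5i/',
];

$shortcodes = [];
foreach ($urls as $url) {
    // Capture group 1 holds the shortcode; null marks a URL that did not match.
    $shortcodes[] = preg_match($pattern, $url, $m) ? $m[1] : null;
}

print_r($shortcodes);
```

Each of the three URLs should yield BxKRx5CHn5i in capture group 1, with or without the username segment.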

I took your original pattern and added a .* before the \/p\/

This gave a pattern of ^(?:https?:\/\/)?(?:www\.)?(?:instagram\.com.*\/p\/)([\d\w\-_]+)(?:\/)?(\?.*)?$

This would be simpler, assuming the shortcode always follows the /p/

^(?:.*\/p\/)([\d\w\-_]+)
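A short sketch of the simplified pattern in use may help show both what it does and what it leaves out. The variable names and the example.com URL are assumptions added for illustration. Because the pattern no longer mentions the host, it will accept any string containing /p/, so it is probably only suitable when the input is already known to be an Instagram link.

```php
<?php
// Simplified pattern from the answer above: skip everything up to /p/
// and capture the run of word characters and hyphens that follows.
$simple = '/^(?:.*\/p\/)([\d\w\-_]+)/';

// URL shape taken from the question (username in the middle, query string at the end).
$instagramUrl = 'https://www.instagram.com/username/p/BxKRx5CHn5i/?utm_source=ig_share_sheet';

// Hypothetical non-Instagram URL, included only to show that the host is not checked.
$otherUrl = 'https://example.com/p/abc123/';

$fromInstagram = preg_match($simple, $instagramUrl, $m) ? $m[1] : null;
$fromOther     = preg_match($simple, $otherUrl, $m) ? $m[1] : null;

var_dump($fromInstagram, $fromOther);
```

Note that .* is greedy, so if /p/ occurred more than once in the string, the capture would start after the last occurrence.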

This expression might also work:

^https?:\/\/(?:www\.)?instagram\.com\/[^\/]+(?:\/[^\/]+)?\/([^\/]{11})\/.*$

Test

$re = '/^https?:\/\/(?:www\.)?instagram\.com\/[^\/]+(?:\/[^\/]+)?\/([^\/]{11})\/.*$/m';
$str = 'https://www.instagram.com/p/BxKRx5CHn5i/
https://www.instagram.com/p/BxKRx5CHn5i/?utm_source=ig_share_sheet&igshid=znsinsart176
https://www.instagram.com/p/BxKRx5CHn5i/
https://www.instagram.com/username/p/BxKRx5CHn5i/';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

foreach ($matches as $match) {
    var_export($match[1]);
}

The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.

Assuming that you aren't simply trusting /p/ as the marker before the substring, you can use this pattern, which will consume zero or more of the directories before your desired substring.

Notice that \K restarts the fullstring match and effectively removes the need for a capture group; this means a smaller output array and a shorter pattern.

Choosing a pattern delimiter like ~, which doesn't occur inside your pattern, removes the need to escape the forward slashes. This again makes your pattern shorter and easier to read.

If you do want to rely on the /p/ substring, then just add p/ before my \K.

Code: ( Demo )

$strings = [
    "https://www.instagram.com/p/BxKRx5CHn5i/",
    "https://www.instagram.com/p/BrODg5XHlE6/?utm_source=ig_share_sheet&igshid=znsinsart176",
    "https://www.instagram.com/p/BxKRx5CHn5i/",
    "https://www.instagram.com/username/p/BxE5PpZhoa9/",
    "https://www.instagram.com/username/p/BxE5PpZhoa9/#look=overhere"
];

foreach ($strings as $string) {
    echo preg_match('~(?:https?://)?(?:www\.)?instagram\.com(?:/[^/]+)*/\K\w+~', $string , $m) ? $m[0] : '';
    echo " (from $string)\n";
}

Output:

BxKRx5CHn5i (from https://www.instagram.com/p/BxKRx5CHn5i/)
BrODg5XHlE6 (from https://www.instagram.com/p/BrODg5XHlE6/?utm_source=ig_share_sheet&igshid=znsinsart176)
BxKRx5CHn5i (from https://www.instagram.com/p/BxKRx5CHn5i/)
BxE5PpZhoa9 (from https://www.instagram.com/username/p/BxE5PpZhoa9/)
BxE5PpZhoa9 (from https://www.instagram.com/username/p/BxE5PpZhoa9/#look=overhere)
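For the variant that does rely on the /p/ marker, the change described above (adding p/ before \K) might look like the following sketch. The third test string, which has no /p/ segment, is an assumption added to show the difference in behaviour.

```php
<?php
// Same pattern as above, with "p/" added just before \K so that
// only the path segment directly following /p/ is returned.
$pattern = '~(?:https?://)?(?:www\.)?instagram\.com(?:/[^/]+)*/p/\K\w+~';

$strings = [
    'https://www.instagram.com/p/BxKRx5CHn5i/',
    'https://www.instagram.com/username/p/BxE5PpZhoa9/#look=overhere',
    'https://www.instagram.com/username/', // hypothetical profile URL without /p/
];

$results = [];
foreach ($strings as $string) {
    // $m[0] is the fullstring match, which \K has trimmed down to the shortcode.
    $results[] = preg_match($pattern, $string, $m) ? $m[0] : '';
}

print_r($results);
```

With the marker in place, the profile URL produces an empty string, whereas the pattern without p/ would return username for it.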

If you are implicitly trusting the /p/ as the marker and you know that you are dealing with Instagram links, then you can avoid regex and just cut out the 11-character substring that starts 3 characters after the position where the marker begins.

Code: ( Demo )

$strings = [
    "https://www.instagram.com/p/BxKRx5CHn5i/",
    "https://www.instagram.com/p/BrODg5XHlE6/?utm_source=ig_share_sheet&igshid=znsinsart176",
    "https://www.instagram.com/p/BxKRx5CHn5i/",
    "https://www.instagram.com/username/p/BxE5PpZhoa9/",
    "https://www.instagram.com/username/p/BxE5PpZhoa9/#look=overhere"
];

foreach ($strings as $string) {
    $pos = strpos($string, '/p/');
    if ($pos === false) {
        continue;
    }
    echo substr($string, $pos + 3, 11);
    echo " (from $string)\n";
}

(Same output as previous technique)
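If the 11-character length cannot be taken for granted, one possible variation of the non-regex technique is to measure the segment up to the next delimiter with strcspn(). The helper name extractShortcode and the delimiter set /?# are assumptions for this sketch, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the path segment after the first /p/
// marker, or null when the marker is absent or nothing follows it.
function extractShortcode(string $url): ?string
{
    $pos = strpos($url, '/p/');
    if ($pos === false) {
        return null;
    }

    $start = $pos + 3; // skip past the 3-character "/p/" marker

    // Count the characters up to the next "/", "?" or "#" (or the end of the string).
    $length = strcspn($url, '/?#', $start);

    return $length > 0 ? substr($url, $start, $length) : null;
}

var_dump(extractShortcode('https://www.instagram.com/username/p/BxE5PpZhoa9/#look=overhere'));
var_dump(extractShortcode('https://www.instagram.com/username/'));
```

Like the substr() version, this trusts /p/ as the marker and does not verify that the URL points at Instagram.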
