
How to extract a link from a script in HTML with Python?

How can I extract the URL from a script embedded in an HTML page using Python?
The HTML provided:


function download() {
    window.open('https://somelink.com');
}
        const text = `<div style=\'position: relative;padding-bottom: 56.25%;height: 0;overflow: hidden;\'>
<iframe allowfullscreen=\'allowfullscreen\' src=\'URL\' style=\'border: 0;height: 100%;left: 0;position: absolute;top: 0;width: 100%;\' ></iframe>
</div>`;

function embed() {
    var element = document.getElementById('embed-text');
    console.log(element);
    element.innerHTML = text;
}

Desired output will be:

https://somelink.com

Any help will do. Thanks!

You can use a regular expression like this (the example is in JavaScript; the same pattern works with Python's re module):

    // The regex: "http://" or "https://" followed by any non-whitespace characters
    var urlRegex = /(https?:\/\/[^\s]+)/;

    // Your string
    var input = "<div style=\'position: relative;padding-bottom: 56.25%;height: 0;overflow: hidden;\'><iframe allowfullscreen=\'allowfullscreen\' src=\" https://my-url.com/test \" style=\'border: 0;height: 100%;left: 0;position: absolute;top: 0;width: 100%;\' ></iframe></div>";

    // Apply the regex and log the first captured group
    console.log(input.match(urlRegex)[1]);
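Since the question asks for Python, here is a minimal sketch of the same idea using only the standard library. It assumes the script text has already been read into a string and that the URL appears as the quoted first argument of window.open(...), written as https://somelink.com to match the desired output. The names extract_window_open_url and script_text are made up for this illustration and do not come from the original post.

```python
import re

# Hypothetical sample input: the script text from the question, assumed to
# contain the URL in the form shown in the desired output.
script_text = """
function download() {
    window.open('https://somelink.com');
}
"""


def extract_window_open_url(text):
    """Return the first quoted argument passed to window.open(), or None."""
    # (['\"]) captures the opening quote, (.*?) lazily captures the URL,
    # and \1 requires the same kind of quote to close the string.
    match = re.search(r"window\.open\(\s*(['\"])(.*?)\1", text)
    return match.group(2) if match else None


print(extract_window_open_url(script_text))
```

Anchoring the pattern on window.open( rather than matching any https?:// string keeps it from picking up unrelated URLs elsewhere in the page. If the script is still inside a full HTML document, the same function can be applied to the whole page source, or to the contents of the script tag after extracting it with an HTML parser.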
