
Extract content of <Script> in Python with BeautifulSoup

I want to extract the value of window.FEED__INITIAL__STATE.

Piece of code

How can I do it?

You could try something like this:

import requests
from bs4 import BeautifulSoup

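def check_script_tag(url):
    # Download the page and parse it with Python's built-in HTML parser.
    r = requests.get(url)
    parsed_html = BeautifulSoup(r.content, features="html.parser")

    try:
        # find() returns only the first <script> inside <body>.
        text = parsed_html.body.find('script').text
        print(text)  # the contents of that script tag
    except AttributeError:
        # Raised when <body> or the <script> tag is missing (None has no attributes).
        print("There is no script tag!")

check_script_tag("https://stackoverflow.com")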

First, find all the script tags, then check each one for the variable name.

P.S. This is an updated version of RasitAydin's code.

import requests
from bs4 import BeautifulSoup


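def check_script_tag(url):
    r = requests.get(url)
    parsed_html = BeautifulSoup(r.content, features="html.parser")

    # Search the whole document: scripts can also sit in <head>,
    # and parsed_html.body is None when the page has no <body>.
    script_tags = parsed_html.find_all('script')
    for script_tag in script_tags:
        text = script_tag.text
        # Case-insensitive check for the variable name.
        if 'window.FEED__INITIAL__STATE'.lower() in text.lower():
            print(text)


check_script_tag("YOUR WEB URL")  # replace with the URL of the page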
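The code above prints the whole script, while the question asks for the value itself. If the assigned value happens to be valid JSON (an assumption; a JavaScript object literal with unquoted keys would not parse), a minimal sketch like the following could pull it out of the script text. The helper name `extract_js_object` and the `sample_script` string are hypothetical and only serve the illustration; in practice you would pass in the `text` of the matching script tag.

```python
import json
import re


def extract_js_object(script_text, var_name="window.FEED__INITIAL__STATE"):
    """Return the JSON value assigned to var_name in script_text, or None."""
    # Locate "var_name =" and allow optional whitespace around the "=".
    match = re.search(re.escape(var_name) + r"\s*=\s*", script_text)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it (for example a trailing ";").
        value, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    except json.JSONDecodeError:
        # Assumption violated: the assigned value is not plain JSON.
        return None
    return value


# Hypothetical script contents, made up for this example.
sample_script = 'window.FEED__INITIAL__STATE = {"feed": {"items": [1, 2, 3]}, "page": 1};'
state = extract_js_object(sample_script)
print(state["feed"]["items"])  # prints [1, 2, 3]
```

The result is an ordinary Python dict, so nested fields can be read with normal indexing instead of further string matching.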
