
URL Extractor Python

I want to scrape many URLs from a website. No login is required, but I don't get the expected result in distance: it comes back as an empty list.

Code:

from bs4 import BeautifulSoup as bs
import requests

page = requests.get('https://maktabkhooneh.org/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AE%D9%88%D8%B4%D8%A2%D9%85%D8%AF%DB%8C%D8%AF-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86/')
soup = bs(page.text, 'html.parser')

distance = soup.find_all('div', attrs={"class": "filler js-collapsible__body "})
print(distance)

Part of the page as shown in the browser inspector:

<div class="desktop-unit-nav__chapter js-collapsible__title" data-collapsible-id="3364">
            <div class="ellipsis">فصل اول: مقدمه</div>
            <i class="faq__icon svg-icon--20 svg-icon--gunmetal js-collapsible__toggler-icon is-rotated-180"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="1000" viewBox="0 0 24 24">
    <g fill="none" fill-rule="evenodd" transform="rotate(90 8 14.5)">
        <path d="M11.307 11.536l-8.842 9.026a1.419 1.419 0 0 1-2.037 0 1.492 1.492 0 0 1 0-2.079l7.824-7.987L.428 2.51a1.492                        1.492 0 0 1 0-2.08 1.42 1.42 0 0 1 2.037 0l8.842 9.027c.281.287.422.663.422 1.04 0 .376-.14.752-.422 1.039z"></path>
    </g>
</svg>
</i>
        </div>
        <div class="filler js-collapsible__body  js-collapsible__body--active " data-collapsible-id="3364">
            
                
                <a title="خوش آمدید به یادگیری ماشین " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AE%D9%88%D8%B4%D8%A2%D9%85%D8%AF%DB%8C%D8%AF-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--violet"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center color-violet">خوش آمدید به یادگیری ماشین </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "01:28</div>

                </a>
            
                
                <a title="مقدمه " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D9%85%D9%82%D8%AF%D9%85%D9%87/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">مقدمه </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "07:04</div>

                </a>
            
                
                <a title="یادگیری ماشین چیست " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-%DA%86%DB%8C%D8%B3%D8%AA/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">یادگیری ماشین چیست </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "07:23</div>

                </a>
            
                
                <a title="یادگیری نظارت شده " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%86%D8%B8%D8%A7%D8%B1%D8%AA-%D8%B4%D8%AF%D9%87/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">یادگیری نظارت شده </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "12:39</div>

                </a>
            
                
                <a title="یادگیری نظارت نشده " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%86%D8%B8%D8%A7%D8%B1%D8%AA-%D9%86%D8%B4%D8%AF%D9%87/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">یادگیری نظارت نشده </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "14:23</div>

                </a>
            
        </div>

    
        <div class="desktop-unit-nav__chapter js-collapsible__title" data-collapsible-id="3359">
            <div class="ellipsis">فصل دوم: رگرسیون خطی تک متغیره</div>
            <i class="faq__icon svg-icon--20 svg-icon--gunmetal js-collapsible__toggler-icon "><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="1000" viewBox="0 0 24 24">
    <g fill="none" fill-rule="evenodd" transform="rotate(90 8 14.5)">
        <path d="M11.307 11.536l-8.842 9.026a1.419 1.419 0 0 1-2.037 0 1.492 1.492 0 0 1 0-2.079l7.824-7.987L.428 2.51a1.492                        1.492 0 0 1 0-2.08 1.42 1.42 0 0 1 2.037 0l8.842 9.027c.281.287.422.663.422 1.04 0 .376-.14.752-.422 1.039z"></path>
    </g>
</svg>
</i>
        </div>
        <div class="filler js-collapsible__body " data-collapsible-id="3359">
            
                
                <a title="ایجاد مدل " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%A7%DB%8C%D8%AC%D8%A7%D8%AF-%D9%85%D8%AF%D9%84/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">ایجاد مدل </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "08:20</div>

                </a>
            
                
                <a title="تابع هزینه " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AA%D8%A7%D8%A8%D8%B9-%D9%87%D8%B2%DB%8C%D9%86%D9%87/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">تابع هزینه </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "08:22</div>

                </a>
            
                
                <a title="تابع هزینه - بخش دوم " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AA%D8%A7%D8%A8%D8%B9-%D9%87%D8%B2%DB%8C%D9%86%D9%87-%D8%A8%D8%AE%D8%B4-%D8%AF%D9%88%D9%85/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">تابع هزینه - بخش دوم </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "11:19</div>

                </a>
            
                
                <a title="تابع هزینه - بخش سوم " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AA%D8%A7%D8%A8%D8%B9-%D9%87%D8%B2%DB%8C%D9%86%D9%87-%D8%A8%D8%AE%D8%B4-%D8%B3%D9%88%D9%85/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">تابع هزینه - بخش سوم </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "08:58</div>

                </a>
            
                
                <a title="گرادیان نزولی " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%DA%AF%D8%B1%D8%A7%D8%AF%DB%8C%D8%A7%D9%86-%D9%86%D8%B2%D9%88%D9%84%DB%8C/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">گرادیان نزولی </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "11:40</div>

                </a>
            
                
                <a title="گرادیان نزولی - بخش دوم " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%DA%AF%D8%B1%D8%A7%D8%AF%DB%8C%D8%A7%D9%86-%D9%86%D8%B2%D9%88%D9%84%DB%8C-%D8%A8%D8%AE%D8%B4-%D8%AF%D9%88%D9%85/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">گرادیان نزولی - بخش دوم </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "12:01</div>

                </a>
            
                
                <a title="گرادیان نزولی در رگرسیون خطی " href="/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%AF%D9%88%D9%85-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C-%D8%AA%DA%A9-%D9%85%D8%AA%D8%BA%DB%8C%D8%B1%D9%87-ch3359/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%DA%AF%D8%B1%D8%A7%D8%AF%DB%8C%D8%A7%D9%86-%D9%86%D8%B2%D9%88%D9%84%DB%8C-%D8%B1%DA%AF%D8%B1%D8%B3%DB%8C%D9%88%D9%86-%D8%AE%D8%B7%DB%8C/" class="desktop-unit-nav__unit">
                    <i class="chapter__unit-icon svg-icon--28 svg-icon--blue"><svg id="svg" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400" viewBox="0, 0, 400,400"><g id="svgg"><path id="path0" d="M134.000 31.924 C 43.598 61.784,-1.912 169.691,39.013 257.145 C 103.366 394.664,307.634 376.696,345.792 230.160 C 378.958 102.793,259.098 -9.395,134.000 31.924 M245.715 66.386 C 312.662 99.167,341.028 179.516,308.779 245.023 C 245.456 373.651,54.415 330.842,54.310 188.000 C 54.237 87.305,156.169 22.540,245.715 66.386 M136.000 185.773 C 136.000 235.398,137.123 276.000,138.495 276.000 C 142.317 276.000,291.674 190.281,291.848 187.988 C 291.931 186.882,264.550 170.220,231.000 150.961 C 197.450 131.703,162.350 111.356,153.000 105.746 L 136.000 95.547 136.000 185.773 M221.479 180.415 L 232.958 187.295 217.479 196.870 C 208.965 202.136,193.450 211.371,183.000 217.391 L 164.000 228.337 164.000 187.875 L 164.000 147.413 187.000 160.474 C 199.650 167.658,215.165 176.632,221.479 180.415 " stroke="none" fill="#000000" fill-rule="evenodd"></path></g></svg></i>
                    
                    <div class="color-gunmetal ellipsis--v-center ">گرادیان نزولی در رگرسیون خطی </div>
                    <div class="desktop-unit-nav__end-icon">
                        
                    </div>
                    <div class="desktop-unit-nav__unit-effort">  "10:30</div>

                </a>
            
        </div>

As you can see, the a tags sit inside div elements with the classes filler and js-collapsible__body, but they do not show up in the distance variable. It is an empty list, so I can't extract the URLs (the href values).
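One detail that may matter regardless of what the server sends back: the class string passed to find_all in the question ends with a trailing space. When given a string containing spaces, Beautiful Soup compares it with the element's class list joined by single spaces, so a trailing space prevents a match. A CSS selector that names each class separately is less brittle. The sketch below runs on a trimmed-down copy of the inspected markup; the two hrefs are hypothetical placeholders, not the real course links.

```python
from bs4 import BeautifulSoup

# Trimmed-down stand-in for the inspected markup; hrefs are hypothetical.
html = """
<div class="filler js-collapsible__body  js-collapsible__body--active " data-collapsible-id="3364">
  <a href="/course/a/" class="desktop-unit-nav__unit">A</a>
</div>
<div class="filler js-collapsible__body " data-collapsible-id="3359">
  <a href="/course/b/" class="desktop-unit-nav__unit">B</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Same lookup as in the question: the trailing space in the class string
# means it never equals the space-joined class list of either div.
exact = soup.find_all("div", attrs={"class": "filler js-collapsible__body "})

# A CSS selector requires both classes but ignores order, extra classes
# and whitespace inside the attribute.
bodies = soup.select("div.filler.js-collapsible__body")
hrefs = [a["href"] for body in bodies for a in body.select("a[href]")]

print(len(exact), hrefs)
```

The selector also matches the first div, which carries the extra js-collapsible__body--active class and so could never equal a two-class string.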

To get all course URLs, try specifying the User-Agent HTTP header:

from bs4 import BeautifulSoup as bs
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0"
}

page = requests.get(
    "https://maktabkhooneh.org/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AE%D9%88%D8%B4%D8%A2%D9%85%D8%AF%DB%8C%D8%AF-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86/",
    headers=headers,
)
soup = bs(page.text, "html.parser")

for a in soup.select('a[href^="/course/"]'):
    print("https://maktabkhooneh.org" + a["href"])

Prints:

https://maktabkhooneh.org/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/
https://maktabkhooneh.org/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AE%D9%88%D8%B4%D8%A2%D9%85%D8%AF%DB%8C%D8%AF-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86/
https://maktabkhooneh.org/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D8%AE%D9%88%D8%B4%D8%A2%D9%85%D8%AF%DB%8C%D8%AF-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86/
https://maktabkhooneh.org/course/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%B1%D8%A7%DB%8C%DA%AF%D8%A7%D9%86-%DB%8C%D8%A7%D8%AF%DA%AF%DB%8C%D8%B1%DB%8C-%D9%85%D8%A7%D8%B4%DB%8C%D9%86-Andrew-NG-mk1085/%D9%81%D8%B5%D9%84-%D8%A7%D9%88%D9%84-%D9%85%D9%82%D8%AF%D9%85%D9%87-ch3364/%D9%88%DB%8C%D8%AF%DB%8C%D9%88-%D9%85%D9%82%D8%AF%D9%85%D9%87/

...and so on.
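The printed list contains the same URL more than once, and the percent-encoded paths are hard to read. As a possible refinement (a sketch, not part of the original answer), the hrefs can be resolved with urllib.parse.urljoin, de-duplicated in first-seen order, and decoded with unquote for display. The helper name absolute_unique and the sample hrefs are hypothetical.

```python
from urllib.parse import urljoin, unquote

BASE = "https://maktabkhooneh.org"


def absolute_unique(hrefs, base=BASE):
    """Resolve each href against base and drop repeats, keeping first-seen order."""
    return list(dict.fromkeys(urljoin(base, href) for href in hrefs))


# Hypothetical sample; in practice these would be the a["href"] values
# collected from the parsed page.
hrefs = [
    "/course/%D9%85%D9%82%D8%AF%D9%85%D9%87/",
    "/course/%D9%85%D9%82%D8%AF%D9%85%D9%87/",
    "/course/intro/",
]

urls = absolute_unique(hrefs)
for url in urls:
    # unquote is only for readable output; keep the encoded form for requests.
    print(unquote(url))
```

Unlike plain string concatenation, urljoin also handles hrefs that are already absolute or are relative to the current path.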
