How can I parse all of the HTML code with Python?

Question

I need to parse part of the HTML from a site that requires authorization. But when I try, my script cannot find all of the tags in this part:

<tbody>              
    <td class="ng-binding">name</td>
    <td class="ng-binding">name</td>
    <td class="ng-binding">name</td>
    <td class="ng-binding">name</td>
    <td></td>
</tr><!-- end ngIf: bsks -->
<!-- ngIf: (bsks | size)>0 --><tr class="bsstr ng-scope" ng-if="(bsks | size)>0">
    <td></td>
    <td></td>
    <td></td>
    <td><b class="ng-binding">сумма</b></td>
    <td></td>
</tr><!-- end ngIf: (bsks | size)>0 -->
<!-- ngIf: (bsks | size) === 0 -->
<!-- ngRepeat: item in bsks | orderBy: date --><!-- ngIf: (bsks | size) > 0 --><tr class="bsstr ng-scope" ng-repeat="item in bsks | orderBy: date" ng-if="(bsks | size) > 0">
    <td>

I am a beginner; please help me parse this part of the site. How can I get all the tags I need? The site has a separate URL for authorization ( url = self.BASE_URL + 'api/v1/login/auth?info=1' ).

import urllib.request

import requests
from bs4 import BeautifulSoup


class Auth:
    BASE_URL = 'http.............'

    def auth(self):
        params = {
            'user': u'g1625719',
            'pass': u'472001',
            'from_site': 1,
            'dev': u'16e753be3dc097354e3328e47c3701a9'
        }
        session = requests.Session()
        url = self.BASE_URL + 'api/v1/login/auth?info=1'
        r = session.post(url, params)
        print(r.text)

    def get_url(self):
        url = self.BASE_URL + '#!/line/cart/checklist/'
        print(url)
        response = urllib.request.urlopen(url)
        return response.read()

    def parse(self):
        soup = BeautifulSoup(self.get_url(), 'html.parser')
        table = soup.body.find('div', {'class': 'example-animate-container'})
        print(table)

It does not work correctly.

Answer 1

Try using find_all ( https://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-the-tree ), which returns every matching tag instead of only the first one.
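To illustrate the difference between find and find_all, here is a minimal sketch. It assumes static markup loosely modeled on the rows in the question; the `<table>` wrapper and the cell texts `name 1` and `name 2` are made up for the example, and the real page may look different.

```python
from bs4 import BeautifulSoup

# Static markup modeled on the rows in the question (an assumption;
# the real page may differ).
html = """
<table>
  <tbody>
    <tr class="bsstr">
      <td class="ng-binding">name 1</td>
      <td class="ng-binding">name 2</td>
      <td></td>
    </tr>
    <tr class="bsstr">
      <td><b class="ng-binding">сумма</b></td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')

# find returns only the first matching tag (or None if nothing matches).
first_cell = soup.find('td', {'class': 'ng-binding'})

# find_all returns a list of every matching tag (empty if nothing matches).
all_cells = soup.find_all('td', {'class': 'ng-binding'})
cell_texts = [cell.get_text(strip=True) for cell in all_cells]

print(first_cell.get_text(strip=True))
print(cell_texts)
```

Because find_all returns a list, loop over the result (or index into it) before calling tag methods such as get_text.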

import urllib.request

import requests
from bs4 import BeautifulSoup


class Auth:
    BASE_URL = 'http.............'

    def auth(self):
        params = {
            'user': u'g1625719',
            'pass': u'472001',
            'from_site': 1,
            'dev': u'16e753be3dc097354e3328e47c3701a9'
        }
        session = requests.Session()
        url = self.BASE_URL + 'api/v1/login/auth?info=1'
        r = session.post(url, params)
        print(r.text)

    def get_url(self):
        url = self.BASE_URL + '#!/line/cart/checklist/'
        print(url)
        response = urllib.request.urlopen(url)
        return response.read()

    def parse(self):
        soup = BeautifulSoup(self.get_url(), 'html.parser')
        # find_all returns a list of every matching <div>, not only the first
        tables = soup.body.find_all('div', {'class': 'example-animate-container'})
        print(tables)
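
If find_all still returns an empty list, it may help to check what the server actually sends back. The sketch below only shows how the URL built in get_url splits into parts; the base URL `http://example.com/` is a hypothetical stand-in for the elided BASE_URL. The part after `#` is a fragment, which HTTP clients do not send to the server, and the ng- attributes in the question's snippet suggest the rows may be rendered by AngularJS in the browser.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the elided BASE_URL in the question.
BASE_URL = 'http://example.com/'

url = BASE_URL + '#!/line/cart/checklist/'
parts = urlsplit(url)

# Everything after '#' is the fragment; it stays on the client side,
# so an HTTP client requests only scheme + host + path (+ query).
requested = parts._replace(fragment='').geturl()

print(parts.fragment)
print(requested)
```

If that is what happens here, the HTML read by urllib would not contain the table at all. Note also that get_url opens the page with urllib.request rather than with the requests.Session created in auth, so the login cookies from that session are not sent along. Whether the table data comes from a separate request on the real site is a guess based on the markup, not something the question confirms.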

Answer 1, posted 2018-03-14 23:02:46 (score 0)
