
Return one iteration of data across multiple, non-uniform tables in python (with bs4)?

I'm trying to iterate through a series of tables whose data isn't entered uniformly across them. I'm using python & bs4.

These tables contain legislation information: when a bill was sent to the governor (or not), when it was signed (or not), and when it was vetoed (or not).

I only want to return the following: (delivered/signed/vetoed & date). Or, (not delivered/not signed/not vetoed & no date). The problem is that when I iterate through the cells, the code goes through every row and returns a result for every row. I want the code to stop when either scenario is fulfilled and return only one pair of results. See below:

    tablebody = soup.select_one(".table.c-bill--actions-table > tbody")
    for item in tablebody.select("td"):
        if "delivered to governor" in item.text:
            transfer_list.append("delivered to governor")
            transfer_list.append(item.find_previous("td").text)
        else:
            transfer_list.append("not delivered")
            transfer_list.append("no date")

    tablebody = soup.select_one(".table.c-bill--actions-table > tbody")
    for item in tablebody.select("td"):
        if "signed" in item.text:
            transfer_list.append("signed")
            transfer_list.append(item.find_previous("td").text)
        else:
            transfer_list.append("not signed")
            transfer_list.append("no date")

    tablebody = soup.select_one(".table.c-bill--actions-table > tbody")
    for item in tablebody.select("td"):
        if "vetoed" in item.text:
            transfer_list.append("vetoed")
            transfer_list.append(item.find_previous("td").text)
        else:
            transfer_list.append("not vetoed")
            transfer_list.append("no date")

My current output looks like this:

 ['senate        Bill S3984',
  'Creates the crime of related use of a lethal or explosive device',
  'Liz Krueger',
  '(D, WF) 28th\xa0Senate District',
  'No votes for this bill.',
  'A2645',
  'not delivered',
  'no date',
  'not delivered',
  'no date',
  'not delivered',
  'no date',
  'not delivered',
  'no date',
  'not signed',
  'no date',
  'not signed',
  'no date',
  'not signed',
  'no date',
  'not signed',
  'no date',
  'not vetoed',
  'no date',
  'not vetoed',
  'no date',
  'not vetoed',
  'no date',
  'not vetoed',
  'no date'],

Any thoughts? I only need one iteration of (delivered, signed, vetoed) and the accompanying (date/no date).

It's because the code checks each <td> item separately, so the else branch runs for every cell that doesn't match. There are a few ways to handle it.

One is to put in additional logic checks inside the loop. Another is to get the text of all the cells into a list once and check whether the item is present:

    tablebody = soup.select_one(".table.c-bill--actions-table > tbody")

    # Collect the stripped text of every cell once, then test membership.
    # Note: `in` on a list needs an exact match of the whole cell text.
    check_list = [item.text.strip() for item in tablebody.select("td")]

    if "delivered to governor" in check_list:
        transfer_list.append("delivered to governor")
        i = check_list.index("delivered to governor")
        # i + 1 assumes the date cell follows the action cell; use i - 1 if
        # the date comes first, as find_previous("td") in the question implies.
        transfer_list.append(check_list[i + 1])
    else:
        transfer_list.append("not delivered")
        transfer_list.append("no date")

    if "signed" in check_list:
        transfer_list.append("signed")
        i = check_list.index("signed")
        transfer_list.append(check_list[i + 1])
    else:
        transfer_list.append("not signed")
        transfer_list.append("no date")

    if "vetoed" in check_list:
        transfer_list.append("vetoed")
        i = check_list.index("vetoed")
        transfer_list.append(check_list[i + 1])
    else:
        transfer_list.append("not vetoed")
        transfer_list.append("no date")
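
The "additional logic checks" route could look roughly like the sketch below: a small helper that stops at the first matching cell and only falls back to the "not ..." pair once the whole list has been scanned. It keeps the substring test from the question rather than an exact match. The helper name `find_action`, the sample `cells` data, and the assumption that each date cell sits directly before its action cell are all illustrative and not taken from the actual page; with bs4 the list would be built from `tablebody.select("td")` as above.

```python
def find_action(cells, keyword, missing_label):
    """Return [label, date] for the first cell whose text contains keyword.

    cells is a flat list of cell texts in document order.
    """
    for i, text in enumerate(cells):
        if keyword in text.lower():
            # Assumption: the date is in the cell just before the action.
            date = cells[i - 1] if i > 0 else "no date"
            return [keyword, date]  # stop at the first match
    # Reached only when no cell matched, so the fallback is added once.
    return [missing_label, "no date"]


# Hypothetical sample data standing in for the scraped <td> texts.
cells = [
    "Dec 10, 2019", "signed chap.600",
    "Nov 29, 2019", "delivered to governor",
    "Feb 22, 2019", "referred to codes",
]

transfer_list = []
for keyword, missing in [
    ("delivered to governor", "not delivered"),
    ("signed", "not signed"),
    ("vetoed", "not vetoed"),
]:
    transfer_list.extend(find_action(cells, keyword, missing))

print(transfer_list)
```

Each keyword contributes exactly one pair to `transfer_list`, whether or not it appears in the table. A bare substring test such as "signed" would also match unrelated text like "assigned", so a stricter check may be needed on real data.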
