
python 2.7: scraping tables from a website

I am probably going about my scraping the wrong way, since I know little about programming, but I would like to know how to scrape data from an HTML table in Python and associate it with its own class. I don't really know what I'm doing, so here is an example:

<div class="example">
    <a href="/example/thisexample">
      <span class="name">Product name</span>
    </a>
      <table>
        <tbody>
          <tr class="odd"> Some data </tr>
          <tr class="even"> Some data </tr>
          <tr class="odd"> Some data </tr>
          <tr class="even"> Some data </tr>
          <tr class="odd"> More data</tr>
        </tbody>
      </table>
</div>

So far I'm able to collect the data using lxml and place it in a list. However, the webpage contains many classes (like example), and each has a different table with more or fewer rows than the one above. I would like the data from these tables to be associated with the class, which here is the product name. Sorry if this makes little sense; I am new to this and haven't touched Python except for an intro class a couple of years ago.

You said you store the data in lists, but you want them to be associated with the classes you get from the HTML? If I am understanding correctly, store them in a dictionary:

stuff = {}

stuff['class name #1'] = ['data thing #1 from table in class', 'data thing #2 from table in class', .... 'data thing #3 from table in class']
.
.
.
stuff['class name #n'] = ....

This way your "stuff" dictionary stores things in a relational way: each key (the class name) is associated with the data that belongs to it.
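As a rough sketch of how that could look with lxml (which the question already uses), the snippet below builds such a dictionary keyed by product name. The PAGE string, the second product, the `<td>` cells, and the function name scrape_tables are all assumptions made for illustration; the real page may be structured differently, so the XPath expressions would likely need adjusting.

```python
from lxml import html

# Hypothetical page source modelled on the question's snippet. The second
# block and the <td> cells are made up so the example parses predictably.
PAGE = """
<html><body>
<div class="example">
  <a href="/example/thisexample"><span class="name">Product name</span></a>
  <table><tbody>
    <tr class="odd"><td>Some data</td></tr>
    <tr class="even"><td>Some data</td></tr>
    <tr class="odd"><td>More data</td></tr>
  </tbody></table>
</div>
<div class="example">
  <a href="/example/other"><span class="name">Other product</span></a>
  <table><tbody>
    <tr class="odd"><td>Only row</td></tr>
  </tbody></table>
</div>
</body></html>
"""


def scrape_tables(page_source):
    """Return {product name: [row text, ...]} for every div.example block."""
    tree = html.fromstring(page_source)
    stuff = {}
    for block in tree.xpath('//div[@class="example"]'):
        # The product name lives in the span with class "name".
        name = block.xpath('string(.//span[@class="name"])').strip()
        # Collect the text of each row, however many rows this table has.
        rows = [row.text_content().strip() for row in block.xpath('.//table//tr')]
        stuff[name] = rows
    return stuff


stuff = scrape_tables(PAGE)
```

Because each block is processed on its own, tables with different numbers of rows simply produce lists of different lengths under their own product name. Note that two products with the same name would overwrite each other in this sketch.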

Does that make sense? Is that what you are asking?

More about dictionaries here
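For a quick feel of how such a dictionary is used once it is filled, here is a small sketch using only built-in Python. The contents of stuff are placeholder values, not real scraped data, and the second product is made up.

```python
# Hypothetical scraped results: product names map to lists of row text.
stuff = {
    'Product name': ['Some data', 'Some data', 'More data'],
    'Other product': ['Only row'],
}

# Look up one product's rows by its name.
rows = stuff['Product name']

# .get() returns a default instead of raising KeyError for a missing name.
missing = stuff.get('No such product', [])

# Tables may differ in length, so count the rows stored for each product.
row_counts = dict((name, len(data)) for name, data in stuff.items())

# Walk over every product in a stable (alphabetical) order.
for name in sorted(stuff):
    print('%s: %d rows' % (name, len(stuff[name])))
```

The code is written so that it should run on both Python 2.7 and Python 3.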
