Web scraping into a DataFrame: error when adding a column

Question

My error: "list indices must be integers or slices, not list"

I know there are countless posts about this error, but I've searched and can't figure it out. If there is a solution that I missed, please let me know.

Anyway...

I'm using pandas to scrape stock information from a web page into a DataFrame, and then I want to add two calculated columns at the end:

1. one to calculate the price
2. another, unrelated calculation

The issue I'm seeing occurs when I try to add the first calculated column (the last bit of code):

# Dependencies
from bs4 import BeautifulSoup
import requests
import pandas as pd

url = 'http://www.dividend.com/dividend-stocks/preferred-dividend-stocks.php#stocks&sort_name=Symbol&sort_order=ASC&page=1'
tables = pd.read_html(url)
tables

type(tables)
type(tables[0])
tables[0].head()

tables['Perp Value'] = (1/(tables['Dividend Yield']/100))*tables['Annual Dividend']
tables[0].head()

When I try to add the column 'Perp Value' I get the error. What do I need to change to be able to do my calculation?

For reference, the unformatted data looks like:

   \nDividend Yield\n  \nCurrent Price\n  \nAnnual Dividend\n  \n52-Week High\n  \
0               8.04%            $25.65              $2.06            25.98
1               7.61%            $25.47              $1.94            25.95
2               6.66%            $25.82              $1.72            28.80
3               7.47%            $25.95              $1.94            26.87
4               5.78%            $25.72              $1.49            28.99
5               8.06%            $26.20              $2.11            26.00
6               7.72%            $23.87              $1.84             0.00
7               7.75%            $23.80              $1.84             0.00
8               7.80%            $24.05              $1.88             0.00
9

Answer 1

Try this:

Find the column names:

list(tables[0])

['\nStock Symbol\n',
'\nCompany Name\n',
'\nDividend Yield\n',
'\nCurrent Price\n',
'\nAnnual Dividend\n',
'\n52-Week High\n',
'\n52-Week Low\n']
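The listing above also hints at where the error comes from: `pd.read_html` returns a list of DataFrames, so a column has to be looked up on `tables[0]`, not on `tables` itself. A minimal sketch of that, using a small hand-built stand-in for the scraped list (the two sample rows are copied from the question; stripping the headers is an optional extra step, not part of the original answer):

```python
import pandas as pd

# Stand-in for the list returned by pd.read_html(url); values taken from the question.
tables = [pd.DataFrame({
    "\nDividend Yield\n": ["8.04%", "7.61%"],
    "\nAnnual Dividend\n": ["$2.06", "$1.94"],
})]

# Indexing the list with a column name fails: lists accept only integers or slices.
try:
    tables["Dividend Yield"]
except TypeError as exc:
    error_message = str(exc)

df = tables[0]                        # pick the DataFrame out of the list
df.columns = df.columns.str.strip()   # drop the leading/trailing "\n" from each header
print(error_message)
print(list(df.columns))
```

With the headers stripped, columns can be selected by name (`df["Dividend Yield"]`) instead of by position.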

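Clean the data and convert it to numeric:

# Column 4 is '\nAnnual Dividend\n': strip '$' and ',' and convert to float
a = tables[0][list(tables[0])[4]]
a = a.replace(r'[\$,]', '', regex=True).astype(float)
# Column 2 is '\nDividend Yield\n': strip '%' and ',' and convert to float
b = tables[0][list(tables[0])[2]]
b = b.replace(r'[%,]', '', regex=True).astype(float)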

Output:

(1/a/100)*b

0     0.039029
1     0.039227
2     0.038721
3     0.038505
4     0.038792
5     0.038199
6     0.041957
7     0.042120
8     0.041489
9     0.041649
10    0.038594
11    0.039053
12    0.038795
13    0.037853
14    0.037546
15    0.039100
16    0.039320
17    0.039898
18    0.038431
19    0.040290
dtype: float64
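A possible variation on the cleaning step, in case some scraped cells are blank or not numeric (`astype(float)` would raise on those). This is only a sketch: the helper name `to_number` and the sample values, including the "N/A" cell, are made up for illustration.

```python
import pandas as pd

def to_number(series: pd.Series) -> pd.Series:
    """Strip '$', '%' and ',' and convert to float; unparseable cells become NaN."""
    cleaned = series.str.replace(r"[$,%]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")

# Hypothetical sample values, not scraped data.
sample = pd.Series(["$2.06", "8.04%", "$1,234.50", "N/A"])
numbers = to_number(sample)
print(numbers)
```

Rows that come back as NaN can then be dropped or inspected before doing any arithmetic on the column.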

Assign:

tables[0]["Perp Value"] = (1/a/100)*b
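One thing worth checking before using the result: `(1/a/100)*b` works out to yield / (annual dividend × 100), whereas the formula in the question is annual dividend / (yield / 100). A short sketch comparing the two on the first two rows from the question (the column names here are assumed to be already stripped of their "\n" padding):

```python
import pandas as pd

# First two rows from the question, with cleaned-up header names (an assumption).
df = pd.DataFrame({
    "Dividend Yield": ["8.04%", "7.61%"],
    "Annual Dividend": ["$2.06", "$1.94"],
})

yield_pct = df["Dividend Yield"].str.rstrip("%").astype(float)
annual = df["Annual Dividend"].str.lstrip("$").astype(float)

# Formula as written in the question: annual dividend divided by the yield as a fraction.
df["Perp Value"] = annual / (yield_pct / 100)

# Expression used in the answer above, kept only for comparison.
answer_expr = (1 / annual / 100) * yield_pct

print(df)
print(answer_expr)
```

For the first row the question's formula gives roughly 25.62, close to the listed current price of $25.65, while the answer's expression gives about 0.039, so it is worth picking whichever matches the calculation you actually intend.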

Answer 1: accepted, 0 votes, 2018-08-19 23:05:23