Python: adding data to an empty pd.DataFrame

Question

I'm quite new to Python. What I'm trying to do is get data from a website and add part of the webpage to a pandas DataFrame.

This is the code I have so far, but I get an error when adding data to the DataFrame.

The code I have:

url = 'https://oldschool.runescape.wiki/w/Module:Exchange/Anglerfish/Data'
r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

price_data = soup.find_all('span', class_='s1')
df = pd.DataFrame()

for data in price_data:
  a = pd.DataFrame(data.text.split(":")[0],data.text.split(":")[1])
  df.append(a)

print(df)

The error I'm getting:

ValueError                                Traceback (most recent call last)
<ipython-input-33-963d51917cf2> in <module>()
 10 
 11 for data in price_data:
---> 12   a = pd.DataFrame(data.text.split(":")[0],data.text.split(":")[1])
 13   df.append(a)
 14 

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
507                 )
508             else:
--> 509                 raise ValueError("DataFrame constructor not properly called!")
510 
511         NDFrame.__init__(self, mgr, fastpath=True)

ValueError: DataFrame constructor not properly called!

Answer 1

The data structure you get from data.text.split(":")[0],data.text.split(":")[1] does not match what pd.DataFrame() expects. First take a look at the function's documentation to understand what it expects and how to pass data to it properly. You can either pass a dictionary mapping column names to values (the arrays must be of equal length, and if every value is a scalar an index must be specified), or lists/arrays as YOBEN_S proposed, for example:

a = pd.DataFrame({'Column_1': [data.text.split(":")[0]], 'Column_2': [data.text.split(":")[1]]})
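
To make the mismatch concrete, here is a minimal sketch that uses a made-up string in place of data.text (the value "1427328000:1779" is an assumption about what each span contains, not something taken from the page). The first two positional parameters of pd.DataFrame are data and index, so the call in the question passes one bare string as the data and the other as the index, which the constructor rejects.

```python
import pandas as pd

# Hypothetical stand-in for data.text; the real page content may differ.
text = "1427328000:1779"
timestamp, price = text.split(":")

# What the question does: a bare string as `data`, another string as `index`.
error_message = None
try:
    pd.DataFrame(timestamp, price)
except ValueError as exc:
    error_message = str(exc)

# A dict of column name -> list of values builds a one-row frame instead.
row = pd.DataFrame({"timestamp": [timestamp], "price": [price]})

print(error_message)
print(row)
```

Wrapping each value in a list is what tells pandas that the dict describes columns of length one rather than loose scalars.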

Since you are dealing with HTML data, you could also try a different approach using pandas.read_html(); see its documentation for more information.
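
Note that pandas.read_html() parses <table> elements and returns a list of DataFrames, so it only helps when the data really sits in an HTML table. The code in the question reads <span> elements, so it may not apply to that page directly. A minimal sketch on a made-up table follows; the HTML, the helper name tables_from_html, and the values are all hypothetical, and read_html also needs an HTML parser such as lxml installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical HTML for illustration; not taken from the page in the question.
SAMPLE_HTML = """
<table>
  <tr><th>timestamp</th><th>price</th></tr>
  <tr><td>1427328000</td><td>1779</td></tr>
  <tr><td>1427414400</td><td>1802</td></tr>
</table>
"""


def tables_from_html(html):
    """Return one DataFrame per <table> element found in the HTML string."""
    # Wrapping the string in StringIO avoids passing literal HTML directly,
    # which newer pandas versions warn about.
    return pd.read_html(StringIO(html))
```

Calling tables_from_html(SAMPLE_HTML)[0] would then give a frame whose columns come from the header row of the table.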

Answer 2

Fix your code by passing the two values as a single row:

pd.DataFrame([[data.text.split(":")[0],data.text.split(":")[1]]])
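
One detail the line above leaves implicit: the loop in the question also discards the result of df.append(a), which returns a new frame rather than modifying df in place (and DataFrame.append was removed in pandas 2.0). Below is a minimal sketch of the same list-of-rows idea, using hypothetical sample strings instead of the scraped spans and pd.concat to combine the pieces. The column names are an assumption for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the data.text values from the page.
texts = ["1427328000:1779", "1427414400:1802"]

frames = []
for text in texts:
    parts = text.split(":")
    # The outer list holds rows; the inner list is one row with two cells.
    frames.append(
        pd.DataFrame([[parts[0], parts[1]]], columns=["timestamp", "price"])
    )

# Combine the one-row frames and renumber the index from 0.
df = pd.concat(frames, ignore_index=True)
print(df)
```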

Answer 3

I did some more research. The best way for me to do it was:

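import pandas as pd
import requests
from bs4 import BeautifulSoup

# Get the price data from the wiki page
url = 'https://oldschool.runescape.wiki/w/Module:Exchange/Anglerfish/Data'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
price_data = soup.find_all('span', class_='s1')
df = pd.DataFrame(columns=['timestamp', 'price'])

# Append one row per span. DataFrame.append returns a new frame, so the
# result is reassigned to df. It was removed in pandas 2.0, so this loop
# requires pandas < 2.0.
for data in price_data:
  df = df.append({'timestamp': data.text.split(":")[0], 'price': data.text.split(":")[1]}, ignore_index=True)

print(df)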
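
On recent pandas versions the loop above needs a small change, because DataFrame.append was deprecated in 1.4 and removed in 2.0. A common alternative, sketched here with hypothetical sample strings in place of the scraped spans, is to collect plain dicts and build the frame once at the end. The type conversion assumes the timestamps are Unix seconds, which is a guess about the page's data rather than something the thread confirms.

```python
import pandas as pd

# Hypothetical stand-ins for the data.text values from the page.
texts = ["1427328000:1779", "1427414400:1802"]

rows = []
for text in texts:
    # Split once into the part before and after the first colon.
    timestamp, price = text.split(":", 1)
    rows.append({"timestamp": timestamp, "price": price})

# Build the frame in one step instead of appending row by row.
df = pd.DataFrame(rows, columns=["timestamp", "price"])

# Assumption: prices are integers and timestamps are Unix seconds.
df["price"] = df["price"].astype(int)
df["timestamp"] = pd.to_datetime(df["timestamp"].astype("int64"), unit="s")
print(df)
```

Building the frame once also avoids copying the whole DataFrame on every iteration, which is what repeated appends do.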

3 answers

Answer 1: score 1, 2020-07-09 00:25:05
Answer 2: score 0, 2020-07-09 00:23:29
Answer 3: score 0, accepted, 2020-07-09 09:48:27
