pandas add row instead of column
I'm new to pandas, and I'm trying to simply add a row:
import pandas as pd

class Security:
    def __init__(self):
        self.structure = ['timestamp', 'open', 'high', 'low', 'close', 'vol']
        self.df = pd.DataFrame(columns=self.structure)  # index =

    def whats_inside(self):
        return self.df

    """
    Some skipped code...
    """

    def add_data(self, timestamp, open, high, low, close, vol):
        data = [timestamp, open, high, low, close, vol]
        self.df = self.df.append(data)

sec = Security()
print(sec.whats_inside())
sec.add_data('2015/06/01', '1', '2', '0.5', '1', '100')
print(sec.whats_inside())
But the output is:
            0 close high  low open timestamp  vol
0  2015/06/01   NaN  NaN  NaN  NaN       NaN  NaN
1           1   NaN  NaN  NaN  NaN       NaN  NaN
2           2   NaN  NaN  NaN  NaN       NaN  NaN
3         0.5   NaN  NaN  NaN  NaN       NaN  NaN
4           1   NaN  NaN  NaN  NaN       NaN  NaN
5         100   NaN  NaN  NaN  NaN       NaN  NaN
This means I'm adding a column instead of a row. Yes, I've tried googling, but I still don't see how to do this in a simple, Pythonic way.
P.S. I know this is simple, but I'm just missing something important.
There are several ways to add a new row. Perhaps the easiest one (if you want to add the row to the end) is to use loc:
df.loc[len(df)] = ['val_a', 'val_b', .... ]
loc expects an index label. len(df) returns the number of rows in the dataframe, so with the default 0-based index the new row is added at the end of the dataframe.
['val_a', 'val_b', .... ] is a list of the row's values, in the same order as the columns, so the list's length must equal the number of columns; otherwise you will get a ValueError exception. The one exception is when you want all the columns to hold the same value: you can then assign that single value on its own, for example df.loc[len(df)] = 'aa'. (A one-element list such as ['aa'] may, depending on the pandas version, raise the same ValueError when the dataframe has more than one column.)
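As a minimal sketch of the loc approach described above (the column names and values here are made up for illustration and are not from the question), the following appends two rows and shows that a list of the wrong length is rejected:

```python
import pandas as pd

# Hypothetical three-column frame, used only for illustration.
df = pd.DataFrame(columns=["timestamp", "open", "close"])

# With the default index, len(df) is the next unused label,
# so each assignment adds one row at the end.
df.loc[len(df)] = ["2015/06/01", "1", "1.5"]
df.loc[len(df)] = ["2015/06/02", "1.5", "2"]

# A list whose length differs from the number of columns is rejected.
try:
    df.loc[len(df)] = ["2015/06/03", "2"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(df)
print("mismatched row raised:", type(mismatch_error).__name__)
```

The failed assignment leaves the dataframe unchanged, so it still holds the two valid rows.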
NOTE: it is a good idea to always call reset_index before using this method, because if you have ever deleted a row, or are working on a filtered dataframe, the rows' index labels are not guaranteed to be in sync with the number of rows. In that case len(df) may already be an existing label, and the assignment will overwrite that row instead of adding a new one.
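The pitfall in the note can be reproduced with a small sketch. The frame, its values, and the variable names below are assumptions chosen for illustration. After dropping a row, the labels are no longer 0..n-1, so len(df) points at an existing row until the index is reset:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    [["2015/06/01", 1.0], ["2015/06/02", 2.0], ["2015/06/03", 3.0], ["2015/06/04", 4.0]],
    columns=["timestamp", "close"],
)

# Drop the first row: the remaining labels are 1, 2, 3 but len(filtered) is 3.
filtered = df.drop(index=0)

# Without reset_index, label 3 already exists, so this overwrites a row.
clobbered = filtered.copy()
clobbered.loc[len(clobbered)] = ["2015/06/05", 5.0]

# With reset_index(drop=True), the labels become 0, 1, 2 and label 3 is free.
fixed = filtered.reset_index(drop=True)
fixed.loc[len(fixed)] = ["2015/06/05", 5.0]

print(len(clobbered), len(fixed))
```

Passing drop=True discards the old labels instead of keeping them as an extra column, which is usually what you want when the index carries no meaning.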
You should append a Series or a DataFrame. (A Series would be more appropriate in your case.)
import pandas as pd

class Security:
    def __init__(self):
        self.structure = ['timestamp', 'open', 'high', 'low', 'close', 'vol']
        self.df = pd.DataFrame(columns=self.structure)  # index =

    def whats_inside(self):
        return self.df

    """
    Some skipped code...
    """

    def add_data(self, timestamp, open, high, low, close, vol):
        data = [timestamp, open, high, low, close, vol]
        # append a Series, labelled with the column names so the values line up
        # (note: DataFrame.append was removed in pandas 2.0; use pd.concat there)
        self.df = self.df.append(pd.Series(data, index=self.structure), ignore_index=True)
        # or a one-row DataFrame
        # self.df = self.df.append(pd.DataFrame([data], columns=self.structure), ignore_index=True)

sec = Security()
print(sec.whats_inside())
sec.add_data('2015/06/01', '1', '2', '0.5', '1', '100')
sec.add_data('2015/06/02', '1', '2', '0.5', '1', '100')
print(sec.whats_inside())
Output:
    timestamp open high  low close  vol
0  2015/06/01    1    2  0.5     1  100
1  2015/06/02    1    2  0.5     1  100
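On pandas 2.0 and later, where DataFrame.append no longer exists, the same idea can be expressed with pd.concat. The following is only a sketch: add_row is a hypothetical helper name, not something from the answer above, and it builds a one-row DataFrame with the same columns before concatenating.

```python
import pandas as pd

columns = ["timestamp", "open", "high", "low", "close", "vol"]
df = pd.DataFrame(columns=columns)

def add_row(frame, values):
    """Return a new frame with `values` added as the last row (hypothetical helper)."""
    row = pd.DataFrame([values], columns=frame.columns)
    if frame.empty:
        # Nothing to concatenate with yet; the one-row frame is the result.
        return row
    # ignore_index=True renumbers the rows 0..n-1, like append(..., ignore_index=True).
    return pd.concat([frame, row], ignore_index=True)

df = add_row(df, ["2015/06/01", "1", "2", "0.5", "1", "100"])
df = add_row(df, ["2015/06/02", "1", "2", "0.5", "1", "100"])
print(df)
```

Each call copies the whole frame, so when many rows arrive at once it is usually cheaper to collect them in a list and build the DataFrame in one step.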