Python pandas: append DataFrames in a loop
I am trying to append many DataFrames to one empty DataFrame, but it is not working. I am following a tutorial, and my code looks like this.
I generate a frame inside a loop; the code for that is:
def loop_single_symbol(p1):
    i = 0
    delayedPrice = []
    symbol = []
    while i<5 :
        print(p1)
        h = get_symbol_data(p1)
        delayedPrice.append(h['delayedPrice'])
        symbol.append(h['symbol'])
        i+=1
    df = pd.DataFrame([], columns = [])
    df["delayedPrice"] = delayedPrice
    df["symbol"] = symbol
    df["time"] = get_nyc_time()
    return df
    time.sleep(4)
This code generates a frame like this:
delayedPrice symbol time
0 30.5 BAC 6:6
1 30.5 BAC 6:6
2 30.5 BAC 6:6
3 30.5 BAC 6:6
4 30.5 BAC 6:6
And I am running a loop like this:
length = len(symbol_list())
data = ["BAC","AAPL"]
df = pd.DataFrame([], columns = [])
for j in range(length):
    u = data[j]
    if h:
        df_of_single_symbol = loop_single_symbol(u)
        print(df_of_single_symbol)
        df.append(df_of_single_symbol, ignore_index = True)
print(df)
I am trying to append two or more DataFrames to one empty DataFrame, but with the above code I get:
Empty DataFrame
Columns: []
Index: []
And I want a result like this:
delayedPrice symbol time
0 30.5 BAC 6:6
1 30.5 BAC 6:6
2 30.5 BAC 6:6
3 30.5 BAC 6:6
4 30.5 BAC 6:6
0 209.15 AAPL 6:6
1 209.15 AAPL 6:6
2 209.15 AAPL 6:6
3 209.15 AAPL 6:6
4 209.15 AAPL 6:6
How can I do this using pandas, and what is the best way to do it?
Note: the line h = get_symbol_data(p1) fetches some data from an API.
As I mentioned in my comment, appending to pandas DataFrames is not considered a very good approach. Instead, I suggest that you use something more appropriate to store the data, such as a file, or a database if you want scalability.
Then you can use pandas for what it is built for, i.e. data analysis, by reading the contents of the database or the file into a DataFrame.
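As a rough illustration of the file-based idea, here is a minimal sketch that appends each row to a CSV file as it arrives and loads the file into pandas only at the end. The rows, the column order, and the temporary file path are assumptions made up for this example; in the question they would come from get_symbol_data and get_nyc_time.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical rows standing in for values fetched from the API.
rows = [
    {"delayedPrice": 30.5, "symbol": "BAC", "time": "6:6"},
    {"delayedPrice": 209.15, "symbol": "AAPL", "time": "6:6"},
]

# Assumed location: a fresh temporary directory, so the file starts out absent.
path = os.path.join(tempfile.mkdtemp(), "prices.csv")
fieldnames = ["delayedPrice", "symbol", "time"]

# Append each row to the file as it arrives; write the header only once.
for row in rows:
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow(row)

# Load everything into pandas only when it is time to analyse it.
prices = pd.read_csv(path)
print(prices)
```

The loop never touches a DataFrame, so its cost per row stays constant no matter how much data has already been collected.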
Now, if you really want to stick with growing a DataFrame inside the loop, I suggest either join or concat to grow your DataFrame as you get more data.
[EDIT]
Example (from one of my scripts):
results = pd.DataFrame()
for result_file in result_files:
    df = parse_results(result_file)
    results = pd.concat([results, df], axis=0).reset_index(drop=True)
parse_results is a function that takes a filename and returns a DataFrame formatted in the right way; it is up to you to make it fit your needs.
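Because pd.concat builds a new DataFrame on every call, calling it once per iteration copies the accumulated data each time. A common variation, sketched here, collects the frames in a plain list and concatenates once after the loop. The parse_results stub and the file names are placeholders invented for this sketch, not the real function from the script above.

```python
import pandas as pd


def parse_results(name):
    # Hypothetical stand-in: returns a small DataFrame for the given name.
    return pd.DataFrame({"source": [name, name], "value": [1, 2]})


result_files = ["a.txt", "b.txt"]  # assumed file names

# Collect the per-file frames first, then concatenate a single time.
frames = [parse_results(result_file) for result_file in result_files]
results = pd.concat(frames, ignore_index=True)
print(results)
```

Passing ignore_index=True renumbers the rows from 0, which has the same effect as the reset_index(drop=True) call in the example above.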
As the comments stated, your original error is that you didn't assign the result of the df.append call to a variable: it returns the appended (new) DataFrame rather than modifying df.
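To make the fix concrete, here is a minimal sketch of the question's outer loop with the result assigned back to df. DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the sketch uses pd.concat, which follows the same rule: it returns a new DataFrame that has to be assigned. The loop_single_symbol stub and its prices are assumptions that replace the API calls from the question.

```python
import pandas as pd


def loop_single_symbol(symbol):
    # Hypothetical stub replacing the API calls in the question.
    prices = {"BAC": 30.5, "AAPL": 209.15}
    return pd.DataFrame({
        "delayedPrice": [prices[symbol]] * 5,
        "symbol": [symbol] * 5,
        "time": ["6:6"] * 5,
    })


data = ["BAC", "AAPL"]
df = pd.DataFrame()
for u in data:
    df_of_single_symbol = loop_single_symbol(u)
    # Rebind df: like the old df.append, pd.concat returns a new DataFrame.
    df = pd.concat([df, df_of_single_symbol])
print(df)
```

Without ignore_index=True each block keeps its own 0 to 4 row labels, which matches the result the question asks for.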
For anyone else looking to "extend" a DataFrame in place (without an intermediate database, list, or dictionary), here is a hint showing how to do this simply:
Pandas adding rows to df in loop
Basically, start with your empty DataFrame, already set up with the correct columns, then use df.loc[ ] indexing to assign the new row of data to the end of the DataFrame, where len(df) points just past the end of the DataFrame. It looks like this:
df.loc[ len(df) ] = ["my", "new", "data", "row"]
More detail in the linked hint.
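A small runnable sketch of the df.loc[len(df)] pattern, using the column names from the question. The two rows are made-up values standing in for API results.

```python
import pandas as pd

# Start from an empty DataFrame that already has the right columns.
df = pd.DataFrame(columns=["delayedPrice", "symbol", "time"])

# Hypothetical rows standing in for API results.
for price, symbol in [(30.5, "BAC"), (209.15, "AAPL")]:
    # len(df) is one past the last row label, so this adds a new row.
    df.loc[len(df)] = [price, symbol, "6:6"]

print(df)
```

This relies on the default 0, 1, 2, ... row labels. If the index has been changed so that len(df) is already an existing label, the assignment overwrites that row instead of adding one.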