How to convert data of type Pandas to pandas.DataFrame?
I have an object whose type is Pandas, and print(object) gives the output below:
print(type(recomen_total))
print(recomen_total)
Output is:
<class 'pandas.core.frame.Pandas'>
Pandas(Index=12, instrument_1='XXXXXX', instrument_2='XXXX', trade_strategy='XXX', earliest_timestamp='2016-08-02T10:00:00+0530', latest_timestamp='2016-08-02T10:00:00+0530', xy_signal_count=1)
I want to convert this object to a pd.DataFrame. How can I do it?
I tried pd.DataFrame(object) and also from_dict; both throw an error.
Interestingly, it does not convert to a DataFrame directly, but to a Series. Once it is converted to a Series, use the Series' to_frame method to convert it to a DataFrame.
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
                  index=['a', 'b'])
for row in df.itertuples():
    print(pd.Series(row).to_frame())
Hope this helps!
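The Series route above produces a single column with positional labels 0, 1, 2. As a minimal sketch (not the answerer's code), the field names can be passed as the Series index and the result transposed to get a one-row DataFrame. The data below is a hypothetical stand-in for the asker's row; the values are placeholders.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (placeholder values).
df = pd.DataFrame(
    {
        "instrument_1": ["AAA", "BBB"],
        "instrument_2": ["CCC", "DDD"],
        "xy_signal_count": [1, 2],
    },
    index=[12, 13],
)

# One row object, as yielded by itertuples().
recomen_total = next(df.itertuples())

# Label the Series with the namedtuple's field names, then transpose
# the single column into a single row.
row_df = pd.Series(recomen_total, index=list(recomen_total._fields)).to_frame().T
print(row_df)
```

Because the Series holds mixed types, the columns of row_df have dtype object; convert them afterwards if numeric dtypes matter.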
If you want to keep the column names, use the _asdict() method like this:
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
                  index=['a', 'b'])
for row in df.itertuples():
    d = dict(row._asdict())
    print(pd.Series(d).to_frame())
Output:
         0
Index    a
col1     1
col2   0.1
         0
Index    b
col1     2
col2   0.2
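The output above has the field names as row labels. If a one-row DataFrame with the field names as columns is preferred, a small variation (a sketch, not part of the original answer) is to wrap the dictionary in a list and pass it to the DataFrame constructor:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [0.1, 0.2]}, index=["a", "b"])

frames = []
for row in df.itertuples():
    # _asdict() maps field names to values; a list holding one dict
    # is interpreted as one row with the keys as column names.
    frames.append(pd.DataFrame([row._asdict()]))

print(frames[0])
```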
To create a new DataFrame from an itertuples() namedtuple, you can also use list() or Series:
import pandas as pd
# source DataFrame
df = pd.DataFrame({'a': [1,2], 'b':[3,4]})
# empty DataFrame
df_new_fromAppend = pd.DataFrame(columns=['x','y'], data=None)
for r in df.itertuples():
    # create new DataFrame from itertuples() via list() ([1:] for skipping the index):
    df_new_fromList = pd.DataFrame([list(r)[1:]], columns=['c','d'])
    # or create new DataFrame from itertuples() via Series (drop(0) to remove index, T to transpose column to row)
    df_new_fromSeries = pd.DataFrame(pd.Series(r).drop(0)).T
    # or use .loc to append a row to the existing DataFrame ([1:] for skipping the index):
    df_new_fromAppend.loc[df_new_fromAppend.shape[0]] = list(r)[1:]
print('df_new_fromList:')
print(df_new_fromList, '\n')
print('df_new_fromSeries:')
print(df_new_fromSeries, '\n')
print('df_new_fromAppend:')
print(df_new_fromAppend, '\n')
Output:
df_new_fromList:
   c  d
0  2  4

df_new_fromSeries:
   1  2
0  2  4

df_new_fromAppend:
   x  y
0  1  3
1  2  4
To omit the index, use the parameter index=False (but I mostly need the index for the iteration):
for r in df.itertuples(index=False):
    # the [1:] needn't be used, for example:
    df_new_fromAppend.loc[df_new_fromAppend.shape[0]] = list(r)
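When the index is needed during iteration and should also survive in the result, one option is to keep it in the tuples and turn the Index field back into the DataFrame index at the end. This is only a sketch: the filter condition and the index labels 'x' and 'y' are made up for illustration, and it assumes at least one row is kept (with an empty list there is no Index column to set).

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# Keep the index in each tuple (default index=True); hypothetical filter.
rows = [r for r in df.itertuples() if r.a > 1]

# The namedtuple fields become columns; move 'Index' back to the index.
result = pd.DataFrame(rows).set_index("Index")
result.index.name = None  # drop the 'Index' label to match the source frame
print(result)
```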
The following works for me:
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]}, index=['a', 'b'])
for row in df.itertuples():
    row_as_df = pd.DataFrame.from_records([row], columns=row._fields)
    print(row_as_df)
The result is:
  Index  col1  col2
0     a     1   0.1
  Index  col1  col2
0     b     2   0.2
Sadly, AFAIU, there's no simple way to keep the column names without explicitly using "protected attributes" such as _fields (although _fields is a documented part of the namedtuple API; the leading underscore only avoids clashes with field names).
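For comparison, here is a sketch that builds each row both ways: with the explicit columns=row._fields argument and with the plain DataFrame constructor, which reads the field names from the namedtuple when given a list of them. The variable names explicit, implicit and pairs are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [0.1, 0.2]}, index=["a", "b"])

pairs = []
for row in df.itertuples():
    # Column names passed explicitly from the namedtuple's field names.
    explicit = pd.DataFrame.from_records([row], columns=row._fields)
    # Column names inferred by the constructor from the namedtuple.
    implicit = pd.DataFrame([row])
    pairs.append((explicit, implicit))
    print(list(explicit.columns), list(implicit.columns))
```

Both frames end up with the columns Index, col1 and col2, so the explicit argument is mostly a matter of preference.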
With some tweaks to @Igor's answer, I ended up with this satisfactory code, which preserves the column names and uses as little pandas code as possible.
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]})
# Or initialize another dataframe above
# Get list of column names
column_names = df.columns.values.tolist()
filtered_rows = []
for row in df.itertuples(index=False):
    # Some code logic to filter rows
    filtered_rows.append(row)
# Convert pandas.core.frame.Pandas to pandas.core.frame.DataFrame
# Combine filtered rows into a single dataframe
concatenated_df = pd.DataFrame.from_records(filtered_rows, columns=column_names)
concatenated_df.to_csv("path_to_csv", index=False)
The result is a CSV containing:
col1  col2
1     0.1
2     0.2
To convert a list of objects returned by Pandas .itertuples to a DataFrame, while preserving the column names:
import pandas as pd
# Example source DF
data = [['cheetah', 120], ['human', 44.72], ['dragonfly', 54]]
source_df = pd.DataFrame(data, columns=['animal', 'top_speed'])
      animal  top_speed
0    cheetah     120.00
1      human      44.72
2  dragonfly      54.00
Since Pandas does not recommend building DataFrames by adding single rows in a for loop, we will iterate first and build the DataFrame at the end:
WOW_THAT_IS_FAST = 50
list_ = list()
for animal in source_df.itertuples(index=False, name='animal'):
    # keep only the rows above the speed threshold
    if animal.top_speed > WOW_THAT_IS_FAST:
        list_.append(animal)
Now build the DF in a single command, without manually recreating the column names:
filtered_df = pd.DataFrame(list_)
      animal  top_speed
0    cheetah      120.0
1  dragonfly       54.0
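When the filter is a simple comparison like this one, the loop can be skipped altogether. The following is a sketch of a different technique, a boolean mask, rather than the itertuples approach used above. A mask keeps the original index labels (0 and 2 here), while a frame rebuilt from a list of tuples gets a fresh 0..n-1 index, so reset_index(drop=True) is used to match the latter.

```python
import pandas as pd

data = [["cheetah", 120], ["human", 44.72], ["dragonfly", 54]]
source_df = pd.DataFrame(data, columns=["animal", "top_speed"])

WOW_THAT_IS_FAST = 50

# Boolean mask: keeps the matching rows together with their original index labels.
fast_df = source_df[source_df["top_speed"] > WOW_THAT_IS_FAST]

# Renumber the rows from 0 if the original labels are not needed.
fast_df_reset = fast_df.reset_index(drop=True)
print(fast_df_reset)
```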