Iterate through columns of pandas dataframe and create a new dataframe for each selected column in a loop
I have a pandas dataframe with multiple columns. I am trying to iterate through it by selecting one column at a time, creating a new dataframe from that one column, and performing some functions on it. I then want to select the next column, perform the same functions, and continue until I reach the last column in the dataframe.
Currently, I am doing this with only one column. I am stuck on how to do it in a loop and run the functions inside that loop. Could someone please help me with how to iterate through the columns in a loop, create a new dataframe for each selected column, and run the functions inside that loop?
df:
date                 Col1  Col2  Col3  Col4
1990-01-02 12:00:00  24    24    24.8  24.8
1990-01-02 01:00:00  59    58    60    60.3
1990-01-02 02:00:00  43.7  43.9  48    49
Code:
df_new = pd.DataFrame(df['Col1'])
df.reset_index(inplace=True)

def function1(df_new):
    # line 1
    # line 2

def function2():
    # line 1
    # line 2
The answer I am looking for is something like the code below, where I just iterate over the columns and perform the same set of functions on each. Is there a better way to do this?
for col in df.columns:
    # double brackets select a one-column DataFrame; reset_index() returns a new frame
    col_df = df[[col]].reset_index()
    # perform functions on col_df
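For reference, here is a self-contained sketch of that loop using the sample data from the question. It assumes `date` is the index of `df`; `summarize` is a hypothetical stand-in for the real functions, and the `single_column_frames` dictionary is only one possible way to keep the per-column frames.

```python
import pandas as pd

# Sample data copied from the question; 'date' is assumed to be the index.
df = pd.DataFrame(
    {
        "Col1": [24, 59, 43.7],
        "Col2": [24, 58, 43.9],
        "Col3": [24.8, 60, 48],
        "Col4": [24.8, 60.3, 49],
    },
    index=pd.to_datetime(
        ["1990-01-02 12:00:00", "1990-01-02 01:00:00", "1990-01-02 02:00:00"]
    ),
)
df.index.name = "date"


def summarize(col_df):
    """Hypothetical stand-in for function1/function2: mean of the data column."""
    return col_df.iloc[:, -1].mean()


single_column_frames = {}
for col in df.columns:
    # Double brackets keep a DataFrame; reset_index() returns a new frame
    # with 'date' moved back into a regular column.
    col_df = df[[col]].reset_index()
    single_column_frames[col] = col_df
    print(col, summarize(col_df))
```

Each entry in `single_column_frames` is an independent two-column frame (`date` plus the selected column), so the original `df` is left unchanged.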
If you insist on iterating through the columns, you get a Series for each column, in which case I don't see the added value of converting it to a DataFrame first.
Instead, perform the functions on each Series:
def Add(col):
    return col + 1

def Minus(col):
    return col - 1

def Double(col):
    return col * 2

for col in df.columns:
    print(Add(df[col]))
    Minus(df[col])
    Double(df[col])
Be sure to save the results if you want to do further manipulations with them after the loop has finished.
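One way to keep the results is to collect them in a dictionary keyed by column name. This is only a sketch: the small frame and its values are made up for illustration, and the `results` layout is an assumption, not something prescribed by pandas.

```python
import pandas as pd

# Small numeric frame for illustration only (values are made up).
df = pd.DataFrame({"Col1": [1, 2, 3], "Col2": [10, 20, 30]})


def Add(col):
    return col + 1


def Double(col):
    return col * 2


# Store each function's output per column so it survives the loop.
results = {}
for col in df.columns:
    results[col] = {"add": Add(df[col]), "double": Double(df[col])}

# Rebuild a DataFrame from one set of saved results.
added = pd.DataFrame({col: res["add"] for col, res in results.items()})
print(added)
```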
However, I advise looking at other possibilities instead, for example using apply() and a lambda:
df.apply(lambda x: x + 1, axis=0)
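This is more concise than an explicit loop. Note that apply() with axis=0 still calls the function once per column, so for simple arithmetic a fully vectorized expression such as df + 1 is usually the most efficient option.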
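As a quick check, the sketch below compares `apply()` with plain whole-frame arithmetic on a small made-up frame. The names `via_apply` and `via_vectorized` are illustrative only.

```python
import pandas as pd

# Small numeric frame for illustration only (values are made up).
df = pd.DataFrame({"Col1": [1, 2, 3], "Col2": [10, 20, 30]})

# apply() with axis=0 passes each column (a Series) to the lambda in turn.
via_apply = df.apply(lambda x: x + 1, axis=0)

# Arithmetic on the whole DataFrame needs no Python-level function call per column.
via_vectorized = df + 1

# Compare the two results element by element.
print(via_apply.equals(via_vectorized))
```

For functions that cannot be written as whole-frame arithmetic, `apply()` or an explicit loop over `df.columns` remains a reasonable choice.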