
Create a new dataframe in pandas with a dynamic name and add a new column

I have a dataframe df:

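import numpy as np
import pandas as pd

# pd.tslib has been removed from pandas; use pd.Timestamp and pd.NaT directly
df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2],
                   't': [pd.Timestamp.now(), pd.Timestamp.now(), pd.NaT]})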

I added a new column:

df['YearMonth'] = df['t'].map(lambda x: 100*x.year + x.month)

Now I want to write a function or macro that does a date comparison, creates a new dataframe, and adds a new column to that dataframe.

I tried the following, but it seems I am going wrong:

def test(df,ym):
    df_new=df
    if(ym <= df['YearMonth']):
        df_new+"_"+ym=df_new
        return df_new+"_"+ym
    df_new+"_"+ym['new_col']=ym

When I call the test function, I want a new dataframe named df_new_201612 to be created, and this new dataframe should have one more column, named new_col, that holds the value of ym in every row.

test(df,201612)

The new dataframe should look like this:

df_new_201612

A   B   ID  t                           YearMonth   new_col
-a  a   1   2016-12-05 12:37:56.374620  201612      201612 
1   NaN 2   2016-12-05 12:37:56.374644  201208      201612 
a   c   2   nat                         nan         201612 

Creating variables with dynamic names is typically bad practice.

I think the best solution for your problem is to store your dataframes in a dictionary and dynamically generate the key used to access each dataframe.

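dict_of_df = {}
for ym in [201511, 201612, 201710]:
    # Build the dictionary key from the year-month value
    key_name = 'df_new_' + str(ym)

    # Store an independent copy so the source df is left untouched
    dict_of_df[key_name] = df.copy()

    # Set new_col only on rows whose YearMonth is earlier than ym
    to_change = df['YearMonth'] < ym
    dict_of_df[key_name].loc[to_change, 'new_col'] = ym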

dict_of_df.keys()
Out[36]: ['df_new_201710', 'df_new_201612', 'df_new_201511']

dict_of_df
Out[37]: 
{'df_new_201511':     A    B  ID                       t  YearMonth  new_col
 0  -a    a   1 2016-12-05 07:53:35.943     201612   201612
 1   1  NaN   2 2016-12-05 07:53:35.943     201612   201612
 2   a    c   2 2016-12-05 07:53:35.943     201612   201612,
 'df_new_201612':     A    B  ID                       t  YearMonth  new_col
 0  -a    a   1 2016-12-05 07:53:35.943     201612   201612
 1   1  NaN   2 2016-12-05 07:53:35.943     201612   201612
 2   a    c   2 2016-12-05 07:53:35.943     201612   201612,
 'df_new_201710':     A    B  ID                       t  YearMonth  new_col
 0  -a    a   1 2016-12-05 07:53:35.943     201612   201710
 1   1  NaN   2 2016-12-05 07:53:35.943     201612   201710
 2   a    c   2 2016-12-05 07:53:35.943     201612   201710}

# Extract a single dataframe
df_2015 = dict_of_df['df_new_201511']
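
The dictionary approach can also be wrapped in a function, which is what the question asks for. The following is a minimal sketch rather than the answer's own code: the helper name add_dated_copy and the fixed sample timestamps are assumptions made for illustration, and new_col is set on every row, as in the output requested in the question.

```python
import numpy as np
import pandas as pd


def add_dated_copy(store, df, ym):
    """Store a copy of df under 'df_new_<ym>' with new_col set to ym.

    Hypothetical helper: the name and signature are illustrative only.
    """
    key_name = f'df_new_{ym}'
    # assign() returns a new DataFrame, so the source df is not modified
    store[key_name] = df.assign(new_col=ym)
    return store[key_name]


# Sample data modelled on the question (timestamps fixed for repeatability)
df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2],
                   't': [pd.Timestamp('2016-12-05'),
                         pd.Timestamp('2012-08-01'),
                         pd.NaT]})
df['YearMonth'] = df['t'].dt.year * 100 + df['t'].dt.month

dict_of_df = {}
for ym in [201511, 201612, 201710]:
    add_dated_copy(dict_of_df, df, ym)

print(dict_of_df['df_new_201612'])
```

Each dataframe is then looked up by its generated key, for example dict_of_df['df_new_201612'], instead of by a dynamically created variable name.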

There is an easier way to accomplish this using the built-in exec function. The following steps create a dataframe at runtime.

1. Create the source dataframe with some sample values.

import numpy as np
import pandas as pd
    
df = pd.DataFrame({'A':['-a',1,'a'], 
                   'B':['a',np.nan,'c'],
                   'ID':[1,2,2]})

2. Assign the new dataframe name to a variable. You can also pass this value in as a parameter or generate it in a loop.

new_df_name = 'df_201612'

3. Use exec to create the dataframe dynamically: the first line copies the data from the source dataframe into the new dataframe, and the second line assigns a value to the new column.

exec(f'{new_df_name} = df.copy()')
exec(f'{new_df_name}["new_col"] = 123') 

4. The dataframe df_201612 is now available in memory, and you can call print together with eval to verify this.

print(eval(new_df_name))
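
One caveat with the steps above: in Python 3, names created by exec inside a function do not reliably become local variables of that function, so the pattern works mainly at module or interactive level. A minimal sketch of one workaround, assuming it is acceptable to keep the generated names in an explicit dictionary, passes that dictionary to exec as its local scope. The name namespace and the list of year-month values are illustrative assumptions, not part of the steps above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2]})

# Illustrative container for the dynamically named dataframes
namespace = {}

for ym in [201611, 201612]:
    new_df_name = f'df_{ym}'
    # The globals dict supplies the inputs; assignments land in `namespace`
    exec(f'{new_df_name} = df.copy()', {'df': df}, namespace)
    exec(f'{new_df_name}["new_col"] = ym', {'ym': ym}, namespace)

print(sorted(namespace))
print(namespace['df_201612'])
```

The generated dataframes are then read back with namespace['df_201612'] rather than eval, which keeps them out of the module's global names.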

