
Fill pandas df column with content from a Python list

I have an Excel workbook with multiple sheets. I have created two DataFrames and am using lists to fill in two columns of the first DataFrame (df1). The lists are generated from the second DataFrame (df2). I need to fill in the columns using variables, since the data can change. My code is below, but I do not know how to do this; I am trying a for loop since I need to step through the lists. Is there a better way to do this?

Data Excel sheet:

event  new_start  new_end criteria      old_time_start  \
0       sprint        NaN      NaN       GG 2021-02-01 07:00:00   
1         bike        NaN      NaN       JJ                 NaT   
2          run        NaN      NaN       JJ                 NaT   
3  check point        NaN      NaN       AA 2021-02-01 09:00:00   
4         swim        NaN      NaN       CC                 NaT   
5         walk        NaN      NaN       GG 2021-02-01 13:00:00   
6          jog        NaN      NaN       JJ                 NaT   
7         skip        NaN      NaN       CC                 NaT   
8       stroll        NaN      NaN       AA 2021-02-01 14:00:00   

Time Excel sheet:

 start                 end        dur (min:sec)   event
0 2021-02-01 08:00:00 2021-02-01 08:45:00             10:00:00  Flag A
1 2021-02-01 09:00:00 2021-02-01 09:55:00             01:30:00  Flag C
2 2021-02-01 13:00:00 2021-02-01 13:49:00             16:10:00  Flag A
3 2021-02-01 14:00:00 2021-02-01 14:35:00             05:55:00  Flag B

Code:

import pandas as pd
import os

cur_dir = os.getcwd()
file = cur_dir + "/test_data.xlsx"

print(cur_dir)
# create dfs
df1 = pd.read_excel(file, sheet_name="data", index_col=None)
df2 = pd.read_excel(file, sheet_name="times", index_col=None)

#print(df1)
#print(df2)

# create a list of timestamps from df2, used to fill in data in df1
new_start_list = df2["start"].tolist()
new_end_list = df2["end"].tolist()

# paste timestamp data from new_start_list and new_end_list into df1 columns
# when criteria is present in the column "criteria"

ct = 0
for i in df1:
    if df1.criteria == "GG":
        df1.new_start = new_start_list[ct]
        df1.new_end = new_end_list[ct]
        ct += 1
    elif df1.criteria == "AA":
        df1.new_start = new_start_list[i]
        df1.new_end = new_end_list[i]
        ct += 1

# print out df to see if code works
print(df1)

Screenshot: https://i.stack.imgur.com/hnxEg.png

Try using pandas.DataFrame.fillna (see the pandas reference documentation).

That method will help you replace NaN values. Just be aware of the inplace parameter: if it is set to True, fillna updates your current df instance; otherwise it returns a new copy of the DataFrame and leaves the original unchanged.
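As a rough illustration of how fillna could be applied to the question's data, here is a minimal sketch. It assumes that only rows whose criteria is "GG" or "AA" should receive a timestamp, and that those rows line up one-to-one, in order, with the rows of the times sheet. The two DataFrames are hypothetical stand-ins built inline rather than read from test_data.xlsx, and the names mask, target_index and fill_values are made up for the example.

```python
import pandas as pd

# Hypothetical stand-ins for the "data" and "times" sheets from the question.
df1 = pd.DataFrame({
    "event": ["sprint", "bike", "run", "check point", "swim",
              "walk", "jog", "skip", "stroll"],
    "criteria": ["GG", "JJ", "JJ", "AA", "CC", "GG", "JJ", "CC", "AA"],
})
# Start with empty datetime columns so filling them needs no dtype change.
df1["new_start"] = pd.Series([pd.NaT] * len(df1), dtype="datetime64[ns]")
df1["new_end"] = pd.Series([pd.NaT] * len(df1), dtype="datetime64[ns]")

df2 = pd.DataFrame({
    "start": pd.to_datetime(["2021-02-01 08:00", "2021-02-01 09:00",
                             "2021-02-01 13:00", "2021-02-01 14:00"]),
    "end": pd.to_datetime(["2021-02-01 08:45", "2021-02-01 09:55",
                           "2021-02-01 13:49", "2021-02-01 14:35"]),
})

# Assumption: these are the rows of df1 that should receive a timestamp.
mask = df1["criteria"].isin(["GG", "AA"])
target_index = df1.index[mask]
if len(target_index) != len(df2):
    raise ValueError("matching rows in df1 and rows in df2 differ in count")

for new_col, src_col in [("new_start", "start"), ("new_end", "end")]:
    # A Series passed to fillna is matched on index labels, so re-label the
    # df2 values with the df1 row labels they should land on.
    fill_values = pd.Series(df2[src_col].to_numpy(), index=target_index)
    # fillna returns a new Series by default, so assign the result back.
    df1[new_col] = df1[new_col].fillna(fill_values)

print(df1)
```

Rows with any other criteria stay empty because fill_values has no entry for their index labels. A boolean-mask assignment such as df1.loc[mask, "new_start"] = df2["start"].to_numpy() should give the same result here without fillna; unlike fillna, it also overwrites values that are already present.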


 