
Add a row to a dataframe based on a condition

I have a table of dates and a condition:

Start       End         Condition
03.10.2022  03.10.2022  0
03.10.2022  04.10.2022  1
03.10.2022  03.10.2022  0

How can I add an extra row, with adjusted dates, wherever the condition is 1, like this:

Start       End         
03.10.2022  03.10.2022  
03.10.2022  03.10.2022
04.10.2022  04.10.2022  
03.10.2022  03.10.2022  

My idea was to use DataFrame.explode, so I first need to add a new column holding a list of values.

I've tried something like this, which resulted in shape errors:

df["new_col"] = np.where(df['Condition'] == 1, 
                         df[['Start', 'End']].values.tolist(),
                         df['Start'])
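
The shape error is expected: np.where needs its three arguments to broadcast against each other, and a condition of shape (3,) cannot broadcast against a list of [Start, End] pairs of shape (3, 2). One way around it is to build the list column row by row and only then call explode. The following is a minimal sketch, not a definitive solution; the column name "day" is made up for illustration, and it assumes that rows with Condition == 0 should keep only their Start date.

```python
import pandas as pd

# Sample data from the question, parsed as day.month.year dates.
df = pd.DataFrame({
    "Start": ["03.10.2022", "03.10.2022", "03.10.2022"],
    "End": ["03.10.2022", "04.10.2022", "03.10.2022"],
    "Condition": [0, 1, 0],
})
for col in ("Start", "End"):
    df[col] = pd.to_datetime(df[col], format="%d.%m.%Y")

# Build one list of days per row: the full Start..End range when
# Condition == 1, otherwise just the Start date (an assumption).
# "day" is a hypothetical helper column.
df["day"] = [
    list(pd.date_range(s, e, freq="D")) if c == 1 else [s]
    for s, e, c in zip(df["Start"], df["End"], df["Condition"])
]

# explode() emits one output row per list element.
out = df.explode("day", ignore_index=True)

# Each exploded row covers a single day, so Start and End are equal.
out["Start"] = out["End"] = pd.to_datetime(out["day"])
out = out[["Start", "End"]]
print(out)
```

The list comprehension sidesteps broadcasting entirely, at the cost of a Python-level loop over the rows.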

This is what I came up with:

import pandas as pd

# df initialization
data = pd.to_datetime('2022/10/03')
df = pd.DataFrame({'start': [data]*3, 'end': [data, data+pd.DateOffset(days=1), data], 'condition': [0, 1, 0]})

# number of rows each original row should become:
# 1, plus the number of days between start and end when condition == 1
s = (df.end - df.start).dt.days * df.condition + 1

# repeat each index label s times
df = df.loc[df.index.repeat(s)].copy()

# shift each repeated row by its position within its group (0, 1, 2, ... days)
add = pd.to_timedelta(df.groupby(level=0).cumcount(), unit='d')
df['start'] = df["start"].add(add)
df['end'] = df["start"]

# restore the default index
df = df.reset_index(drop=True)
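
To see why the repeat-and-offset approach works, it can help to look at the intermediate values on the three sample rows. The sketch below rebuilds the same data and prints each step; the variable names repeats, expanded and offset are chosen here for readability and are not from the answer.

```python
import pandas as pd

day = pd.to_datetime("2022-10-03")
df = pd.DataFrame({
    "start": [day] * 3,
    "end": [day, day + pd.DateOffset(days=1), day],
    "condition": [0, 1, 0],
})

# Number of output rows per input row: 1, plus the day span when condition == 1.
repeats = (df["end"] - df["start"]).dt.days * df["condition"] + 1
print(repeats.tolist())          # [1, 2, 1]

# Repeating the index labels duplicates row 1 once.
expanded = df.loc[df.index.repeat(repeats)]
print(expanded.index.tolist())   # [0, 1, 1, 2]

# Position of each copy within its original row, used as the day offset.
offset = expanded.groupby(level=0).cumcount()
print(offset.tolist())           # [0, 0, 1, 0]
```

Only the second copy of row 1 gets a non-zero offset, which is what moves it from 03.10.2022 to 04.10.2022.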

If your table is in CSV format, you can use the pandas library to read it and extract the columns you need.

import pandas as pd

df = pd.read_csv('table.csv')
df2 = df.iloc[:, 0:2].copy() # keep only the first two columns (drops Condition); copy to avoid SettingWithCopyWarning
df2.loc[len(df2.index)] = ['03.10.2022','03.10.2022'] # append a new row at the end of the dataframe

And voila, df2 is your desired dataframe.
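
When the data does come from a CSV file, the date columns arrive as plain strings, so they need to be parsed before any date comparison or arithmetic. A minimal sketch follows, assuming a comma-separated file with the headers Start, End and Condition; io.StringIO stands in for the hypothetical table.csv so the example runs on its own, and multi_day is an illustrative name.

```python
import io
import pandas as pd

# Stand-in for the hypothetical 'table.csv'; the comma-separated layout is an assumption.
csv_text = """Start,End,Condition
03.10.2022,03.10.2022,0
03.10.2022,04.10.2022,1
03.10.2022,03.10.2022,0
"""

df = pd.read_csv(io.StringIO(csv_text))

# The dates are day.month.year, so state the format explicitly rather than
# letting pandas guess a month-first layout.
for col in ("Start", "End"):
    df[col] = pd.to_datetime(df[col], format="%d.%m.%Y")

# Rows flagged by Condition that span more than one day.
multi_day = df[(df["Condition"] == 1) & (df["End"] > df["Start"])]
print(multi_day)
```

With real datetime columns, the flagged rows can be found from the data instead of being typed in by hand.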

