
I want to attach one df column below another and assign row values based on other columns

I have a dataframe constructed using the code below. I want to append the end and end_code columns underneath the start and start_code columns. The values for Num and Code in the appended rows can be set to zero. My expected output is shown below the code.

Currently I do this with a for loop that goes through each row and manually appends it at the end, but that is very slow.
Thanks

import pandas as pd

a = pd.DataFrame()
a['Num'] = [1, 2, 3, 4]  # not in the original snippet; values taken from the expected output
a['Code'] = ['A', 'B', 'A', 'C']
a['start'] = [pd.to_datetime('07:00'), pd.to_datetime('08:40'), pd.to_datetime('09:00'), pd.to_datetime('10:00')]
a['end'] = [pd.to_datetime('11:45'), pd.to_datetime('12:40'), pd.to_datetime('14:00'), pd.to_datetime('17:00')]
a['start_code'] = [1, 1, 1, 1]
a['end_code'] = [-1, -1, -1, -1]

   Num Code               start  start_code
0    1    A 2019-07-30 07:00:00           1
1    2    B 2019-07-30 08:40:00           1
2    3    A 2019-07-30 09:00:00           1
3    4    C 2019-07-30 10:00:00           1
4    0    0 2019-07-30 11:45:00          -1
5    0    0 2019-07-30 12:40:00          -1
6    0    0 2019-07-30 14:00:00          -1
7    0    0 2019-07-30 17:00:00          -1


Assuming you have:

start_df
Out[1]:
   Num Code               start  start_code
0    1    A 2019-07-30 07:00:00           1
1    2    B 2019-07-30 08:40:00           1
2    3    A 2019-07-30 09:00:00           1
3    4    C 2019-07-30 10:00:00           1

end_df
Out[2]:
                end  end_code
2019-07-30 11:45:00        -1
2019-07-30 12:40:00        -1
2019-07-30 14:00:00        -1
2019-07-30 17:00:00        -1

You can simply concatenate:

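# Rename the end columns so they line up with the start columns, then stack the rows.
# Columns missing from end_df (Num, Code) come out as NaN and are filled with 0.
pd.concat([
    start_df, end_df.rename(columns={'end': 'start', 'end_code': 'start_code'})
], axis=0, sort=False, ignore_index=True).fillna(0)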

Out[3]:
   Num Code               start  start_code
0    1    A 2019-07-30 07:00:00           1
1    2    B 2019-07-30 08:40:00           1
2    3    A 2019-07-30 09:00:00           1
3    4    C 2019-07-30 10:00:00           1
4    0    0 2019-07-30 11:45:00          -1
5    0    0 2019-07-30 12:40:00          -1
6    0    0 2019-07-30 14:00:00          -1
7    0    0 2019-07-30 17:00:00          -1
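
For reference, here is a minimal end-to-end sketch of the concatenation approach, starting from the single frame in the question rather than from two ready-made frames. It makes a few assumptions that are not in the original post: the timestamps are written with the explicit date shown in the output, start_df and end_df are obtained by plain column selection, the result name stacked is illustrative, and Code is filled with the string '0' (rather than the integer 0) so that the column stays a string column.

```python
import pandas as pd

# Rebuild the question's frame. The Num values and the explicit date are
# taken from the expected output shown in the question.
a = pd.DataFrame({
    "Num": [1, 2, 3, 4],
    "Code": ["A", "B", "A", "C"],
    "start": pd.to_datetime(["2019-07-30 07:00", "2019-07-30 08:40",
                             "2019-07-30 09:00", "2019-07-30 10:00"]),
    "end": pd.to_datetime(["2019-07-30 11:45", "2019-07-30 12:40",
                           "2019-07-30 14:00", "2019-07-30 17:00"]),
    "start_code": [1, 1, 1, 1],
    "end_code": [-1, -1, -1, -1],
})

# Split into the two frames the answer assumes (plain column selection).
start_df = a[["Num", "Code", "start", "start_code"]]
end_df = a[["end", "end_code"]]

# Rename the end columns so they align with the start columns, then stack.
stacked = pd.concat(
    [start_df, end_df.rename(columns={"end": "start", "end_code": "start_code"})],
    axis=0, sort=False, ignore_index=True,
)

# Only Num and Code are missing for the appended rows, so fill just those.
stacked = stacked.fillna({"Num": 0, "Code": "0"})
# Num became float while it held NaN; cast it back to integers.
stacked["Num"] = stacked["Num"].astype(int)

print(stacked)
```

Filling per column instead of calling fillna(0) on the whole frame keeps a stray missing timestamp from being silently replaced by 0.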

I believe melt may be a better choice for you:

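import numpy as np

# Recent pandas versions reject a value_name that matches an existing column,
# so melt into a temporary name ('time') and rename it afterwards.
new_df = a.melt(id_vars='Code',
       value_vars=['start','end'],
       var_name='start_code',
       value_name='time').rename(columns={'time': 'start'})

# After melt, start_code holds the source column name ('start' or 'end'); map it to 1 / -1.
new_df['start_code'] = np.where(new_df['start_code'].eq('start'), 1, -1)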

Output:

  Code  start_code               start
0    A           1 2019-07-30 07:00:00
1    B           1 2019-07-30 08:40:00
2    A           1 2019-07-30 09:00:00
3    C           1 2019-07-30 10:00:00
4    A          -1 2019-07-30 11:45:00
5    B          -1 2019-07-30 12:40:00
6    A          -1 2019-07-30 14:00:00
7    C          -1 2019-07-30 17:00:00
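
Note that this output keeps Code on the end rows and has no Num column, whereas the question's expected output zeroes both for the appended rows. Below is a minimal sketch of one way to get that shape with melt. The names kind, time, is_end and result are illustrative and not from the original post, the explicit dates are taken from the displayed output, and Code is set to the string '0' (an assumption) so the column stays a string column.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame. The Num values and the explicit date are
# taken from the expected output shown in the question.
a = pd.DataFrame({
    "Num": [1, 2, 3, 4],
    "Code": ["A", "B", "A", "C"],
    "start": pd.to_datetime(["2019-07-30 07:00", "2019-07-30 08:40",
                             "2019-07-30 09:00", "2019-07-30 10:00"]),
    "end": pd.to_datetime(["2019-07-30 11:45", "2019-07-30 12:40",
                           "2019-07-30 14:00", "2019-07-30 17:00"]),
    "start_code": [1, 1, 1, 1],
    "end_code": [-1, -1, -1, -1],
})

# Melt start/end into one column. Temporary names are used for the new
# columns so they cannot clash with columns that already exist in `a`.
long_df = a.melt(
    id_vars=["Num", "Code"],
    value_vars=["start", "end"],
    var_name="kind",    # holds "start" or "end"
    value_name="time",  # holds the timestamps
)

# Rows that came from the `end` column get code -1 and zeroed identifiers.
is_end = long_df["kind"].eq("end")
long_df["start_code"] = np.where(is_end, -1, 1)
long_df.loc[is_end, "Num"] = 0
long_df.loc[is_end, "Code"] = "0"

# Rename and reorder to match the layout in the question.
result = long_df.rename(columns={"time": "start"})[
    ["Num", "Code", "start", "start_code"]
]

print(result)
```

If the original Num and Code should be kept on the end rows instead, the two .loc assignments can simply be dropped.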

