
Compare each row (pandas) with the next row using a for loop and, if they differ, get a value from a column

I have this pandas DataFrame:

         full path                               name      time 
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30
3    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40
4    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
5    C:\Users\User\Desktop\Test1\Test2\1.txt    2.txt      10:60

I want to compare consecutive rows that have the same name. If the path changes, I need the time of the change and the folder the file was moved to. For example, the first and second rows have the same 'name' and 'full path', so nothing should be recorded. The second and third rows have the same name but a different path, so I need the time of the third row and its folder ("10:30 - Test2") in a new column.

The desired output is:

         full path                               name      time    time_when_path_changed
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30       10:30 - Test2
3    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40       10:40 - Test1
4    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
5    C:\Users\User\Desktop\Test1\Test2\1.txt    2.txt      10:60       10:60 - Test2

EDITED:

Yes, @erfan, it worked perfectly for the problem I described. But there the names appeared in order (1, 1, 1, ...); with a data frame like the one below, where the names are interleaved, it didn't work. I also modified the desired output. Do you have a solution for this as well?

Thanks in advance.

         full path                               name      time 
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
3    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30
4    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40
5    C:\Users\User\Desktop\Test1\Test2\2.txt    2.txt      10:60

Desired output:

         full path                               name      time    moved to "Test2"   moved to "Test1"
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
3    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30       10:30
4    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40                            10:40
5    C:\Users\User\Desktop\Test1\Test2\2.txt    2.txt      10:60       10:60

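We can use the following logic:

  1. full path is not equal to the full path of the row before
  2. name is equal to the name of the row before (same group)
  3. If points 1 and 2 are both True, we take the time + the deepest folder of the path

import numpy as np

# m1: the path differs from the previous row (row 0 is compared with itself, so it is False)
m1 = df["full path"].ne(df["full path"].shift(1, fill_value=df["full path"].iloc[0]))
# m2: the name is the same as in the previous row
m2 = df["name"].eq(df["name"].shift(fill_value=df["name"].iloc[0]))

# folder containing the file: second-to-last component of the path
# (n must be passed by keyword in recent pandas versions)
folder = df["full path"].str.rsplit("\\", n=2).str[-2]

df["time_when_path_changed"] = np.where(m1 & m2, df["time"] + " - " + folder, "")

Output: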
                                 full path   name   time  \
0        C:\Users\User\Desktop\Test1\1.txt  1.txt  10:20   
1        C:\Users\User\Desktop\Test1\1.txt  1.txt  10:25   
2  C:\Users\User\Desktop\Test1\Test2\1.txt  1.txt  10:30   
3        C:\Users\User\Desktop\Test1\1.txt  1.txt  10:40   
4        C:\Users\User\Desktop\Test1\2.txt  2.txt  10:50   
5  C:\Users\User\Desktop\Test1\Test2\1.txt  2.txt  10:60   

  time_when_path_changed  
0                         
1                         
2          10:30 - Test2  
3          10:40 - Test1  
4                         
5          10:60 - Test2  
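
The masks above compare each row with the row directly before it, so they miss a move when rows for different names are interleaved, as in the edited question. One possible way to handle that case, which is not part of the original answer, is to compare each row with the previous row of the same name via groupby("name") and shift(). The sketch below assumes the rows are already in the order in which the events happened and treats time as a plain string; the function name mark_path_changes is made up for illustration.

```python
import numpy as np
import pandas as pd


def mark_path_changes(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rows whose path differs from the previous row with the same name."""
    out = df.copy()
    # Previous path recorded for the same file name, even if other names sit in between.
    prev_path = out.groupby("name")["full path"].shift()
    # A move: an earlier row exists for this name and its path was different.
    moved = prev_path.notna() & out["full path"].ne(prev_path)
    # Folder containing the file: second-to-last component of the Windows-style path.
    folder = out["full path"].str.rsplit("\\", n=2).str[-2]
    out["time_when_path_changed"] = np.where(moved, out["time"] + " - " + folder, "")
    return out


# The interleaved frame from the edited question (index renumbered 0..5).
df = pd.DataFrame({
    "full path": [
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\2.txt",
        r"C:\Users\User\Desktop\Test1\Test2\1.txt",
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\Test2\2.txt",
    ],
    "name": ["1.txt", "1.txt", "2.txt", "1.txt", "1.txt", "2.txt"],
    "time": ["10:20", "10:25", "10:50", "10:30", "10:40", "10:60"],
})

print(mark_path_changes(df))
```

With this frame, rows 3, 4 and 5 are flagged ("10:30 - Test2", "10:40 - Test1", "10:60 - Test2"), because each is compared with the last row that had the same name rather than with its immediate neighbour.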

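The edited question also asks for one column per destination folder ('moved to "Test2"', 'moved to "Test1"') instead of a single combined column. A minimal sketch of that layout follows. It is not from the original answer: it assumes rows are compared in frame order with the previous row of the same name, and the helper name add_moved_to_columns is hypothetical.

```python
import pandas as pd


def add_moved_to_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Add one 'moved to "<folder>"' column per destination folder."""
    out = df.copy()
    # Previous path recorded for the same file name.
    prev_path = out.groupby("name")["full path"].shift()
    moved = prev_path.notna() & out["full path"].ne(prev_path)
    # Folder containing the file: second-to-last path component.
    folder = out["full path"].str.rsplit("\\", n=2).str[-2]
    # Create the columns in order of first appearance of each destination.
    for dest in folder[moved].unique():
        hit = moved & folder.eq(dest)
        # Keep the time where the file moved into this folder, blank elsewhere.
        out[f'moved to "{dest}"'] = out["time"].where(hit, "")
    return out


# The interleaved frame from the edited question (index renumbered 0..5).
df = pd.DataFrame({
    "full path": [
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\2.txt",
        r"C:\Users\User\Desktop\Test1\Test2\1.txt",
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\Test2\2.txt",
    ],
    "name": ["1.txt", "1.txt", "2.txt", "1.txt", "1.txt", "2.txt"],
    "time": ["10:20", "10:25", "10:50", "10:30", "10:40", "10:60"],
})

print(add_moved_to_columns(df))
```

Building the columns from the folders that actually occur avoids hard-coding "Test1" and "Test2", at the cost of a column set that depends on the data.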