I have this pandas DataFrame:
full path name time
0 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:20
1 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:25
2 C:\Users\User\Desktop\Test1\Test2\1.txt 1.txt 10:30
3 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:40
4 C:\Users\User\Desktop\Test1\2.txt 2.txt 10:50
5 C:\Users\User\Desktop\Test1\Test2\1.txt 2.txt 10:60
I want to compare each row with the previous one. If the name is the same but the path has changed, I want to get the time of the change and the folder the file was moved to. For example, comparing the first row with the second, neither 'name' nor 'full path' changes, so nothing should be recorded. Comparing the second row with the third, the name is the same but the path has changed, so I need the time of the third row and the new folder ("10:30 - Test2") in a new column.
The desired output is:
full path name time time_when_path_changed
0 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:20
1 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:25
2 C:\Users\User\Desktop\Test1\Test2\1.txt 1.txt 10:30 10:30 - Test2
3 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:40 10:40 - Test1
4 C:\Users\User\Desktop\Test1\2.txt 2.txt 10:50
5 C:\Users\User\Desktop\Test1\Test2\1.txt 2.txt 10:60 10:60 - Test2
EDITED:
Yes, @erfan, it worked perfectly for the problem I described. However, in my example the names were listed in order (1, 1, 1, ...). When I have a DataFrame like the one below, where rows with the same name are not next to each other, it doesn't work. I have also modified the desired output. Do you have a solution for this as well?
Thanks in advance.
full path name time
0 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:20
1 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:25
2 C:\Users\User\Desktop\Test1\2.txt 2.txt 10:50
2 C:\Users\User\Desktop\Test1\Test2\1.txt 1.txt 10:30
3 C:\Users\User\Desktop\Test1\1.txt 1.txt 10:40
5 C:\Users\User\Desktop\Test1\Test2\2.txt 2.txt 10:60
Desired output:
   full path                                name   time   moved to "Test2"  moved to "Test1"
0  C:\Users\User\Desktop\Test1\1.txt        1.txt  10:20
1  C:\Users\User\Desktop\Test1\1.txt        1.txt  10:25
2  C:\Users\User\Desktop\Test1\2.txt        2.txt  10:50
3  C:\Users\User\Desktop\Test1\Test2\1.txt  1.txt  10:30  10:30
5  C:\Users\User\Desktop\Test1\1.txt        1.txt  10:40                    10:40
5  C:\Users\User\Desktop\Test1\Test2\2.txt  2.txt  10:60  10:60
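We can use the following logic:
- full path is not equal to the row before
- name is equal to the row before (same group)
- where both hold, output time + deepest folder of the path

import numpy as np
# m1: path differs from the previous row (the first row is compared with itself, so it is never flagged)
m1 = df["full path"].ne(df["full path"].shift(1, fill_value=df["full path"].iloc[0]))
# m2: name is the same as in the previous row
m2 = df["name"].eq(df["name"].shift(fill_value=df["name"].iloc[0]))
# folder: second-to-last path component (n must be passed by keyword in pandas 2.0+)
folder = df["full path"].str.rsplit("\\", n=2).str[-2]
df["time_when_path_changed"] = np.where(m1 & m2, df["time"] + " - " + folder, "")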
   full path                                name   time   time_when_path_changed
0  C:\Users\User\Desktop\Test1\1.txt        1.txt  10:20
1  C:\Users\User\Desktop\Test1\1.txt        1.txt  10:25
2  C:\Users\User\Desktop\Test1\Test2\1.txt  1.txt  10:30  10:30 - Test2
3  C:\Users\User\Desktop\Test1\1.txt        1.txt  10:40  10:40 - Test1
4  C:\Users\User\Desktop\Test1\2.txt        2.txt  10:50
5  C:\Users\User\Desktop\Test1\Test2\1.txt  2.txt  10:60  10:60 - Test2
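The comparison above looks only at the row directly before, so it relies on rows with the same name being adjacent. For the edited data, where the names are interleaved, one possible approach is to compare each row with the previous row of the same name using groupby. This is only a sketch checked against the sample data: the index is renumbered 0 to 5 here, and the "moved to" column names are an assumption taken from the edited desired output.

```python
import numpy as np
import pandas as pd

# Sample data from the edited question (index renumbered 0..5, an assumption).
df = pd.DataFrame({
    "full path": [
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\2.txt",
        r"C:\Users\User\Desktop\Test1\Test2\1.txt",
        r"C:\Users\User\Desktop\Test1\1.txt",
        r"C:\Users\User\Desktop\Test1\Test2\2.txt",
    ],
    "name": ["1.txt", "1.txt", "2.txt", "1.txt", "1.txt", "2.txt"],
    "time": ["10:20", "10:25", "10:50", "10:30", "10:40", "10:60"],
})

# Path of the previous row with the same name, skipping other names in between.
prev_path = df.groupby("name")["full path"].shift()

# A move: an earlier row exists for this name and its path was different.
moved = prev_path.notna() & df["full path"].ne(prev_path)

# Deepest folder: second-to-last component of the path.
folder = df["full path"].str.rsplit("\\", n=2).str[-2]

# One column per destination folder, holding the time of each move into it.
for dest in folder[moved].unique():
    df[f'moved to "{dest}"'] = np.where(moved & folder.eq(dest), df["time"], "")

print(df)
```

If a single column is preferred, as in the first version of the question, the last loop can be replaced by the same np.where expression as above with `moved` as the condition.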