Compare a row (pandas) with the next row using a for loop, and if not the same get a value from a column

I have this pandas DataFrame:

         full path                               name      time 
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30
3    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40
4    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
5    C:\Users\User\Desktop\Test1\Test2\1.txt    2.txt      10:60

I want to compare each row with the next one. When the name is the same but the path has changed, I want to get the time of the change and the folder the file was moved to. For example, comparing the first row with the second, there is no change in 'name' or 'full path', so it should be skipped. Comparing the second row with the third, the name is the same but the path has changed, so I need the time of the third row and its folder ("10:30" and "Test2") put in a new column.

The desired output is:

         full path                               name      time    time_when_path_changed
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30       10:30 - Test2
3    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40       10:40 - Test1
4    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
5    C:\Users\User\Desktop\Test1\Test2\1.txt    2.txt      10:60       10:60 - Test2

EDITED:

Yes, @erfan, it worked perfectly for the problem I described. However, I wrote the names in order (1 1 1), and with a data frame like the one below, where the names are not in order, it didn't work. I also modified the desired output. Do you have a solution for this as well?

Thanks in advance.

         full path                               name      time 
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
3    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30
4    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40
5    C:\Users\User\Desktop\Test1\Test2\2.txt    2.txt      10:60

Desired output:

         full path                               name      time    moved to "Test2"   moved to "Test1"
0    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:20
1    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:25
2    C:\Users\User\Desktop\Test1\2.txt          2.txt      10:50
3    C:\Users\User\Desktop\Test1\Test2\1.txt    1.txt      10:30       10:30
4    C:\Users\User\Desktop\Test1\1.txt          1.txt      10:40                            10:40
5    C:\Users\User\Desktop\Test1\Test2\2.txt    2.txt      10:60       10:60

We can use the following logic:

  1. full path is not equal to the row before
  2. name is equal to the row before (same group)
  3. If points 1 and 2 are both True, we get the time + the deepest folder of the path
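
import numpy as np

# m1: full path differs from the previous row (row 0 is compared with itself, so it is False)
m1 = df["full path"].ne(df["full path"].shift(1, fill_value=df["full path"].iloc[0]))
# m2: name is the same as in the previous row
m2 = df["name"].eq(df["name"].shift(fill_value=df["name"].iloc[0]))

# Deepest folder: the second-to-last component of the path (n must be a keyword in pandas >= 2.0)
folder = df["full path"].str.rsplit("\\", n=2).str[-2]

df["time_when_path_changed"] = np.where(m1 & m2, df["time"] + " - " + folder, "")
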
                                 full path   name   time  \
0        C:\Users\User\Desktop\Test1\1.txt  1.txt  10:20   
1        C:\Users\User\Desktop\Test1\1.txt  1.txt  10:25   
2  C:\Users\User\Desktop\Test1\Test2\1.txt  1.txt  10:30   
3        C:\Users\User\Desktop\Test1\1.txt  1.txt  10:40   
4        C:\Users\User\Desktop\Test1\2.txt  2.txt  10:50   
5  C:\Users\User\Desktop\Test1\Test2\1.txt  2.txt  10:60   

  time_when_path_changed  
0                         
1                         
2          10:30 - Test2  
3          10:40 - Test1  
4                         
5          10:60 - Test2  
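
For the edited frame, where rows with the same name are not adjacent, comparing against the immediately preceding row no longer works, because m2 is False whenever a different file sits in between. One possible approach, offered as a sketch rather than a tested solution for real data, is to compare each row with the previous row of the same name via groupby(...).shift(). The variable names base, prev_path and moved, and the fixed list of target folders, are assumptions made for this illustration; the data is copied from the edited question.

```python
import numpy as np
import pandas as pd

# Data copied from the edited question (names are interleaved).
base = "C:\\Users\\User\\Desktop\\Test1\\"
df = pd.DataFrame({
    "full path": [base + "1.txt", base + "1.txt", base + "2.txt",
                  base + "Test2\\1.txt", base + "1.txt", base + "Test2\\2.txt"],
    "name": ["1.txt", "1.txt", "2.txt", "1.txt", "1.txt", "2.txt"],
    "time": ["10:20", "10:25", "10:50", "10:30", "10:40", "10:60"],
})

# Previous path seen for the same file name; missing on a name's first row.
prev_path = df.groupby("name")["full path"].shift()

# A move: the name has appeared before and its path differs from last time.
moved = prev_path.notna() & df["full path"].ne(prev_path)

# Deepest folder: the second-to-last component of the path.
folder = df["full path"].str.rsplit("\\", n=2).str[-2]

df["time_when_path_changed"] = np.where(moved, df["time"] + " - " + folder, "")

# One column per destination folder, as in the edited desired output.
# The folder list is hard-coded here purely for illustration.
for target in ["Test2", "Test1"]:
    df[f'moved to "{target}"'] = np.where(moved & folder.eq(target), df["time"], "")

print(df.drop(columns="full path"))
```

This relies on the rows already being in chronological order within each name; if they are not, sorting by time first would be needed.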
