I want to write a piece of pandas code that takes two data frames, Unix and Unix2, compares them, and outputs the ranges of indexes where they differ. For example, at index 1 Unix has 1444311780 and Unix2 has 1444311790. The values differ, so index 1 starts a range. The range ends at the last consecutive unequal index, which here is index 2, where Unix has 1635686040 and Unix2 has 1635686034.
import time
import datetime
import pandas as pd
Unix = pd.DataFrame([1444311600, 1444311780, 1635686040, 1635686200, 1635686220])
Unix2 = pd.DataFrame([1444311600, 1444311790, 1635686034, 1635686200, 1635686230])
Expected Output:
first last
1 2
4
If I understand you correctly, you want to find the start and end index of every unequal streak. Try this:
# Compare Unix to Unix2, row-by-row
s = Unix[0] != Unix2[0]
# Assign the group number. Every time `s` flips from True to False
# or vice-versa, make a new group
t = s.ne(s.shift()).cumsum()
# Filter for the groups whose members are all True
u = t[s]
# For those groups, find the min and the max index of their members
result = u.index.to_series().groupby(u).agg(['min', 'max'])
Output:
   min  max
0
2    1    2
4    4    4
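If you want the columns named first and last as in the expected output, the same steps can be wrapped in a small helper. This is only a sketch: the function name unequal_ranges is made up for illustration, and it assumes both inputs are Series that share the same index.

```python
import pandas as pd


def unequal_ranges(a: pd.Series, b: pd.Series) -> pd.DataFrame:
    """Return the first and last index of each consecutive run where a != b.

    Hypothetical helper; assumes `a` and `b` share the same index.
    """
    # True where the two series differ
    differs = a.ne(b)
    # Start a new run id every time `differs` flips value
    run_id = differs.ne(differs.shift()).cumsum()
    # Keep only the rows that belong to unequal runs
    unequal = run_id[differs]
    # Smallest and largest index label within each run
    ranges = unequal.index.to_series().groupby(unequal).agg(first="min", last="max")
    # Drop the run ids so the rows are numbered 0, 1, ...
    return ranges.reset_index(drop=True)


Unix = pd.DataFrame([1444311600, 1444311780, 1635686040, 1635686200, 1635686220])
Unix2 = pd.DataFrame([1444311600, 1444311790, 1635686034, 1635686200, 1635686230])

result = unequal_ranges(Unix[0], Unix2[0])
print(result.to_dict("list"))  # {'first': [1, 4], 'last': [2, 4]}
```

A streak of length one, such as index 4 here, gets the same value for first and last.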