Across the entire DataFrame, I need to compare four dates on the same row, find the latest one, and highlight it. The highlighted cell should be the latest date among comp1 through comp4.
The output I need will look like this:
I started by making sure all the comp columns had datetime dtypes. I even tried converting them to objects and comparing them before writing this, but with no luck.
Here is what I have tried or found online; none of it works:
checks.style.highlight_max(color= 'yellow', axis=0)
Nothing gets highlighted
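One thing that may matter here: axis=0 asks for the maximum of each column, while the goal is the maximum of each row, which is axis=1. The sketch below uses a made-up two-column frame (the data and the helper name highlight_rows are hypothetical) to show the difference. It is only an illustration and may not be the whole reason nothing was highlighted.

```python
import pandas as pd

# Hypothetical sample data; the real frame has four comp columns.
checks = pd.DataFrame({
    "comp1": pd.to_datetime(["2021-10-13", "2021-10-15"]),
    "comp2": pd.to_datetime(["2021-10-14", "2021-10-12"]),
})

col_max = checks.max(axis=0)  # one maximum per column (what axis=0 compares)
row_max = checks.max(axis=1)  # one maximum per row (what the question needs)

def highlight_rows(df):
    """Return a Styler that highlights the latest date in each row."""
    return df.style.highlight_max(color="yellow", axis=1)
```

In a notebook, highlight_rows(checks) would need to be the last expression in the cell (or passed to display) for the styling to render.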
I also tried to use a subset, but no matter how often I check the dtypes of each comp column, they do not stay datetime or object; they become float for some reason.
checks.style.highlight_max(color= 'yellow', axis=0, subset=['CAC Clearance', 'ASB Results Received','Arch Assessment','Bio Assessment'])
This is the error I get, even though all the columns are datetimes before I run it:
TypeError: '>=' not supported between instances of 'float' and 'datetime.date'
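The message mentions both 'float' and 'datetime.date', which would be consistent with object columns that mix datetime.date values with NaN (a float) for missing entries. That is a guess about the data, not something confirmed above. Under that assumption, a minimal sketch is to coerce each column with pd.to_datetime so missing values become NaT, then compare row by row. Only two of the four column names are used, with invented values.

```python
import datetime as dt

import numpy as np
import pandas as pd

cols = ["CAC Clearance", "ASB Results Received"]

# Hypothetical data: object columns holding datetime.date values and NaN.
checks = pd.DataFrame({
    "CAC Clearance": [dt.date(2021, 10, 13), np.nan],
    "ASB Results Received": [dt.date(2021, 10, 14), dt.date(2021, 10, 12)],
})

# Coerce each column to datetime64; missing or unparseable values become NaT.
for col in cols:
    checks[col] = pd.to_datetime(checks[col], errors="coerce")

# Row-wise latest date (NaT is skipped) and a mask of the cells that hold it.
latest = checks[cols].max(axis=1)
is_latest = checks[cols].eq(latest, axis=0)
```

With the columns in a real datetime dtype, the same subset could then be passed to checks.style.highlight_max(color='yellow', axis=1, subset=cols).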
Lastly, I tried a groupby on the ID, and I cannot get that to work either.
Example data using print(checks.head().to_records())/print(checks.head().to_dict())
Output (I can only share certain information for now, so timestamps only):
TypeError Traceback (most recent call last) in ----> 1 print(checks.head().to_records())/print(checks.head().to_dict())
TypeError: unsupported operand type(s) for /: 'NoneType' and 'NoneType'
1st print example:
'2021-10-13T00:00:00.000000000', '2021-10-13T00:00:00.000000000')
2nd print example:
Timestamp('2021-10-13 00:00:00'), 4: Timestamp('2021-10-13 00:00:00')}, 'Bio Assessment': {0: Timestamp('2021-10-13 00:00:00'), 1: Timestamp('2021-10-14 00:00:00'), 2: Timestamp('2021-10-13 00:00:00'), 3: Timestamp('2021-10-13 00:00:00'), 4: Timestamp('2021-10-13 00:00:00')}}
I figured it out.
First, I had to copy my df to stop the copy warning.
Then I used this code to convert all my datetimes to strings and fill the NaT values with "0". This was the only way I could compare without a str/int vs. datetime/Timestamp error:
checks['comp1'] = checks['comp1'].dt.strftime('%Y-%m-%d').fillna("0")
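The line above handles comp1 only; presumably the same conversion was applied to the other comps. A small sketch of doing it in a loop follows, using a hypothetical two-column frame. It relies on the fact that ISO-formatted date strings (YYYY-MM-DD) sort in the same order as the dates themselves, and that "0" sorts before any such string.

```python
import pandas as pd

# Hypothetical sample data with one missing date.
checks = pd.DataFrame({
    "comp1": pd.to_datetime(["2021-10-13", None]),
    "comp2": pd.to_datetime(["2021-10-14", "2021-10-12"]),
})

checks = checks.copy()  # work on a copy to avoid the copy warning

for col in ["comp1", "comp2"]:
    # NaT formats to a missing value, which fillna replaces with "0".
    checks[col] = checks[col].dt.strftime("%Y-%m-%d").fillna("0")

# String comparison now mirrors date order, with "0" as the smallest value.
comp2_later = checks["comp2"] > checks["comp1"]
```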
I tried the highlight style from the original post, but only a few dates were highlighted, so I wrote this long function instead. It works.
Side note: it looks like I'm comparing the same comps twice, but any other way some comparisons failed because some rows had a blank comp1, comp2, etc. The function would then fall straight through to the second matching section.
Not all the data is filled out for this contract, but the latest date was needed for over 600,000 records, each with 1-4 comps.
def find_lastest_date(df, comp1, comp2, comp3, comp4):
    # compares comp1 to all other comps
    if ((df[comp1] > df[comp2]) & (df[comp1] > df[comp3]) & (df[comp1] > df[comp4])):
        return 'comp1 Latest Date'
    # compares comp2 to all other comps
    elif ((df[comp2] > df[comp1]) & (df[comp2] > df[comp3]) & (df[comp2] > df[comp4])):
        return "comp2 Latest Date"
    # compares comp3 to all other comps
    elif ((df[comp3] > df[comp1]) & (df[comp3] > df[comp2]) & (df[comp3] > df[comp4])):
        return 'comp3 Latest Date'
    # compares comp4 to all other comps
    elif ((df[comp4] > df[comp1]) & (df[comp4] > df[comp2]) & (df[comp4] > df[comp3])):
        return 'comp4 Latest Date'
    # Comp matches
    # All comps == "0": leave blank
    elif ((df[comp1] == "0") & (df[comp2] == "0") & (df[comp3] == "0") & (df[comp4] == "0")):
        return ""
    # All comps match
    elif ((df[comp1] == df[comp2]) & (df[comp1] == df[comp3]) & (df[comp1] == df[comp4])):
        return "Lastest Date has Matches"
    # comparing 3 comp matches
    # comp1 matches only comp2 & comp3 | comp1 matches 3 & 4
    elif ((df[comp1] == df[comp2]) & (df[comp1] == df[comp3])) | ((df[comp1] == df[comp3]) & (df[comp1] == df[comp4])):
        return "Lastest Date has Matches"
    # comp2 matches only comp1 & comp3 | comp2 matches 3 & 4
    elif ((df[comp1] == df[comp2]) & (df[comp2] == df[comp3])) | ((df[comp2] == df[comp3]) & (df[comp2] == df[comp4])):
        return "Lastest Date has Matches"
    # comp3 matches only comp1 & comp2 | comp3 matches 2 & 4
    elif ((df[comp3] == df[comp1]) & (df[comp3] == df[comp2])) | ((df[comp3] == df[comp2]) & (df[comp3] == df[comp4])):
        return "Lastest Date has Matches"
    # comp4 matches only comp1 & comp2 | comp4 matches 2 & 3
    elif ((df[comp4] == df[comp1]) & (df[comp4] == df[comp2])) | ((df[comp4] == df[comp2]) & (df[comp3] == df[comp4])):
        return "Lastest Date has Matches"
    # 2 comps match
    # comp1 matches any other comp
    elif ((df[comp1] == df[comp2]) | (df[comp1] == df[comp3]) | (df[comp1] == df[comp4])):
        return "Lastest Date has Matches"
    # comp2 matches any other comp
    elif ((df[comp2] == df[comp1]) | (df[comp2] == df[comp3]) | (df[comp2] == df[comp4])):
        return "Lastest Date has Matches"
    # comp3 matches any other comp
    elif ((df[comp3] == df[comp1]) | (df[comp3] == df[comp2]) | (df[comp3] == df[comp4])):
        return "Lastest Date has Matches"
    else:
        return ""
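For comparison, here is a shorter sketch that aims for the same three outcomes (a single latest comp, a tie for the latest date, or nothing filled in) by working on datetime columns directly rather than on strings. It is an alternative, not the function above: the helper name label_latest and the sample frame are hypothetical, and the tie label is spelled "Latest" rather than "Lastest".

```python
import pandas as pd

def label_latest(df, cols):
    """Label each row with the column holding the latest date, or flag a tie."""
    # Coerce to datetime64 so blanks become NaT instead of needing a "0" filler.
    dates = df[cols].apply(pd.to_datetime, errors="coerce")
    latest = dates.max(axis=1)              # NaT only when every comp is blank
    is_latest = dates.eq(latest, axis=0)    # True where a cell equals the row max
    n_latest = is_latest.sum(axis=1)        # how many comps share the row max
    winner = is_latest.idxmax(axis=1)       # first column holding the row max
    labels = [
        "" if n == 0
        else f"{w} Latest Date" if n == 1
        else "Latest Date has Matches"
        for n, w in zip(n_latest, winner)
    ]
    return pd.Series(labels, index=df.index)

# Hypothetical sample: a unique latest date, a tie, and an empty row.
sample = pd.DataFrame({
    "comp1": ["2021-10-13", "2021-10-14", None],
    "comp2": ["2021-10-14", "2021-10-14", None],
})
result = label_latest(sample, ["comp1", "comp2"])
```

Because the row maximum and the equality mask are computed column-wise for the whole frame, this avoids calling a Python function once per row for the comparisons, which may help at 600,000 records.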