I have two dataframes (AUS_2016 and df_pitLaps), both grouped by driver ID, that share a column ('lap'). I am trying to subset AUS_2016 so that, for each driver, it does not contain the lap values that appear in df_pitLaps for that driver.
To put it simply, I want to filter out the pit laps (laps with a pit stop) of a particular race for each driver.
I get a dataframe of True values grouped by driver ID, but I'm not sure how to proceed next.
AUS_2016:
df_pitLaps:
def clean_laps_no_pitlaps(data):
    """Filters out the pit laps."""
    df_pitLaps = df_pitStops.loc[df_pitStops['raceId'].isin(data['raceId'])]
    df_pitLaps.groupby("driverId")["lap"]
    data = data.groupby("driverId")["lap"]
    nopitlaps = lambda x: (
        [(lap != pitlap) for pitlap, lap in itertools.izip(x, data)])
    no_pitlaps_in_data = pd.DataFrame(data.apply(nopitlaps))
    return no_pitlaps_in_data
Calling the function:
clean_laps_no_pitlaps(AUS_2016)
This gives the following warning:
DeprecationWarning: elementwise != comparison failed; this will raise an error in the future.
and the dataframe below. I am not sure how to continue from here so that I keep only the laps for each driver that are True (not pit laps).
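A note on why the function above probably misbehaves, offered as a reading of the code rather than a confirmed diagnosis: the result of df_pitLaps.groupby(...) is never assigned, so the pit laps are not used at all, and iterating over a GroupBy object yields (key, group) pairs rather than lap numbers, so each lap ends up being compared with a tuple. In addition, itertools.izip only exists in Python 2. If the groupby structure is to be kept, one possible sketch is to collect each driver's pit laps into a set and filter group by group. The toy frames `laps` and `pits` are hypothetical stand-ins for AUS_2016 and df_pitLaps.

```python
import pandas as pd

# Hypothetical toy data standing in for AUS_2016 and df_pitLaps.
laps = pd.DataFrame({'driverId': [1, 1, 1, 2, 2, 2],
                     'lap':      [1, 2, 3, 1, 2, 3]})
pits = pd.DataFrame({'driverId': [1, 2],
                     'lap':      [2, 3],
                     'stop':     [1, 1]})

# Map each driver to the set of laps on which that driver pitted.
# Iterating a grouped column yields (driverId, Series of laps) pairs.
pit_sets = {driver: set(group)
            for driver, group in pits.groupby('driverId')['lap']}

# Filter each driver's laps against that driver's own pit laps only.
kept = [group[~group['lap'].isin(pit_sets.get(driver, set()))]
        for driver, group in laps.groupby('driverId')]
no_pit = pd.concat(kept)
```

Drivers with no pit stop fall back to an empty set, so all of their laps are kept.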
SOLUTION:
I managed to resolve it with another method that does not need groupby. I "vlooked-up" the laps with pit stops into df using a left merge on driverId and lap, then excluded the rows that found a match.
def no_pitlaps(df, df_pitLaps):
    """Returns a dataframe that excludes the pit laps of each driver"""
    data_pitlaps_mapped = pd.merge(df, df_pitLaps[['driverId', 'stop', 'lap']], how='left',
                                   left_on=['driverId','lap'], right_on=['driverId','lap'])
    return data_pitlaps_mapped.loc[~data_pitlaps_mapped.index.isin(data_pitlaps_mapped.dropna(subset = ['stop']).index)]
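The same left-merge idea can also be written with the `indicator=True` option of `merge`, which labels every row by where it was found. This is only a sketch of an alternative, not the method used above; the function name `no_pitlaps_indicator` and the toy frames are hypothetical.

```python
import pandas as pd

def no_pitlaps_indicator(df, df_pitLaps):
    """Drop rows of df whose (driverId, lap) pair appears in df_pitLaps."""
    # Keep only the key columns, de-duplicated, so the merge cannot multiply rows.
    pit_keys = df_pitLaps[['driverId', 'lap']].drop_duplicates()
    # indicator=True adds a '_merge' column: 'left_only' means no pit stop matched.
    merged = df.merge(pit_keys, on=['driverId', 'lap'], how='left', indicator=True)
    return merged.loc[merged['_merge'] == 'left_only'].drop(columns='_merge')

# Hypothetical toy data standing in for AUS_2016 and df_pitLaps.
laps = pd.DataFrame({'driverId': [1, 1, 1, 2, 2, 2],
                     'lap':      [1, 2, 3, 1, 2, 3]})
pits = pd.DataFrame({'driverId': [1, 2],
                     'lap':      [2, 3],
                     'stop':     [1, 1]})

result = no_pitlaps_indicator(laps, pits)
```

Because only the key columns are merged, no 'stop' column is added to the result and the original columns of df come back unchanged.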
If you only want to exclude from AUS_2016 all the laps that appear in df_pitLaps, simply do the following:
AUS_2016[~AUS_2016['lap'].isin(df_pitLaps['lap'].unique())]
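Note that this one-liner compares lap numbers only, so a lap on which any driver pitted is removed for every driver. If the filter should stay per driver, one possible sketch is to compare (driverId, lap) pairs through a MultiIndex. The toy frames below are hypothetical stand-ins for AUS_2016 and df_pitLaps.

```python
import pandas as pd

# Hypothetical toy data standing in for AUS_2016 and df_pitLaps.
laps = pd.DataFrame({'driverId': [1, 1, 1, 2, 2, 2],
                     'lap':      [1, 2, 3, 1, 2, 3]})
pits = pd.DataFrame({'driverId': [1, 2],
                     'lap':      [2, 3]})

# Lap-only filter: drops laps 2 and 3 for both drivers.
global_filter = laps[~laps['lap'].isin(pits['lap'].unique())]

# Per-driver filter: build (driverId, lap) keys and test membership on the pairs.
lap_keys = pd.MultiIndex.from_frame(laps[['driverId', 'lap']])
pit_keys = pd.MultiIndex.from_frame(pits[['driverId', 'lap']])
per_driver = laps[~lap_keys.isin(pit_keys)]
```

With this data the lap-only filter keeps just lap 1 for each driver, while the per-driver filter removes only lap 2 for driver 1 and lap 3 for driver 2.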