How to drop multiple rows with certain 1st level and 2nd level index?
I have a dataframe where:
columnA columnB
name timestamp x x
To drop one row in a multiindex dataframe, I have this:
df.drop(my_timestamp, level=1, axis=0, inplace=True)
How can I drop one row with a certain 'name' and 'timestamp' index?
How can I drop multiple rows for one name and a list of timestamps?
While it is typically recommended that each StackOverflow question be limited to a single issue, these two are close enough to being the same that I will provide my solution for both:
Given a df like:
A B
Name Date
AA 2018-01-31 -1 52
BB 2018-02-28 0 94
CC 2018-03-31 6 86
DD 2018-04-30 3 50
EE 2018-05-31 11 60
FF 2018-06-30 9 117
GG 2018-07-31 0 45
HH 2018-08-31 -3 62
# Drop a single row
df.drop('AA', level=0, axis=0, inplace=True)
This removes the Name 'AA' from the dataframe; in fact, it removes every row indexed by 'AA'.
To remove multiple rows you can use:
# Drop several timestamps
df.drop([pd.to_datetime('2018 03 31').date(), pd.to_datetime('2018 07 31').date()], level=1, axis=0, inplace=True)
If several rows share the same level 0 label and you want to remove only one of them by its second-level (level 1) label, pass the full index tuple:
df.drop(('CC', pd.to_datetime('2018 03 31').date()), axis=0, inplace=True)
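The three drop styles above can be compared side by side. This is a minimal sketch, assuming the Date level holds plain `datetime.date` objects (as the `.date()` calls above suggest); the extra 'CC' row and the result names `by_name`, `by_date`, and `by_pair` are made up for illustration.

```python
import datetime as dt

import pandas as pd

# Small frame modeled on the example above; 'CC' gets two dates on purpose.
index = pd.MultiIndex.from_tuples(
    [
        ("AA", dt.date(2018, 1, 31)),
        ("BB", dt.date(2018, 2, 28)),
        ("CC", dt.date(2018, 3, 31)),
        ("CC", dt.date(2018, 4, 30)),
    ],
    names=["Name", "Date"],
)
df = pd.DataFrame({"A": [-1, 0, 6, 3], "B": [52, 94, 86, 50]}, index=index)

# Without inplace=True, drop returns a new frame and leaves df untouched.
by_name = df.drop("AA", level=0, axis=0)  # every row named 'AA'
by_date = df.drop(
    [dt.date(2018, 2, 28), dt.date(2018, 4, 30)], level=1, axis=0
)  # every row with one of these dates, whatever the name
by_pair = df.drop(("CC", dt.date(2018, 3, 31)), axis=0)  # exactly one row
```

Dropping by level matches on that level alone, so `by_date` loses rows from two different names, while the tuple form touches only the ('CC', 2018-03-31) row and keeps the other 'CC' row.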
I'm going to provide an answer based on the following dataframe example (you should have provided one, actually):
columnA columnB
NameA 2016-01-01 12:00:00 p a
2017-01-01 12:00:00 q b
NameB 2018-01-01 12:00:00 r c
NameC 2019-01-01 12:00:00 s d
How can I drop one row with a certain 'name' and 'timestamp' index?
Let's say you want to drop the row with name 'NameA' and timestamp '2017-01-01 12:00:00'; then you could use:
df.drop(('NameA', pd.Timestamp(2017, 1, 1, 12)), axis=0)
output:
columnA columnB
NameA 2016-01-01 12:00:00 p a
NameB 2018-01-01 12:00:00 r c
NameC 2019-01-01 12:00:00 s d
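For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the example frame and drops the single row. The level names "name" and "timestamp" are taken from the question; the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the example frame with a (name, timestamp) MultiIndex.
index = pd.MultiIndex.from_tuples(
    [
        ("NameA", pd.Timestamp(2016, 1, 1, 12)),
        ("NameA", pd.Timestamp(2017, 1, 1, 12)),
        ("NameB", pd.Timestamp(2018, 1, 1, 12)),
        ("NameC", pd.Timestamp(2019, 1, 1, 12)),
    ],
    names=["name", "timestamp"],
)
df = pd.DataFrame(
    {"columnA": list("pqrs"), "columnB": list("abcd")}, index=index
)

# A tuple label addresses one full index entry: (level 0, level 1).
target = ("NameA", pd.Timestamp(2017, 1, 1, 12))
result = df.drop(target, axis=0)  # returns a new frame; df is unchanged
```

Because `inplace=True` is not passed, `df` still has all four rows afterwards and only `result` lacks the dropped one.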
How can I drop multiple rows for one name and a list of timestamps?
You can use pd.MultiIndex.from_product to build a MultiIndex of the rows you want to drop.
Example: you want to drop the two timestamps that belong to 'NameA':
df.drop(
pd.MultiIndex.from_product([
['NameA'],
[pd.Timestamp(2016, 1, 1, 12), pd.Timestamp(2017, 1, 1, 12)]]),
axis=0
)
output:
columnA columnB
NameB 2018-01-01 12:00:00 r c
NameC 2019-01-01 12:00:00 s d
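The same idea as a runnable sketch, assuming the example frame above. `from_product` pairs every name with every timestamp, so with a single name it yields one (name, timestamp) entry per timestamp. The variable names `to_drop` and `result` are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("NameA", pd.Timestamp(2016, 1, 1, 12)),
        ("NameA", pd.Timestamp(2017, 1, 1, 12)),
        ("NameB", pd.Timestamp(2018, 1, 1, 12)),
        ("NameC", pd.Timestamp(2019, 1, 1, 12)),
    ],
    names=["name", "timestamp"],
)
df = pd.DataFrame(
    {"columnA": list("pqrs"), "columnB": list("abcd")}, index=index
)

timestamps = [pd.Timestamp(2016, 1, 1, 12), pd.Timestamp(2017, 1, 1, 12)]

# Cartesian product of one name and the timestamp list:
# ('NameA', 2016-01-01 12:00) and ('NameA', 2017-01-01 12:00).
to_drop = pd.MultiIndex.from_product([["NameA"], timestamps])

result = df.drop(to_drop, axis=0)
```

Note that `drop` raises a `KeyError` by default when a requested entry is not in the index, so this form suits the case where every (name, timestamp) pair is known to exist.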
To drop multiple rows for one name and a list of timestamps, you can use a utility function like:
def drop_multi(df, ind_level_0: str, ind_level_1: list):
    # Drop each (level 0, level 1) pair from df in place, one at a time.
    for ind_1 in ind_level_1:
        df.drop((ind_level_0, ind_1), axis=0, inplace=True)
Then call this function with the desired arguments. In your case:
drop_multi(df,'Name', list_of_timestamps)
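If you would rather not mutate the frame, or if some timestamps in the list might be missing from the index, a boolean mask is one possible alternative. This is only a sketch: `drop_for_name` is a hypothetical helper, not a pandas function, and it assumes level 0 holds the names and level 1 the timestamps.

```python
import pandas as pd


def drop_for_name(df, name, timestamps):
    """Return a copy of df without the rows matching name and any of timestamps."""
    names = df.index.get_level_values(0)
    stamps = df.index.get_level_values(1)
    # True only where both the name and the timestamp match.
    mask = (names == name) & stamps.isin(timestamps)
    return df[~mask]


index = pd.MultiIndex.from_tuples(
    [
        ("NameA", pd.Timestamp(2016, 1, 1, 12)),
        ("NameA", pd.Timestamp(2017, 1, 1, 12)),
        ("NameB", pd.Timestamp(2018, 1, 1, 12)),
        ("NameC", pd.Timestamp(2019, 1, 1, 12)),
    ],
    names=["name", "timestamp"],
)
df = pd.DataFrame(
    {"columnA": list("pqrs"), "columnB": list("abcd")}, index=index
)

list_of_timestamps = [
    pd.Timestamp(2016, 1, 1, 12),
    pd.Timestamp(2017, 1, 1, 12),
    pd.Timestamp(2030, 1, 1, 12),  # not in the index; silently ignored
]
result = drop_for_name(df, "NameA", list_of_timestamps)
```

The mask never looks up labels, so a timestamp that is absent simply matches nothing instead of raising a `KeyError`, and the original `df` is left as it was.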