Case-insensitive filtering on multiple columns in pandas using .loc
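I want to search for values while ignoring differences in case. For example, if I enter "fred", I should still be able to filter all values containing Fred, even though the F is capitalized.
This is what I currently have: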
def find(**kwargs):
    result = data.loc[data.rename(columns={"FirstName": "first",
                                           "LastName": "last",
                                           "City": "city",
                                           })[list(kwargs.keys())]
                      .eq(list(kwargs.values())).all(axis=1)]
    return result
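However, I realized that there is no point in this code where I can use .lower() to force lowercase on both the strings I pass in and the values I am filtering on.
Here is a sample of my data: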
FirstName LastName City
Fred Bob Austin
Billy Bob NYC
When I run my function, I expect this:
find('fred')
Output: Fred Bob Austin
import pandas as pd

data = pd.DataFrame({"FirstName": ['Fred', 'Billy'], 'LastName': ['Bob', 'Bob'], 'City': ['A', 'D']})

def find(**kwargs):
    # Lowercase the selected columns and the search values before comparing,
    # so that the match ignores case on both sides.
    result = data.loc[data.rename(columns={"FirstName": "first",
                                           "LastName": "last",
                                           "City": "city",
                                           })[list(kwargs.keys())]
                      .apply(lambda x: x.str.lower())
                      .eq([v.lower() for v in kwargs.values()]).all(axis=1)]
    return result

print(find(first='fred'))
Returns:
FirstName LastName City
0 Fred Bob A
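The same idea can also be written as an explicit loop, which may be easier to read than one chained expression. This is only a sketch: `find_ci` and `ALIASES` are hypothetical names that do not appear in the question, and the frame is passed in as an argument instead of being read from a global.

```python
import pandas as pd

# Hypothetical alias map: keyword name -> actual column name.
ALIASES = {"first": "FirstName", "last": "LastName", "city": "City"}

def find_ci(df, **kwargs):
    """Return the rows where every given column equals its value, ignoring case."""
    mask = pd.Series(True, index=df.index)
    for key, value in kwargs.items():
        column = ALIASES[key]  # raises KeyError for an unknown keyword
        # Lowercase both the column and the search value before comparing.
        mask = mask & (df[column].str.lower() == value.lower())
    return df[mask]

data = pd.DataFrame({"FirstName": ["Fred", "Billy"],
                     "LastName": ["Bob", "Bob"],
                     "City": ["Austin", "NYC"]})
result = find_ci(data, first="FRED", city="austin")
print(result)
```

Because both sides are lowercased, the caller can type the search value in any case.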
Here are two ways to do what I believe you are asking for, namely case-insensitive filtering on the df columns FirstName, LastName and City by any combination of first, last and city.
Way #1
import pandas as pd

def find(**kwargs):
    # Select the requested columns under their short names and lowercase them.
    df = (data.rename(columns={"FirstName": "first",
                               "LastName": "last",
                               "City": "city",
                               })[list(kwargs.keys())]
              .apply(lambda x: x.str.lower()))
    # Compare each column with its lowercased search value; every one must match.
    mask = df.eq([val.lower() for val in kwargs.values()]).all(axis=1)
    return data[mask]

data = pd.DataFrame({'FirstName': ['Fred', 'Billy'], 'LastName': ['Bob', 'Bob'], 'City': ['Austin', 'NYC']})
Way #2
import pandas as pd
from operator import and_
from functools import reduce

def find(**kwargs):
    df = data.rename(columns={"FirstName": "first",
                              "LastName": "last",
                              "City": "city",
                              })[list(kwargs.keys())]
    # Lowercased search values, indexed by the short column names.
    valsLower = pd.Series([val.lower() for val in kwargs.values()], index=list(kwargs))
    # Build one boolean mask per column and AND them together.
    mask = reduce(and_, (df[col].str.lower() == valsLower[col] for col in df.columns))
    return data[mask]

data = pd.DataFrame({'FirstName': ['Fred', 'Billy'], 'LastName': ['Bob', 'Bob'], 'City': ['Austin', 'NYC']})
Test code:
print('', "data", data, sep='\n')
print('', "first='fred'", find(first='fred'), sep='\n')
print('', "first='fReD'", find(first='fReD'), sep='\n')
print('', "last='bob'", find(last='bob'), sep='\n')
print('', "city='austin'", find(city='austin'), sep='\n')
print('', "first='fred', city='austin'", find(first='fred', city='austin'), sep='\n')
print('', "city='austin', first='fred'", find(city='austin', first='fred'), sep='\n')
print('', "last='bob', city='austin'", find(last='bob', city='austin'), sep='\n')
print('', "first='billy', city='austin'", find(first='billy', city='austin'), sep='\n')
Sample output:
data
FirstName LastName City
0 Fred Bob Austin
1 Billy Bob NYC
first='fred'
FirstName LastName City
0 Fred Bob Austin
first='fReD'
FirstName LastName City
0 Fred Bob Austin
last='bob'
FirstName LastName City
0 Fred Bob Austin
1 Billy Bob NYC
city='austin'
FirstName LastName City
0 Fred Bob Austin
first='fred', city='austin'
FirstName LastName City
0 Fred Bob Austin
city='austin', first='fred'
FirstName LastName City
0 Fred Bob Austin
last='bob', city='austin'
FirstName LastName City
0 Fred Bob Austin
first='billy', city='austin'
Empty DataFrame
Columns: [FirstName, LastName, City]
Index: []
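One caveat with lowercasing is that it is not always enough outside plain ASCII. A rough sketch of a variant that uses `casefold` on both sides instead is given here; `find_casefold` and the `streets` frame are made up for illustration, and the keyword names are assumed to be the real column names.

```python
import pandas as pd

def find_casefold(df, **kwargs):
    """Case-insensitive equality filter using casefold instead of lower."""
    mask = pd.Series(True, index=df.index)
    for column, value in kwargs.items():
        # casefold is a more aggressive form of lower, e.g. it maps 'ß' to 'ss'.
        mask = mask & (df[column].str.casefold() == value.casefold())
    return df[mask]

# Hypothetical data, only to show a case where lower() alone would miss the row.
streets = pd.DataFrame({"Street": ["Hauptstraße", "Main Street"]})
hits = find_casefold(streets, Street="HAUPTSTRASSE")
print(hits)
```

For data like the Fred/Billy example, `lower` and `casefold` give the same result, so this only matters when such characters can occur.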
Use the match function.
import re
from functools import reduce

def find(df, **kwargs):
    # Using AND condition. For an OR condition, change & to | and the
    # initial value True to False.
    # Note: str.match treats each value as a regex anchored at the start of the string.
    cond = reduce(lambda prev, x: prev & df[x[0]].str.match(f'{x[1]}', flags=re.IGNORECASE),
                  kwargs.items(),
                  True)
    return df[cond]

find(df, FirstName='fre', LastName='bob')
# FirstName LastName City
# 0 Fred Bob Austin
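Note that `str.match` only anchors at the start of the string and interprets the search value as a regular expression, so 'fre' matches Fred. If whole-value, literal matching is wanted instead, one possible sketch is to combine `re.escape` with `str.fullmatch` (available in pandas 1.1 and later). The name `find_exact` and the extra "Freddie" row are hypothetical and only there to show the difference.

```python
import re
import pandas as pd

def find_exact(df, **kwargs):
    """Case-insensitive filter that requires the whole value to match literally."""
    mask = pd.Series(True, index=df.index)
    for column, value in kwargs.items():
        pattern = re.escape(value)  # treat the search value as plain text, not a regex
        mask = mask & df[column].str.fullmatch(pattern, case=False, na=False)
    return df[mask]

# Hypothetical data with a second name that shares the prefix "Fred".
df = pd.DataFrame({"FirstName": ["Fred", "Freddie"],
                   "LastName": ["Bob", "Bob"],
                   "City": ["Austin", "NYC"]})

exact = find_exact(df, FirstName="fred")
prefix = df[df["FirstName"].str.match("fred", flags=re.IGNORECASE)]
print(exact)
print(prefix)
```

Here `exact` keeps only the Fred row, while the `str.match` version in `prefix` also keeps Freddie.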