I have been trying to solve this for a while now but have not gotten anywhere. My goal is to search for a string in a column called 'WORDS' and return the corresponding 'INDEXED_NUMBER'. For example, if I search for 'aaa', it should return 0, as shown in the table below.
I am using Python with pandas, and possibly NumPy as well. Below is a sample of the code I've tried:
def WordToIndexwithjustPanda():
    referenceDF[referenceDF['WORDS'].str.contains('aaa')]
    # I was hoping that it will grab me the row with the word 'aaa' but
    # it is not returning me anything
and
def WordToIndexwithNumpy():
    np.where(referenceDF["WORDS"].str.contains('aaa'))
    # I think this is wrong but I am not sure how is this wrong
I hope you can guide me to the right way of doing this. As an additional note, I am using Anaconda Prompt and Jupyter Notebook, and I have imported pandas and numpy.
Thanks in advance. XD
Use loc with boolean indexing, and don't forget to add return to the function. To return a scalar, use iat to select the first value of the filtered Series, with an if-else to handle the case where the filter returns no rows:
def WordToIndexwithjustPanda():
    a = referenceDF.loc[referenceDF['WORDS'].str.contains('aaa'), 'INDEXED_NUMBER']
    return 'No match' if a.empty else a.iat[0]
You can also add a parameter to the function, so that it returns the first occurrence of whatever value you pass in:
referenceDF = pd.DataFrame({
    'WORDS': ['aaa', 'aaas', 'aactive', 'aadvantage', 'aaker'],
    'INDEXED_NUMBER': list(range(5))
})
print (referenceDF)
   INDEXED_NUMBER       WORDS
0               0         aaa
1               1        aaas
2               2     aactive
3               3  aadvantage
4               4       aaker
def WordToIndexwithjustPanda(val):
    a = referenceDF.loc[referenceDF['WORDS'].str.contains(val), 'INDEXED_NUMBER']
    return 'No match' if a.empty else a.iat[0]
print (WordToIndexwithjustPanda('aaa'))
0
print (WordToIndexwithjustPanda('bbb'))
No match
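One thing to keep in mind with the function above: str.contains does substring matching and treats the pattern as a regular expression by default, so 'aa' would match every row of the sample and return the first one. If a whole-word lookup is what you actually need, a minimal sketch could look like the following. The helper names word_to_index_exact and word_to_index_literal are made up for illustration, and the data is the same sample as above.

```python
import pandas as pd

# Same sample data as in the answer above.
referenceDF = pd.DataFrame({
    'WORDS': ['aaa', 'aaas', 'aactive', 'aadvantage', 'aaker'],
    'INDEXED_NUMBER': list(range(5)),
})

def word_to_index_exact(val, df=referenceDF):
    # Hypothetical helper: exact equality instead of substring matching.
    a = df.loc[df['WORDS'] == val, 'INDEXED_NUMBER']
    return 'No match' if a.empty else a.iat[0]

def word_to_index_literal(val, df=referenceDF):
    # Hypothetical helper: still a substring search, but regex=False treats
    # val as literal text and na=False counts missing values as non-matches.
    mask = df['WORDS'].str.contains(val, regex=False, na=False)
    a = df.loc[mask, 'INDEXED_NUMBER']
    return 'No match' if a.empty else a.iat[0]

print(word_to_index_exact('aaas'))    # 1
print(word_to_index_exact('aa'))      # No match
print(word_to_index_literal('aa'))    # 0 (first row containing 'aa')
```

Which of the two fits depends on whether partial matches such as 'aaa' inside 'aaas' should count.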
This is one way to implement your algorithm using a generator:
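def WordToIndexwithjustPanda():
    # Pair each index with its word and return the first index whose word contains 'aaa'
    return next((i for i, j in zip(referenceDF['INDEXED_NUMBER'], referenceDF['WORDS'])
                 if 'aaa' in j), 'No match')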
Strictly speaking, it uses pandas only partially, in that it relies on the iteration functionality of pd.Series.
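As for the NumPy attempt in the question: np.where called with a single argument returns a tuple of index arrays, and the original function never returned anything, so nothing came back. Here is a rough sketch of how that version might be completed, assuming the same sample data. The name word_to_index_numpy is hypothetical.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answers above.
referenceDF = pd.DataFrame({
    'WORDS': ['aaa', 'aaas', 'aactive', 'aadvantage', 'aaker'],
    'INDEXED_NUMBER': list(range(5)),
})

def word_to_index_numpy(val, df=referenceDF):
    # Boolean mask as a plain NumPy array; na=False treats missing values as non-matches.
    mask = df['WORDS'].str.contains(val, regex=False, na=False).to_numpy()
    # np.where(mask) returns a 1-tuple; element 0 holds the positions of the True entries.
    positions = np.where(mask)[0]
    if positions.size == 0:
        return 'No match'
    # Look up INDEXED_NUMBER at the first matching position.
    return df['INDEXED_NUMBER'].to_numpy()[positions[0]]

print(word_to_index_numpy('aaa'))   # 0
print(word_to_index_numpy('bbb'))   # No match
```

This works on row positions rather than index labels, so it does not depend on the DataFrame having a default RangeIndex.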