
find index of element in list in dataframe

I have:

adj           response                                                   

"beautiful"    ["beautiful", "beautiful2", "beautifu3"]
"good1"        ["beautiful1", "beautiful2", "beautifu3"]
"hideous"      ["hideous23r", "hideous", "hidoeous"] 

I would like an extra column holding the first index at which the "adj" value appears in that row's "response" list, or None if it does not appear:

adj           response                                                   index

"beautiful"    ["beautiful", "beautiful2", "beautifu3"]                    0
"not there"    ["beautiful1", "beautiful2", "beautifu3"]                   None
"hideous"      ["hideous23r", "hideous", "hidoeous"]                       1

TRY:

import ast

df['response'] = df['response'].apply(ast.literal_eval)  # skip this line if the column already holds lists
df['index'] = df.apply(lambda x: None if x['adj'] not in x['response'] else x['response'].index(x['adj']), axis=1)

OUTPUT:

         adj                             response  index
0  beautiful   [beautiful, beautiful2, beautifu3]    0.0
1      good1  [beautiful1, beautiful2, beautifu3]    NaN
2    hideous       [hideous23r, hideous, hideous]    1.0
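
The index column above comes back as floats (0.0, NaN, 1.0) because None is stored as NaN. As a minimal sketch, assuming the "response" column already holds Python lists, the same lookup can be kept as integers with pandas' nullable "Int64" dtype. The helper name first_index is hypothetical and not part of the original answer.

```python
import pandas as pd


def first_index(value, items):
    """Return the first position of value in items, or None when it is absent."""
    try:
        return items.index(value)
    except ValueError:
        return None


df = pd.DataFrame(
    {
        "adj": ["beautiful", "good1", "hideous"],
        "response": [
            ["beautiful", "beautiful2", "beautifu3"],
            ["beautiful1", "beautiful2", "beautifu3"],
            ["hideous23r", "hideous", "hidoeous"],
        ],
    }
)

# Pair each adjective with its own row's list and look it up.
positions = [first_index(a, r) for a, r in zip(df["adj"], df["response"])]

# The nullable integer dtype keeps 0 and 1 as integers and marks misses as <NA>.
df["index"] = pd.array(positions, dtype="Int64")
print(df)
```

Using try/except around list.index scans each list once, whereas the "in" check followed by index scans it twice; for short lists the difference is negligible.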

Let us try unpacking the lists into columns:

s = pd.DataFrame(df.response.tolist()).eq(df.adj, axis=0)
df['new'] = s.idxmax(axis=1).where(s.any(axis=1))  # recent pandas versions require axis as a keyword in any()
df
Out[30]: 
         adj                             response  new
0  beautiful   [beautiful, beautiful2, beautifu3]  0.0
1  not there  [beautiful1, beautiful2, beautifu3]  NaN
2    hideous       [hideous23r, hideous, hideous]  1.0
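
To make the steps of this approach easier to follow, here is a self-contained sketch that spells out each intermediate result. The names wide and hits are hypothetical, and the fourth row is invented for illustration to show what happens when the lists have different lengths.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "adj": ["beautiful", "not there", "hideous", "good"],
        "response": [
            ["beautiful", "beautiful2", "beautifu3"],
            ["beautiful1", "beautiful2", "beautifu3"],
            ["hideous23r", "hideous", "hidoeous"],
            ["bad", "good"],  # illustrative shorter list, not from the question
        ],
    }
)

# One column per list position; shorter lists are padded with missing values.
wide = pd.DataFrame(df["response"].tolist(), index=df.index)

# Compare every position against that row's adj value (padding never matches).
hits = wide.eq(df["adj"], axis=0)

# idxmax gives the label of the first True per row; rows with no True are masked to NaN.
df["new"] = hits.idxmax(axis=1).where(hits.any(axis=1))
print(df)
```

The mask is needed because idxmax returns the first column label for a row with no match at all, which would otherwise look like a hit at position 0.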

A really naive way to do it:

import pandas as pd

df = pd.read_csv("h.csv", sep=";")
adj = df["adj"].to_list()
response = df["response"].to_list()

# Turn each string such as '["a", "b"]' into a list by splitting on commas
# and stripping brackets, quotes and spaces (including spaces inside items).
remove_char = ["[", "]", "\"", " "]
nresponse = []
for raw in response:
    list_response = raw.split(",")
    for j in range(len(list_response)):
        for char in remove_char:
            list_response[j] = list_response[j].replace(char, "")
    nresponse.append(list_response)

# Look up each adjective in its own row's list; None marks "not found".
indexes = []
for i in range(len(nresponse)):
    if adj[i] in nresponse[i]:
        indexes.append(nresponse[i].index(adj[i]))
    else:
        indexes.append(None)
df["index"] = indexes

(This assumes each value in "response" is stored as a string.) This is much worse, and I assume much slower, than Nk03's solution.
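
One caveat with the character-stripping loop above: it also removes spaces inside items, so an adjective such as "not there" could never match. A hedged alternative, assuming every "response" string is a valid Python list literal, is to parse it with the standard library's ast.literal_eval. The function name and the sample rows below are hypothetical.

```python
import ast


def first_index_in_literal(adj, raw_response):
    """Parse a string such as '["a", "b"]' and return the first index of adj, or None."""
    items = ast.literal_eval(raw_response)  # evaluates literals only, unlike eval
    return items.index(adj) if adj in items else None


# Illustrative rows in the string form the loop above expects.
rows = [
    ("beautiful", '["beautiful", "beautiful2", "beautifu3"]'),
    ("not there", '["beautiful1", "not there", "beautifu3"]'),
    ("hideous", '["hideous23r", "hidoeous"]'),
]

indexes = [first_index_in_literal(adj, raw) for adj, raw in rows]
print(indexes)  # [0, 1, None]
```

Because the string is parsed rather than split, items containing commas or spaces survive intact.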


 