Access rows of a dataframe based on a column of lists. If a unique string is inside the list, the row will be viewed
This is difficult to explain, so I haven't been able to google my problem.
I have a dataframe. A column of that dataframe contains lists. Each row has a list of strings. The lists are of various sizes. Some rows don't have a list, but a NaN value.
I want to be able to view rows of the dataframe that contain an arbitrary string in their list. So if I want to find all rows whose list contains "arbitrary_string" as an element, those rows will be selected.
Here is an image indicating an example dataframe.
I want to use the term "corndog" to return a view of rows 1 and 2. The location of the string in the list does not matter.
My associates suggested I try to use lambdas and apply and a special function together. Their examples haven't worked for me.
They propose:
def find_id(inpList: list, inpstr):
    print(inpList)
    for x in inpList:
        if inpstr in x:
            return 1
    return 0

Df[list_of_strings].apply(lambda x: find_id(x, cust_string))
I'm not really sure what I'm doing. I don't understand how these things could be pieced together.
IIUC, I think you can use this:
Original df:
+----+-----------+--------------+----------------------------+
|    | some_int  | some_string  | List_of_strings            |
+----+-----------+--------------+----------------------------+
| 0  | 84        | something    | ['cat','dog','corndog']    |
| 1  | 74        | etc          | ['qwetry','celphone']      |
| 2  | 64        | etc          | ['dog','corndog']          |
| 3  | 89        | etc          | ['etc','catfish','purple'] |
+----+-----------+--------------+----------------------------+
df[df['List_of_strings'].str.contains('corndog')]
Output:
   some_int some_string          List_of_strings
0        84   something  ['cat','dog','corndog']
2        64         etc        ['dog','corndog']
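For reference, a minimal sketch of the string case, assuming the List_of_strings column holds plain strings rather than real lists. The frame below is rebuilt by hand from the table above, and the names df_str, mask, result_str and cat_rows are made up for illustration. Note that str.contains does substring matching, so a search for "cat" also picks up the row containing "catfish".

```python
import pandas as pd

# Assumption: the last column holds plain strings that merely look like lists.
df_str = pd.DataFrame({
    "some_int": [84, 74, 64, 89],
    "some_string": ["something", "etc", "etc", "etc"],
    "List_of_strings": [
        "['cat','dog','corndog']",
        "['qwetry','celphone']",
        "['dog','corndog']",
        "['etc','catfish','purple']",
    ],
})

# regex=False treats the pattern as literal text;
# na=False makes missing values count as "no match".
mask = df_str["List_of_strings"].str.contains("corndog", regex=False, na=False)
result_str = df_str[mask]
print(result_str)

# Substring matching: "cat" also matches the row that contains "catfish".
cat_rows = df_str[df_str["List_of_strings"].str.contains("cat", regex=False, na=False)]
print(cat_rows)
```

If the column holds real Python lists, this approach is not the right tool, since the .str accessor works on string values.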
EDIT: considering the column values are of list type and not string, you can use the following:
df[df['List_of_strings'].apply(lambda x: 'corndog' in x)]
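Since the question mentions that some rows hold NaN instead of a list, the expression 'corndog' in x would raise a TypeError on those rows (a float is not iterable). Here is a small sketch of one way to guard against that, assuming the column really contains Python lists. The example frame, the extra NaN row and the helper name has_item are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame modelled on the example, plus one row with NaN instead of a list.
df = pd.DataFrame({
    "some_int": [84, 74, 64, 89, 12],
    "some_string": ["something", "etc", "etc", "etc", "etc"],
    "List_of_strings": [
        ["cat", "dog", "corndog"],
        ["qwetry", "celphone"],
        ["dog", "corndog"],
        ["etc", "catfish", "purple"],
        np.nan,  # row without a list
    ],
})

def has_item(value, target):
    """Return True only when value is a list that contains target as an exact element."""
    return isinstance(value, list) and target in value

# Build a boolean mask row by row, then use it to select rows.
mask = df["List_of_strings"].apply(lambda v: has_item(v, "corndog"))
result = df[mask]
print(result)

# Exact element matching: "cat" does not match "catfish" here.
cat_only = df[df["List_of_strings"].apply(lambda v: has_item(v, "cat"))]
print(cat_only)
```

The isinstance check makes NaN rows evaluate to False instead of failing, and the position of the string inside the list does not matter.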