Pandas dataframe - check if multiple rows have the same value
I have a DataFrame which looks like this:
Reference | Value
---|---
String1 | 1
String2 | 0
String3 | -1
String2 | 1
String1 | 1
String3 | 0
Each reference can appear in the DataFrame once, twice, or three times, and its occurrences can have either the same or different values. I would like to create another DataFrame that tells me, for each Reference, whether all of its rows have the same value. With the example above, I would like to get something like this:
Reference | Value
---|---
String1 | Yes
String2 | No
String3 | No
(I put Yes and No as an example, but it could be 1/0 or anything else.)
How can I do this?
My initial thought was to use .groupby, but I couldn't find any type of aggregation that would help me here.
You could use groupby + nunique to get a count of unique Values for each Reference. Then use np.where to assign Yes/No values depending on whether the number of unique values is 1:
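import numpy as np
out = df.groupby('Reference', as_index=False)['Value'].nunique()  # distinct Values per Reference
out['Value'] = np.where(out['Value'].eq(1), 'Yes', 'No')  # exactly one distinct value means all rows match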
Output:
Reference Value
0 String1 Yes
1 String2 No
2 String3 No
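For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample DataFrame from the question and applies the groupby + nunique + np.where steps described above. The second part, which keeps a boolean flag instead of Yes/No labels, is an illustrative variation and not part of the original answer; the name `same` is made up for this example.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Reference": ["String1", "String2", "String3", "String2", "String1", "String3"],
    "Value": [1, 0, -1, 1, 1, 0],
})

# Approach from the answer: count distinct Values per Reference, then label.
out = df.groupby("Reference", as_index=False)["Value"].nunique()
out["Value"] = np.where(out["Value"].eq(1), "Yes", "No")
print(out)

# Variation (an assumption, not from the answer): keep a boolean Series
# indexed by Reference instead of Yes/No strings.
same = df.groupby("Reference")["Value"].nunique().eq(1)
print(same)
```

Note that nunique skips NaN by default (dropna=True), so a group whose only values are NaN would get a count of 0 and be labelled No under this sketch. Whether that is the desired behaviour depends on the data.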