Excel: If one column contains unique values, and another column contains one true value, return all true values for those unique values
I have a large file with over 78K rows in Excel (365 version). I am trying to write a formula that returns a True or False value for each group of rows sharing the same value in Column A (21K unique values): if any of the values in Column B for that group is True, then Column C should return True for every row in that group.
For example, I have the following data:
Column A Column B
1 True
1 False
1 False
2 False
2 False
3 False
3 True
I want Column C to show the following:
Column A Column B Column C
1 True True
1 False True
1 False True
2 False False
2 False False
3 False True
3 True True
In other words, for every unique value in Column A, if any of the corresponding values in Column B is True, I want all the values in Column C for that value to state True.
After many attempts at various formulas, I think I may have found something close with the following formula, but it returns True for every cell. I'm not sure what I'm missing.
=+IF(AND(UNIQUE($A$1:$A$7)),COUNTIF($B$1:$B$7,"TRUE")>0,1)
My data doesn't have any missing values.我的数据没有任何缺失值。
I've searched this site for what I'm attempting, but the formula above was the closest I could come. This thread is close, but not quite what I'm looking for.
I know that I could do this manually with the following formula, but with over 21K unique values in Column A, I don't want to do this manually if I don't have to.
=+COUNTIF($B$1:$B$3,"TRUE")>0
If this is easier to perform in Python, that code would be helpful. I am new to Python, and more comfortable with Excel, but understand Python may be easier and quicker.
This is how I would handle this in pandas.
print(df)
# Note: I've added a non-duplicated row (Column_A = 4) for testing.
Column_A Column_B
0 1 True
1 1 False
2 1 False
3 2 False
4 2 False
5 3 False
6 3 True
7 4 True
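The construction of df is not shown above. As a minimal sketch for reproducing it, assuming the data is typed in by hand rather than loaded from the Excel file, the frame could be built like this:

```python
import pandas as pd

# Rebuild the sample frame printed above.
# Column_A = 4 is the extra single-row group added for testing.
df = pd.DataFrame(
    {
        "Column_A": [1, 1, 1, 2, 2, 3, 3, 4],
        "Column_B": [True, False, False, False, False, False, True, True],
    }
)
print(df)
```

With the real workbook, the frame would come from the file instead; the column names used here are only the ones from the printed example.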
First I would write two boolean expressions: the first checks whether a row's Column_A value is duplicated, and the second checks whether its Column_B value is True. Where both are True, I pass the IDs from Column_A into a list.
vals = df.loc[df.duplicated(subset=["Column_A"], keep=False)
& df["Column_B"].eq(True),
"Column_A"].tolist()
print(vals)
[1, 3]
Now that we know what the values are, we can write a simple boolean assignment.
df['Column_C'] = df['Column_A'].isin(vals)
print(df)
Column_A Column_B Column_C
0 1 True True
1 1 False True
2 1 False True
3 2 False False
4 2 False False
5 3 False True
6 3 True True
7 4 True False
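Note that in the result above, Column_A = 4 comes out False because it is not duplicated. If the intent is instead the one stated in the question (any True in Column B marks the whole group True, including single-row groups), a groupby with transform may be a simpler route. This is a sketch of that alternative, not the approach used above, and it rebuilds the sample frame by hand as an assumption:

```python
import pandas as pd

# Sample data matching the frame printed above (typed in by hand here).
df = pd.DataFrame(
    {
        "Column_A": [1, 1, 1, 2, 2, 3, 3, 4],
        "Column_B": [True, False, False, False, False, False, True, True],
    }
)

# For each Column_A group, check whether any Column_B value is True,
# then broadcast that single result back onto every row of the group.
df["Column_C"] = df.groupby("Column_A")["Column_B"].transform("any")
print(df)
```

Here the single-row group Column_A = 4 is marked True, which differs from the duplicated-based result; which behavior is wanted depends on how lone rows should be treated.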