Excel If one column contains unique values, and another column contains one true value, return all true values for those unique values

I have a large Excel file (365 version) with over 78K rows. I am trying to write a formula that returns True or False for each group of rows sharing the same value in Column A (21K unique values): if any of the values in Column B for that group is True , then Column C should be True for every row in that group.

For example, I have the following data:

Column A     Column B
1            True
1            False
1            False
2            False
2            False
3            False
3            True

I want Column C to show the following:

Column A     Column B     Column C
1            True         True
1            False        True
1            False        True
2            False        False
2            False        False
3            False        True
3            True         True

In other words, for every unique value in Column A , if any of the corresponding values in Column B is True , I want every Column C value for that group to be True .

After many attempts with various formulas, I think I may have found something close with the following formula, but it returns True for every cell. I'm not sure what I'm missing.

=+IF(AND(UNIQUE($A$1:$A$7)),COUNTIF($B$1:$B$7,"TRUE")>0,1)

My data doesn't have any missing values.

I've searched this site for what I'm attempting, but the formula above was the closest I could come. This thread is close, but not quite what I'm looking for.

I know that I could do this manually with the following formula, but with over 21K unique values in Column A , I don't want to do this manually if I don't have to.

=+COUNTIF($B$1:$B$3,"TRUE")>0

If this is easier to perform in Python, that code would be helpful. I am new to Python, and more comfortable with Excel, but understand Python may be easier and quicker.

This is how I would handle this in pandas.

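import pandas as pd

# Sample data from the question, plus one non-duplicated row
# (Column_A == 4) added for testing.
df = pd.DataFrame({
    "Column_A": [1, 1, 1, 2, 2, 3, 3, 4],
    "Column_B": [True, False, False, False, False, False, True, True],
})

print(df)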


   Column_A  Column_B
0         1      True
1         1     False
2         1     False
3         2     False
4         2     False
5         3     False
6         3      True
7         4      True

First I would write two boolean expressions: the first checks whether the Column_A value is duplicated, and the second checks whether Column_B is True. For the rows where both are True, I pass the IDs from Column_A into a list.

vals = df.loc[df.duplicated(subset=["Column_A"], keep=False) 
              & df["Column_B"].eq(True),
             "Column_A"].tolist()

print(vals)

[1, 3]
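
To make the two conditions easier to inspect, here is a minimal sketch that builds each mask separately before combining them. The names `is_dup` and `is_true` are illustrative only and are not part of the code above; the sample frame is rebuilt so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as above; the Column_A == 4 row is the extra test row.
df = pd.DataFrame({
    "Column_A": [1, 1, 1, 2, 2, 3, 3, 4],
    "Column_B": [True, False, False, False, False, False, True, True],
})

# Mask 1: rows whose Column_A value appears more than once.
is_dup = df.duplicated(subset=["Column_A"], keep=False)

# Mask 2: rows where Column_B is True.
is_true = df["Column_B"].eq(True)

# Only rows that pass both masks contribute their Column_A value.
vals = df.loc[is_dup & is_true, "Column_A"].tolist()

print(is_dup.tolist())
print(is_true.tolist())
print(vals)
```

The last row (Column_A == 4) passes the second mask but not the first, which is why 4 does not end up in the list.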

Now that we know which values qualify, we can write a simple boolean assignment.

df['Column_C'] = df['Column_A'].isin(vals)

print(df)
   Column_A  Column_B  Column_C
0         1      True      True
1         1     False      True
2         1     False      True
3         2     False     False
4         2     False     False
5         3     False      True
6         3      True      True
7         4      True     False
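
Note that the approach above leaves the single-row group (Column_A == 4) as False because of the duplicate check. If the rule should instead be exactly what the question states (any True in Column_B for a given Column_A value makes the whole group True, even when the group has one row), a possible alternative is a group-wise `any`. This is a sketch under that assumption, not the method used above:

```python
import pandas as pd

# Same sample data as above, including the extra Column_A == 4 row.
df = pd.DataFrame({
    "Column_A": [1, 1, 1, 2, 2, 3, 3, 4],
    "Column_B": [True, False, False, False, False, False, True, True],
})

# For each Column_A group, ask "is any Column_B value True?" and
# broadcast that single answer back onto every row of the group.
df["Column_C"] = df.groupby("Column_A")["Column_B"].transform("any")

print(df)
```

Here the row with Column_A == 4 gets True, unlike in the output above. For staying in Excel, the same per-row test is commonly written with COUNTIFS, for example =COUNTIFS($A$1:$A$7,A1,$B$1:$B$7,"TRUE")>0 filled down Column C; that formula reuses the question's ranges and criterion and has not been tested against the real file.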
