
Using If statement with string values in Python

I have a df where column A is either blank or has a string in it. I tried to write the if statement below (all columns are strings). Basically, if there is something (any value) in df[A], then the new column value will be a concatenation of columns A, B and C. If there is no value in df[A], then it will concatenate columns B and C.

The part where it says if df[A] returns a true or false value, right? Just as if I were to write bool(df[A]). So if the value is true, it should execute the first block; if not, it should execute the 'else' block.

if df[A]:
     df[new_column] = df[column_A] + df[column_B] + df[column_C]
else: 
     df[new_column] = df[column_B]+df[column_C]

I get this error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This happens because df['A'] returns a Series object. A plain Python container that holds anything, such as [0, 0, 0] or [None], is always truthy, whatever its elements are. pandas does not follow that rule for a Series: it refuses to reduce the Series to a single boolean because the result would be ambiguous, and raises this error instead.
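A minimal sketch of the behaviour described above; the sample values are made up for illustration. It contrasts a plain Python list, which is truthy whenever it is non-empty, with a Series, which raises when asked for a single truth value.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
s = pd.Series(['x', '', 'y'])

# A non-empty list is truthy regardless of what it contains.
list_is_truthy = bool([0, 0, 0])

# A multi-element Series refuses to collapse to one boolean.
try:
    bool(s)
    raised = False
    message = ''
except ValueError as exc:
    raised = True
    message = str(exc)

print(list_is_truthy, raised)
print(message)
```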

So try this:

if df[A].any():
     df[new_column] = df[column_A] + df[column_B] + df[column_C]
else: 
     df[new_column] = df[column_B]+df[column_C]

This code returns True if at least one value in the whole column is truthy. You can use df[A].all() if you need all elements in the column to be true. Note that either check gives a single result for the entire column, so the same branch is applied to every row.
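A small sketch of how any() and all() differ on a column with one blank entry. It assumes "blank" means an empty string and builds an explicit boolean mask first, so the result does not depend on how pandas treats the truthiness of strings; the names col and has_value are hypothetical.

```python
import pandas as pd

# Hypothetical column: two filled values and one blank string.
col = pd.Series(['Mr ', '', 'Mrs '])

# Turn the strings into an explicit boolean mask first.
has_value = col != ''

any_filled = bool(has_value.any())  # at least one row is non-blank
all_filled = bool(has_value.all())  # every row is non-blank

print(has_value.tolist())
print(any_filled, all_filled)
```

Both reductions produce one value for the whole column, while the mask itself keeps one value per row.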

As far as I understand your question, you want to apply the if condition to each element. The "+" appears to be string concatenation, since df['A'] contains strings.

In that case, you don't need the if condition at all, because adding an empty string to another string gives the same result as not adding it.

import pandas as pd

d = {'A': ['Mr ', '', 'Mrs '], 'B': ['Max ', 'John ', 'Marie '], 'C': ['Power', 'Doe', 'Curie']}
df = pd.DataFrame(data=d)

df['new'] = df['A'] + df['B'] + df['C']

Results in:

>>> df
      A       B      C              new
0   Mr     Max   Power     Mr Max Power
1         John     Doe         John Doe
2  Mrs   Marie   Curie  Mrs Marie Curie
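If the two branches ever need to differ by more than an empty prefix, the condition can still be applied row by row without a Python if statement. The following is a sketch using Series.where, chosen here as one possible option rather than the answer's method; it reuses the sample data from the example above, and the column name new_where is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Mr ', '', 'Mrs '],
    'B': ['Max ', 'John ', 'Marie '],
    'C': ['Power', 'Doe', 'Curie'],
})

with_a = df['A'] + df['B'] + df['C']   # used where A is non-blank
without_a = df['B'] + df['C']          # used where A is blank

# Series.where keeps with_a where the condition holds, else takes without_a.
df['new_where'] = with_a.where(df['A'] != '', without_a)

print(df['new_where'].tolist())
```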

If "blank" refers to NaN rather than an empty string, you can do the following:

df['new'] = df.apply(lambda x: ''.join(x.dropna().astype(str)), axis=1)
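As a possible vectorized alternative for the NaN case, the missing values in column A can be replaced with an empty string before concatenating. This is a sketch that assumes only column A can contain NaN; the sample data is adapted from the earlier example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['Mr ', np.nan, 'Mrs '],
    'B': ['Max ', 'John ', 'Marie '],
    'C': ['Power', 'Doe', 'Curie'],
})

# Replace missing values in A with '' so the concatenation never yields NaN.
df['new'] = df['A'].fillna('') + df['B'] + df['C']

print(df['new'].tolist())
```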

Have a look at this question, which seems to be similar: questions 33098383

