
Why does my if/elif/else code apply to all rows?

Can I get a solution for this? Say I have this:

I run df['Location'] and get this:

0           New York, NY
1          Chantilly, VA
2             Boston, MA
3             Newton, MA
4           New York, NY
             ...        
667         Fort Lee, NJ
668    San Francisco, CA
669        Irwindale, CA
670    San Francisco, CA
671         New York, NY
Name: Location, Length: 659, dtype: object

I want to simplify it: if a value contains New York, NY, it should become NY; if it contains Boston, MA, it should become MA; and so on.

So I wrote this code:

def clean_location_1(x):
    if 'CA':
        return 'CA'
    elif 'NY':
        return 'NY'
    elif 'DC':
        return 'DC'
    elif 'MA':
        return 'MA'
    elif 'IL':
        return 'IL'
    elif 'VA':
        return 'VA'
    else:
        return 'others'


df['Location'] = df['Location'].apply(clean_location_1)

But when I run my script, every Location becomes CA.

How can I solve this?

One possible solution that keeps your approach is the following.

import pandas as pd

data = pd.DataFrame([{'location': 'New York, NY'},
                     {'location': 'Chantilly, VA'},
                     {'location': 'Boston, MA'},
                     {'location': 'Newton, MA'},
                     {'location': 'San Francisco, CA'}])

def clean_location_1(x):
    if 'CA' in x:
        return 'CA'
    elif 'NY' in x:
        return 'NY'
    elif 'DC' in x:
        return 'DC'
    elif 'MA' in x:
        return 'MA'
    elif 'IL' in x:
        return 'IL'
    elif 'VA' in x:
        return 'VA'
    else:
        return 'others'

data['location'].apply(clean_location_1)

Your problem was an incorrect condition in the if/elif block: each branch has to test whether the abbreviation is in x.

Another way of doing this might be:

list_states = ['CA', 'NY', 'DC', 'MA', 'IL', 'VA']
data['location'].apply(lambda x: x.split(' ')[-1] if x.split(' ')[-1] in list_states else 'others')

Then you won't need a huge if/else block.

Writing if 'CA' on its own doesn't test anything; you have to check the value against x.

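This can be done with pd.Series.str.contains. It tests a whole Series at once, so it does not belong in an if/elif chain applied to single values; build one boolean mask per state and let numpy.select pick the first match:

import numpy as np

states = ['CA', 'NY', 'DC', 'MA', 'IL', 'VA']
# One boolean mask per state; na=False treats missing values as no match
masks = [df['Location'].str.contains(s, na=False) for s in states]
# The first matching state wins, like the if/elif chain; no match gives 'others'
df['Location'] = np.select(masks, states, default='others')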

The problem is simple. You are not comparing the string with x. 'CA' on its own always evaluates as true, because non-empty strings are truthy. That is why everything changes to CA.
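The truthiness point can be checked directly. The following is a minimal sketch; the names buggy and fixed are made up for illustration and do not appear in the question.

```python
def buggy(x):
    # 'CA' is a non-empty string literal, so this condition is always true
    # and x is never inspected.
    if 'CA':
        return 'CA'
    return 'others'


def fixed(x):
    # `in` tests whether the substring occurs inside x.
    if 'CA' in x:
        return 'CA'
    return 'others'


# Non-empty strings are truthy, the empty string is falsy.
print(bool('CA'), bool(''))
# The buggy version ignores its argument; the fixed one looks at it.
print(buggy('Boston, MA'), fixed('Boston, MA'))
```

Only the condition changes between the two functions, which is the whole fix for the original if/elif chain.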

Writing

if "<str>":

always evaluates to True for a non-empty string, which means your code will always return CA. Instead, check whether each abbreviation occurs in x:

def clean_location_1(x):
    if 'CA' in x:
        return 'CA'
    elif 'NY' in x:
        return 'NY'
    elif 'DC' in x:
        return 'DC'
    elif 'MA' in x:
        return 'MA'
    elif 'IL' in x:
        return 'IL'
    elif 'VA' in x:
        return 'VA'
    else:
        return 'others'
df['Location'] = df['Location'].apply(clean_location_1)

Or you can try this, which is shorter and simpler:

check=["CA","NY","DC","MA","IL","VA"]
def clean_location_1(x):
    y=x.rsplit(", ",1)[1]
    if y in check:
        return y
    else:
        return "others"

df['Location'] = df['Location'].apply(clean_location_1)

Here we store the list of state abbreviations (the ones you tested in each if/elif branch) in check, take the part of x after the last ", ", and test whether it is one of the values in check.
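One caveat with x.rsplit(", ",1)[1] is that it raises IndexError for a value that has no ", " in it. The data shown in the question does not contain such a value, so this is only a defensive sketch under that assumption; clean_location_safe and KNOWN_STATES are hypothetical names.

```python
KNOWN_STATES = {"CA", "NY", "DC", "MA", "IL", "VA"}


def clean_location_safe(x):
    # str.rpartition always returns a 3-tuple; if ", " is missing,
    # the last element is simply the whole string, so nothing raises.
    state = x.rpartition(", ")[2].strip()
    return state if state in KNOWN_STATES else "others"


print(clean_location_safe("New York, NY"))
print(clean_location_safe("Fort Lee, NJ"))
print(clean_location_safe("Remote"))
```

It can be passed to apply in the same way as clean_location_1.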

Or a one-liner, the same as the second approach but in one line:

check=["CA","NY","DC","MA","IL","VA"]
df['Location'] = df['Location'].apply(lambda x: x.rsplit(", ",1)[1] if x.rsplit(", ",1)[1] in check else "others")

You can do:

states = ['CA', 'NY', 'DC', 'MA', 'IL', 'VA']
df['State'] = df['Location'].str.split(', ', expand=True)[1] \
                             .rename('State').to_frame().query('State in @states')
df['State'] = df['State'].fillna('other')
>>> df
            Location  State
0       New York, NY     NY
1      Chantilly, VA     VA
2         Boston, MA     MA
3         Newton, MA     MA
4       New York, NY     NY
5       Fort Lee, NJ  other
6  San Francisco, CA     CA
7      Irwindale, CA     CA
8  San Francisco, CA     CA
9       New York, NY     NY
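The same result can probably be reached without query and fillna. This is a sketch of an alternative, not the answer's own method; it uses Series.str.rsplit, Series.isin and Series.where on a small made-up sample, and the variable abbrev is introduced only for illustration.

```python
import pandas as pd

# Small sample standing in for the real data.
df = pd.DataFrame({'Location': ['New York, NY', 'Fort Lee, NJ', 'San Francisco, CA']})
states = ['CA', 'NY', 'DC', 'MA', 'IL', 'VA']

# Take the text after the last ", " as the state abbreviation.
abbrev = df['Location'].str.rsplit(', ', n=1).str[-1]
# Keep abbreviations that are in the list and replace everything else.
df['State'] = abbrev.where(abbrev.isin(states), 'other')

print(df)
```

Everything here is vectorized, so no Python-level function is called per row.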
