Deleting observations when the value of a variable is 0 for that observation using loops in Python

I have a dataset that looks like this:

person  x1  x2  x3 
1       0   0   1 
2       0   1   0 
3       1   0   0 
4       0   1   1

I want to loop over x1 to x3 and drop the observations (persons) where x1 is 0, then those where x2 is 0, then those where x3 is 0, producing a new dataframe each time.

I've tried something like this:

df = pd.read_csv(the input file above)
for n in range(1,4):
omit = (df['x'n] == 0)
dataset[n] = df.loc[~omit]

But it doesn't work, and I don't even understand the error report. Can someone help me?

One approach is to build a dictionary of dataframes, keyed by the x variable used to filter out observations.

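import pandas as pd

# Sample data matching the table in the question
df = pd.DataFrame({'person': {0: 1, 1: 2, 2: 3, 3: 4}, 
                   'x1': {0: 0, 1: 0, 2: 1, 3: 0}, 
                   'x2': {0: 0, 1: 1, 2: 0, 3: 1}, 
                   'x3': {0: 1, 1: 0, 2: 0, 3: 1}})

# One filtered dataframe per column, keeping only rows where that column is nonzero
dfs = {k: df.loc[df[k].ne(0)] for k in ['x1', 'x2', 'x3']}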

You can then access each dataframe by its key, e.g. dfs['x1']:

    person  x1  x2  x3
2       3   1   0   0 
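
To check what each filtered dataframe contains, you can loop over the dictionary. This is a minimal sketch that rebuilds the same sample data so it runs on its own; the name `summary` is only an illustrative helper and not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question, written with plain lists
df = pd.DataFrame({'person': [1, 2, 3, 4],
                   'x1': [0, 0, 1, 0],
                   'x2': [0, 1, 0, 1],
                   'x3': [1, 0, 0, 1]})

# One filtered dataframe per column (rows where that column is nonzero)
dfs = {k: df.loc[df[k].ne(0)] for k in ['x1', 'x2', 'x3']}

# Hypothetical helper: the person IDs that survive each filter
summary = {k: sub['person'].tolist() for k, sub in dfs.items()}
print(summary)
```

Each entry of `dfs` is filtered from the original `df` independently, so dropping rows for x1 has no effect on the result for x2 or x3.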

It seems like this may be what you're trying to do as well. With some modifications, your code can accomplish the same task:

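dataset = {}  # must exist before assigning dataset[n]
for n in range(1,4):
    omit = (df[f'x{n}'] == 0)  # the f-string builds 'x1', 'x2', 'x3'
    dataset[n] = df.loc[~omit]  # keep the rows where that column is nonzero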
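
As for why the original attempt fails: `df['x'n]` is not valid Python, because a string literal cannot be placed directly next to a variable name, so the interpreter raises a SyntaxError before anything runs. Even with that fixed, `dataset` was never created, so `dataset[n] = ...` would raise a NameError. Below is a self-contained sketch of the corrected loop. It assumes the input is whitespace-separated text like the table in the question; `raw` and `io.StringIO` stand in for the real file, whose path and delimiter are not given.

```python
import io
import pandas as pd

# Stand-in for the input file (assumption: whitespace-separated columns)
raw = """person x1 x2 x3
1 0 0 1
2 0 1 0
3 1 0 0
4 0 1 1
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

dataset = {}  # create the container before the loop
for n in range(1, 4):
    col = f"x{n}"  # builds the column names 'x1', 'x2', 'x3'
    dataset[n] = df.loc[df[col] != 0]  # keep rows where the column is nonzero

for n, sub in dataset.items():
    print(n, sub["person"].tolist())
```

With a real file, you would pass its path to `pd.read_csv` instead of the `io.StringIO` object and adjust `sep` to match the file's actual delimiter.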


 