I have a dataset that looks like this:
person x1 x2 x3
1 0 0 1
2 0 1 0
3 1 0 0
4 0 1 1
I want to loop over x1 to x3 and delete observations (persons) where x1 is 0, then where x2 is 0, then where x3 is 0. Each pass should give me a new dataframe.
I've tried something like this:
df = pd.read_csv(the input file above)
for n in range(1,4):
    omit = (df['x'n] == 0)
    dataset[n] = df.loc[~omit]
But it doesn't work, and I don't even understand the error report. Can someone help me?
One approach is to create a dictionary of dataframes, keyed by the x_ variable used to filter out observations.
import pandas as pd

df = pd.DataFrame({'person': {0: 1, 1: 2, 2: 3, 3: 4},
                   'x1': {0: 0, 1: 0, 2: 1, 3: 0},
                   'x2': {0: 0, 1: 1, 2: 0, 3: 1},
                   'x3': {0: 1, 1: 0, 2: 0, 3: 1}})
# One dataframe per column, keeping only the rows where that column is non-zero
dfs = {k: df.loc[df[k].ne(0)] for k in ['x1', 'x2', 'x3']}
You can then access each dataframe with, e.g., dfs['x1']:
   person  x1  x2  x3
2       3   1   0   0
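As a quick check on the dictionary approach, here is a minimal, self-contained sketch that rebuilds the sample data inline (rather than reading it from a CSV, which is an assumption made for convenience) and lists which persons survive each filter. The name `kept` is hypothetical and only used for illustration.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from a file
df = pd.DataFrame({
    "person": [1, 2, 3, 4],
    "x1": [0, 0, 1, 0],
    "x2": [0, 1, 0, 1],
    "x3": [1, 0, 0, 1],
})

# One filtered dataframe per column; df itself is left unchanged
dfs = {k: df.loc[df[k].ne(0)] for k in ["x1", "x2", "x3"]}

# Hypothetical helper mapping: which persons remain under each filter
kept = {k: sub["person"].tolist() for k, sub in dfs.items()}
print(kept)  # {'x1': [3], 'x2': [2, 4], 'x3': [1, 4]}
```

Each filter is applied to the original `df` independently, so the three results do not depend on one another.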
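It seems like this may be what you're trying to do as well. Your code can accomplish the same task with two modifications: initialise dataset as a dict before the loop, and build the column name with an f-string (df['x'n] is not valid Python syntax):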
dataset = {}
for n in range(1,4):
    omit = (df[f'x{n}'] == 0)
    dataset[n] = df.loc[~omit]
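The question could also be read as asking for cumulative deletion, where each step filters the result of the previous one instead of the original dataframe. That reading is an assumption, not something the question states outright. A minimal sketch of that variant, with the hypothetical names `current` and `steps`:

```python
import pandas as pd

# Sample data from the question, built inline instead of read from a file
df = pd.DataFrame({
    "person": [1, 2, 3, 4],
    "x1": [0, 0, 1, 0],
    "x2": [0, 1, 0, 1],
    "x3": [1, 0, 0, 1],
})

steps = {}      # hypothetical name: snapshot of the data after each step
current = df    # start from the full dataframe
for n in range(1, 4):
    col = f"x{n}"
    # Drop rows where this column is 0, starting from the previous result
    current = current.loc[current[col].ne(0)]
    steps[col] = current

for col, sub in steps.items():
    print(col, sub["person"].tolist())
```

With the sample data, only person 3 survives the x1 step and that row has x2 equal to 0, so the later steps come back empty. If that is not the intended result, the independent filters shown above are probably the better fit.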