
How to write a Python loop with conditions

I have been trying to code this for a while. Here is a sample dataframe:

import pandas as pd
import numpy as np

capacity = 500
s = pd.Series(['School 1','School 2', 'School 3','School 4', 'School 5'])
p = pd.Series(['132', '458', '333', '300', '258'])
d = pd.Series(['1', '2', '3', '4', '5'])

df = pd.DataFrame(np.c_[s,p,d],columns = ['School Name','Population', 'Distance'])

What I want to do is write a loop that keeps subtracting each school's 'Population' from 'capacity' as long as the population does not exceed the remaining capacity. It needs to process the schools in order of 'Distance'.

Example: since 'School 1' is the nearest, its population of 132 is subtracted from 500, leaving 368. 'School 2' is the next nearest, but its population exceeds the remaining capacity (458 > 368), so the loop stops there and does not go on to check the next nearest school, 'School 3'.

After this, it should assign the school name to another column.

The end result would be:

s = pd.Series(['School 1','School 2', 'School 3','School 4', 'School 5'])
p = pd.Series(['132', '458', '333', '300', '258'])
d = pd.Series(['1', '2', '3', '4', '5'])
sn = pd.Series(['School 1', 0, 0 ,0 ,0])
df2 = pd.DataFrame(np.c_[s,p,d,sn],columns = ['School Name','Population', 'Distance','Included'])

I have been trying to work on this since yesterday and still have no clue how to do it except manually. I am still a beginner Python user.

Thanks for the help!

Based on your question, I am assuming that you want just one school name: the last school that fits before the capacity is exceeded. That could be achieved like this:

import pandas as pd
import numpy as np

capacity = 500

s = pd.Series(['School 1','School 2', 'School 3','School 4', 'School 5'])
p = pd.Series(['132', '458', '333', '300', '258'])
d = pd.Series(['1', '2', '3', '4', '5'])
df = pd.DataFrame(np.c_[s,p,d],columns = ['School Name','Population', 'Distance'])

# converting population to integer values
p = p.astype('int')

# placeholder to store school name
school_name = None

for idx, val in enumerate(p):
    # keep assigning school name until capacity is exceeded
    capacity -= val
    if capacity < 0:
        break
    school_name = s[idx]

# add included column
df['included'] = np.where(df['School Name'] == school_name, df['School Name'], 0)

You can then print df to confirm that it works:

>>> df
  School Name Population Distance    included
0    School 1        132        1    School 1
1    School 2        458        2           0
2    School 3        333        3           0
3    School 4        300        4           0
4    School 5        258        5           0
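
Note that the loop above walks the rows in the order they appear, which only matches the distance order because the sample data happens to be sorted already. As a minimal sketch, assuming the rows could arrive unsorted, you could sort by 'Distance' first. The shuffled three-row sample and the names `ordered` and `remaining` are my own, for illustration only. Both columns are converted to integers because string distances would sort lexicographically ('10' before '2').

```python
import pandas as pd

capacity = 500

# Hypothetical sample: same columns as above, rows deliberately out of order
df = pd.DataFrame({
    'School Name': ['School 3', 'School 1', 'School 2'],
    'Population': ['333', '132', '458'],
    'Distance': ['3', '1', '2'],
})

# Convert the string columns to integers so they sort and subtract numerically
df['Population'] = df['Population'].astype(int)
df['Distance'] = df['Distance'].astype(int)

# Visit the schools from nearest to farthest
ordered = df.sort_values('Distance')

remaining = capacity
school_name = None
for name, population in zip(ordered['School Name'], ordered['Population']):
    if population > remaining:
        break    # this school does not fit, so stop checking
    remaining -= population
    school_name = name

print(school_name, remaining)    # School 1 368
```

Using a separate `remaining` variable also leaves `capacity` untouched, so the snippet can be rerun without resetting anything.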

However, if you want to keep all the schools that fit before the capacity is exceeded, it is very simple to modify the above program. Just replace the placeholder and the loop like this:

capacity = 500       # reset the capacity before running the loop again
school_names = []    # placeholder will be a list now
for idx, val in enumerate(p):
    capacity -= val
    if capacity < 0:
        break
    school_names.append(s[idx])    # keep adding schools that do not exceed capacity to the list

# Instead of equality, check if school name is in your list
df['included'] = np.where(df['School Name'].isin(school_names), df['School Name'], 0)

Now, if capacity = 500 and you change the second population so that p = pd.Series(['132', '128', '333', '300', '258']) (converted to integers as before), then both School 1 and School 2 would be included, because 132 + 128 = 260 fits within 500 while adding 333 does not.
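
If you would rather avoid the explicit loop, the same idea can be sketched with a running total. This is an alternative to the loop, not a change to it, and it assumes the populations are never negative, so that once the running total passes the capacity it stays above it. The names `ordered`, `running`, `fits` and `included` are my own choices.

```python
import pandas as pd
import numpy as np

capacity = 500

# Same sample as above, with the modified second population, stored as integers
df = pd.DataFrame({
    'School Name': ['School 1', 'School 2', 'School 3', 'School 4', 'School 5'],
    'Population': [132, 128, 333, 300, 258],
    'Distance': [1, 2, 3, 4, 5],
})

# Process the schools from nearest to farthest
ordered = df.sort_values('Distance')

# Running total of population in distance order: 132, 260, 593, ...
running = ordered['Population'].cumsum()

# A school is included while the running total still fits in the capacity
fits = running <= capacity
included = ordered.loc[fits, 'School Name'].tolist()

df['included'] = np.where(df['School Name'].isin(included), df['School Name'], 0)
print(included)    # ['School 1', 'School 2']
```

With the original populations (132, 458, ...), the running total after School 2 is already 590, so only School 1 would be included, which matches the loop.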


 