在python中使用數據框實現功能

Question

我有這個問題，我被困了很多天。

我有這個功能：

def cal_score(research, citations, teaching, international, income):
     return .3 **research + .3 **citations + .3 **teaching +.075 **international + .025 **income

其中“研究”，“引用”，“教學”，“國際”和“收入”是數據集的列。 我想在數據集中添加一個新列，其值應根據上述函數進行計算。 我嘗試了不同的步驟，但沒有任何效果。

示例：如果我們有如下一行

university_name  Indian Institute of Technology Bombay


teaching  43.8

international  14.3

research  24.2

citations  8,327

income   14.9

Total Score Ranking

然后，總分應計算為

Total Score =  .3 **research + .3 **citations + .3 **teaching +.075 **international + .025 **income.

這應適用於數據集中的所有行。

誰能幫我實現這一要求。 我在這個問題上停留了一段時間。 :-(

Indian_univ.head（10）.to_dict（）

{'citations': {510: 38.799999999999997,
  832: 39.0,
  856: 45.600000000000001,
  959: 45.799999999999997,
  1232: 84.700000000000003,
  1360: 38.5,
  1361: 41.799999999999997,
  1362: 35.299999999999997,
  1363: 53.600000000000001,
  1679: 51.600000000000001},
 'country': {510: 'India',
  832: 'India',
  856: 'India',
  959: 'India',
  1232: 'India',
  1360: 'India',
  1361: 'India',
  1362: 'India',
  1363: 'India',
  1679: 'India'},
 'female_male_ratio': {510: '16 : 84',
  832: '15 : 85',
  856: '16 : 84',
  959: '17 : 83',
  1232: '46 : 54',
  1360: '18 : 82',
  1361: '13 : 87',
  1362: '15 : 85',
  1363: '17 : 83',
  1679: '19 : 81'},
 'income': {510: '24.2',
  832: '72.4',
  856: '52.7',
  959: '70.4',
  1232: '28.4',
  1360: '-',
  1361: '42.4',
  1362: '-',
  1363: '64.8',
  1679: '37.9'},
 'international': {510: '14.3',
  832: '16.1',
  856: '19.9',
  959: '15.6',
  1232: '29.3',
  1360: '15.3',
  1361: '17.3',
  1362: '14.7',
  1363: '15.6',
  1679: '18.2'},
 'international_students': {510: '1%',
  832: '0%',
  856: '1%',
  959: '1%',
  1232: '1%',
  1360: '1%',
  1361: '0%',
  1362: '0%',
  1363: '1%',
  1679: '1%'},
 'num_students': {510: '8,327',
  832: '9,928',
  856: '8,327',
  959: '8,061',
  1232: '16,691',
  1360: '8,371',
  1361: '6,167',
  1362: '9,928',
  1363: '8,061',
  1679: '3,318'},
 'research': {510: 15.699999999999999,
  832: 45.299999999999997,
  856: 33.100000000000001,
  959: 13.699999999999999,
  1232: 14.0,
  1360: 23.0,
  1361: 25.199999999999999,
  1362: 30.0,
  1363: 12.300000000000001,
  1679: 39.5},
 'student_staff_ratio': {510: 14.9,
  832: 17.5,
  856: 14.9,
  959: 18.699999999999999,
  1232: 23.899999999999999,
  1360: 17.300000000000001,
  1361: 12.199999999999999,
  1362: 17.5,
  1363: 18.699999999999999,
  1679: 8.1999999999999993},
 'teaching': {510: 43.799999999999997,
  832: 44.200000000000003,
  856: 47.299999999999997,
  959: 30.399999999999999,
  1232: 25.800000000000001,
  1360: 33.799999999999997,
  1361: 31.300000000000001,
  1362: 39.299999999999997,
  1363: 25.100000000000001,
  1679: 32.600000000000001},
 'total_score': {510: 29.489999999999995,
  832: 38.549999999999997,
  856: 37.799999999999997,
  959: 26.969999999999999,
  1232: 37.350000000000001,
  1360: 28.589999999999996,
  1361: 29.489999999999998,
  1362: 31.379999999999995,
  1363: 27.299999999999997,
  1679: 37.109999999999999},
 'university_name': {510: 'Indian Institute of Technology Bombay',
  832: 'Indian Institute of Technology Kharagpur',
  856: 'Indian Institute of Technology Bombay',
  959: 'Indian Institute of Technology Roorkee',
  1232: 'Panjab University',
  1360: 'Indian Institute of Technology Delhi',
  1361: 'Indian Institute of Technology Kanpur',
  1362: 'Indian Institute of Technology Kharagpur',
  1363: 'Indian Institute of Technology Roorkee',
  1679: 'Indian Institute of Science'},
 'world_rank': {510: '301-350',
  832: '226-250',
  856: '251-275',
  959: '351-400',
  1232: '226-250',
  1360: '351-400',
  1361: '351-400',
  1362: '351-400',
  1363: '351-400',
  1679: '276-300'},
 'year': {510: 2012,
  832: 2013,
  856: 2013,
  959: 2013,
  1232: 2014,
  1360: 2014,
  1361: 2014,
  1362: 2014,
  1363: 2014,
  1679: 2015}}

Answer 1

我認為您可以使用：

df['Total Score'] = .3 **df.research + 
                    .3 **df.citations + 
                    .3 **df.teaching + 
                    .075 **df.international + 
                    .025 **df.income

如果需要apply功能，通常會比較慢：

def cal_score(x):
     return .3 **x.research + 
            .3 **x.citations + 
            .3 **x.teaching +
            .075 **x.international + 
            .025 **x.income

df['Total Score'] = df.apply(cal_score, axis=1)

編輯數據：

您需要先replace num_students和income列，然后按astype轉換為float ：

EDIT2按數據樣本：

import pandas as pd

df = pd.DataFrame({'citations': {510: 38.799999999999997, 832: 39.0, 856: 45.600000000000001, 959: 45.799999999999997, 1232: 84.700000000000003, 1360: 38.5, 1361: 41.799999999999997, 1362: 35.299999999999997, 1363: 53.600000000000001, 1679: 51.600000000000001}, 'country': {510: 'India', 832: 'India', 856: 'India', 959: 'India', 1232: 'India', 1360: 'India', 1361: 'India', 1362: 'India', 1363: 'India', 1679: 'India'}, 'female_male_ratio': {510: '16 : 84', 832: '15 : 85', 856: '16 : 84', 959: '17 : 83', 1232: '46 : 54', 1360: '18 : 82', 1361: '13 : 87', 1362: '15 : 85', 1363: '17 : 83', 1679: '19 : 81'}, 'income': {510: '24.2', 832: '72.4', 856: '52.7', 959: '70.4', 1232: '28.4', 1360: '-', 1361: '42.4', 1362: '-', 1363: '64.8', 1679: '37.9'}, 'international': {510: '14.3', 832: '16.1', 856: '19.9', 959: '15.6', 1232: '29.3', 1360: '15.3', 1361: '17.3', 1362: '14.7', 1363: '15.6', 1679: '18.2'}, 'international_students': {510: '1%', 832: '0%', 856: '1%', 959: '1%', 1232: '1%', 1360: '1%', 1361: '0%', 1362: '0%', 1363: '1%', 1679: '1%'}, 'num_students': {510: '8,327', 832: '9,928', 856: '8,327', 959: '8,061', 1232: '16,691', 1360: '8,371', 1361: '6,167', 1362: '9,928', 1363: '8,061', 1679: '3,318'}, 'research': {510: 15.699999999999999, 832: 45.299999999999997, 856: 33.100000000000001, 959: 13.699999999999999, 1232: 14.0, 1360: 23.0, 1361: 25.199999999999999, 1362: 30.0, 1363: 12.300000000000001, 1679: 39.5}, 'student_staff_ratio': {510: 14.9, 832: 17.5, 856: 14.9, 959: 18.699999999999999, 1232: 23.899999999999999, 1360: 17.300000000000001, 1361: 12.199999999999999, 1362: 17.5, 1363: 18.699999999999999, 1679: 8.1999999999999993}, 'teaching': {510: 43.799999999999997, 832: 44.200000000000003, 856: 47.299999999999997, 959: 30.399999999999999, 1232: 25.800000000000001, 1360: 33.799999999999997, 1361: 31.300000000000001, 1362: 39.299999999999997, 1363: 25.100000000000001, 1679: 32.600000000000001}, 'total_score': {510: 29.489999999999995, 832: 38.549999999999997, 856: 37.799999999999997, 959: 26.969999999999999, 1232: 37.350000000000001, 1360: 28.589999999999996, 1361: 29.489999999999998, 1362: 31.379999999999995, 1363: 27.299999999999997, 1679: 37.109999999999999}, 'university_name': {510: 'Indian Institute of Technology Bombay', 832: 'Indian Institute of Technology Kharagpur', 856: 'Indian Institute of Technology Bombay', 959: 'Indian Institute of Technology Roorkee', 1232: 'Panjab University', 1360: 'Indian Institute of Technology Delhi', 1361: 'Indian Institute of Technology Kanpur', 1362: 'Indian Institute of Technology Kharagpur', 1363: 'Indian Institute of Technology Roorkee', 1679: 'Indian Institute of Science'}, 'world_rank': {510: '301-350', 832: '226-250', 856: '251-275', 959: '351-400', 1232: '226-250', 1360: '351-400', 1361: '351-400', 1362: '351-400', 1363: '351-400', 1679: '276-300'}, 'year': {510: 2012, 832: 2013, 856: 2013, 959: 2013, 1232: 2014, 1360: 2014, 1361: 2014, 1362: 2014, 1363: 2014, 1679: 2015}})

#replace , to empty string
df['num_students'] = df.num_students.str.replace(',', '')
#replace - to '0'
df['income'] = df['income'].str.replace('-', '0')

#convert columns to float
df[['teaching', 'international', 'research', 'citations', 'income']] = 
df[['teaching', 'international', 'research', 'citations', 'income']].astype(float)

df['Total Score'] = .3 **df.research + 
                    .3 **df.citations +  
                    .3 **df.teaching +  
                    .075 **df.international +  
                    .025 **df.income

print (df)

      citations country female_male_ratio  income  international  \
510        38.8   India           16 : 84    24.2           14.3   
832        39.0   India           15 : 85    72.4           16.1   
856        45.6   India           16 : 84    52.7           19.9   
959        45.8   India           17 : 83    70.4           15.6   
1232       84.7   India           46 : 54    28.4           29.3   
1360       38.5   India           18 : 82     0.0           15.3   
1361       41.8   India           13 : 87    42.4           17.3   
1362       35.3   India           15 : 85     0.0           14.7   
1363       53.6   India           17 : 83    64.8           15.6   
1679       51.6   India           19 : 81    37.9           18.2   

     international_students num_students  research  student_staff_ratio  \
510                      1%         8327      15.7                 14.9   
832                      0%         9928      45.3                 17.5   
856                      1%         8327      33.1                 14.9   
959                      1%         8061      13.7                 18.7   
1232                     1%        16691      14.0                 23.9   
1360                     1%         8371      23.0                 17.3   
1361                     0%         6167      25.2                 12.2   
1362                     0%         9928      30.0                 17.5   
1363                     1%         8061      12.3                 18.7   
1679                     1%         3318      39.5                  8.2   

      teaching  total_score                           university_name  \
510       43.8        29.49     Indian Institute of Technology Bombay   
832       44.2        38.55  Indian Institute of Technology Kharagpur   
856       47.3        37.80     Indian Institute of Technology Bombay   
959       30.4        26.97    Indian Institute of Technology Roorkee   
1232      25.8        37.35                         Panjab University   
1360      33.8        28.59      Indian Institute of Technology Delhi   
1361      31.3        29.49     Indian Institute of Technology Kanpur   
1362      39.3        31.38  Indian Institute of Technology Kharagpur   
1363      25.1        27.30    Indian Institute of Technology Roorkee   
1679      32.6        37.11               Indian Institute of Science   

     world_rank  year   Total Score  
510     301-350  2012  6.177371e-09  
832     226-250  2013  7.776087e-19  
856     251-275  2013  4.928529e-18  
959     351-400  2013  6.863746e-08  
1232    226-250  2014  4.782972e-08  
1360    351-400  2014  1.000000e+00  
1361    351-400  2014  6.664022e-14  
1362    351-400  2014  1.000000e+00  
1363    351-400  2014  3.703322e-07  
1679    276-300  2015  9.003721e-18

Answer 2

這是最直接的方法：

df.assign(TotalScore=.3 **df.research + .3 **df.citations + .3 **df.teaching +.075 **df.international + .025 **df.income)

在python中使用數據框實現功能

問題描述

2 個解決方案

解決方案1
3 已采納 2016-08-23 05:00:55

解決方案2
1 2016-08-23 05:03:35

在python中使用數據框實現功能

問題描述

2 個解決方案

解決方案1 3 已采納 2016-08-23 05:00:55

解決方案2 1 2016-08-23 05:03:35

解決方案1
3 已采納 2016-08-23 05:00:55

解決方案2
1 2016-08-23 05:03:35