Shift values from one column to another in Python pandas
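I'm new to Python and am stuck on the following:
a) Where Total contains NaN, how do I shift the row's values one column to the right, so that the values currently in Rank through Bronze end up in NOC through Total?
b) After the shift, how do I fill the now-missing Rank value with the value from the row above it?
Problem: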
Rank NOC Gold Silver Bronze Total
0 1 United States (USA) 46 37 38 121
1 2 Argentina (ARG) 3 1 0 4
2 3 Denmark (DEN) 2 6 7 15
3 4 Sweden (SWE) 2 6 3 11
4 5 South Africa (RSA) 2 6 2 10
5 6 Sweden (SWE) 2 6 3 11
**6 Tajikistan (TJK) 1 0 0 1 NaN**
7 7 Malaysia (MAS) 0 4 1 5
Expected result:
Rank NOC Gold Silver Bronze Total
0 1 United States (USA) 46 37 38 121
1 2 Argentina (ARG) 3 1 0 4
2 3 Denmark (DEN) 2 6 7 15
3 4 Sweden (SWE) 2 6 3 11
4 5 South Africa (RSA) 2 6 2 10
5 6 Sweden (SWE) 2 6 3 11
**6 6 Tajikistan (TJK) 1 0 0 1**
7 7 Malaysia (MAS) 0 4 1 5
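I would do it this way: add up Gold, Silver and Bronze with weights, so that one gold outweighs any number of silvers and so on (with the weights below, this holds as long as every silver and bronze count stays under 100). Then you can use rank: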
In [11]: (df["Gold"] * 10000 + df["Silver"] * 100 + df["Bronze"])
Out[11]:
0 463738
1 30100
2 20607
3 20603
4 20602
5 20603
6 10000
7 401
dtype: int64
In [12]: (df["Gold"] * 10000 + df["Silver"] * 100 + df["Bronze"]).rank(method='first', ascending=False)
Out[12]:
0 1.0
1 2.0
2 3.0
3 4.0
4 6.0
5 5.0
6 7.0
7 8.0
dtype: float64
This is how I did it. It works, though I'm not sure how well optimized it is. As long as the answer is correct, that's good enough for me.
import pandas as pd
# df is assumed to be the DataFrame shown in the question
# Step 1
# Flag the misaligned rows: Total2 is True wherever Total is NaN
df.loc[:, 'Total2'] = df['Total'].isnull()
# Step 2
# For the flagged rows, shift the values one column to the right (Rank..Bronze -> NOC..Total).
# Assign from right to left so each column is read before it is overwritten.
# .ix was removed in pandas 1.0; .loc does the same label-based selection.
df.loc[df['Total2'], 'Total'] = df['Bronze']
df.loc[df['Total2'], 'Bronze'] = df['Silver']
df.loc[df['Total2'], 'Silver'] = df['Gold']
df.loc[df['Total2'], 'Gold'] = df['NOC']
df.loc[df['Total2'], 'NOC'] = df['Rank']
# Step 3
# Clean up the Rank column: non-numeric entries become NaN,
# then each NaN takes the value from the row above (forward fill)
df['Rank2'] = pd.to_numeric(df['Rank'], errors='coerce')
df['fill_forward'] = df['Rank2'].ffill()  # fillna(method='ffill') is deprecated
del df['Rank']
del df['Rank2']
del df['Total2']
df = df.rename(columns={'fill_forward': 'Rank'})
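As an alternative to the column-by-column assignments, the same repair can be sketched with `DataFrame.shift(axis=1)` on just the flagged rows. This is a different technique from the answer above, not the author's method. The sample frame is rebuilt by hand, and storing Rank as strings is an assumption about how scraped data would typically arrive.

```python
import numpy as np
import pandas as pd

columns = ["Rank", "NOC", "Gold", "Silver", "Bronze", "Total"]
rows = [
    ["1", "United States (USA)", 46, 37, 38, 121],
    ["2", "Argentina (ARG)", 3, 1, 0, 4],
    ["3", "Denmark (DEN)", 2, 6, 7, 15],
    ["4", "Sweden (SWE)", 2, 6, 3, 11],
    ["5", "South Africa (RSA)", 2, 6, 2, 10],
    ["6", "Sweden (SWE)", 2, 6, 3, 11],
    ["Tajikistan (TJK)", 1, 0, 0, 1, np.nan],  # misaligned row
    ["7", "Malaysia (MAS)", 0, 4, 1, 5],
]
df = pd.DataFrame(rows, columns=columns)

# Work on an object-dtype copy so values can move between columns freely.
fixed = df.astype(object)

# Rows whose Total is missing are the ones shifted one column to the left.
misaligned = fixed["Total"].isna()

# Shift those rows one column to the right; Rank becomes NaN for them.
fixed.loc[misaligned] = fixed.loc[misaligned].shift(1, axis=1)

# Fill the missing Rank from the row above, then restore numeric dtypes.
fixed["Rank"] = pd.to_numeric(fixed["Rank"], errors="coerce").ffill().astype(int)
fixed = fixed.astype({col: int for col in ["Gold", "Silver", "Bronze", "Total"]})

print(fixed)
```

Compared with the step-by-step version, this keeps the original column order and avoids the temporary helper columns.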