Shift values from one column to another in Python pandas
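I am new to Python and am running into the following problems:
a) Where Total contains NaN, how do I shift the values in Rank through Bronze one column to the right, so that they end up in NOC through Total?
b) After the shift, how do I fill the now-missing Rank value with the value taken from the row above it?
Problem: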
Rank NOC Gold Silver Bronze Total
0 1 United States (USA) 46 37 38 121
1 2 Argentina (ARG) 3 1 0 4
2 3 Denmark (DEN) 2 6 7 15
3 4 Sweden (SWE) 2 6 3 11
4 5 South Africa (RSA) 2 6 2 10
5 6 Sweden (SWE) 2 6 3 11
**6 Tajikistan (TJK) 1 0 0 1 NaN**
7 7 Malaysia (MAS) 0 4 1 5
Expected result:
Rank NOC Gold Silver Bronze Total
0 1 United States (USA) 46 37 38 121
1 2 Argentina (ARG) 3 1 0 4
2 3 Denmark (DEN) 2 6 7 15
3 4 Sweden (SWE) 2 6 3 11
4 5 South Africa (RSA) 2 6 2 10
5 6 Sweden (SWE) 2 6 3 11
**6 6 Tajikistan (TJK) 1 0 0 1**
7 7 Malaysia (MAS) 0 4 1 5
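I would do it like this: combine Gold, Silver and Bronze into a single score, with weights that make one gold count for more than any number of silvers, and so on (weights of 10000 and 100 do this as long as every Silver and Bronze count stays below 100), and then you can use rank: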
In [11]: (df["Gold"] * 10000 + df["Silver"] * 100 + df["Bronze"])
Out[11]:
0 463738
1 30100
2 20607
3 20603
4 20602
5 20603
6 10000
7 401
dtype: int64
In [12]: (df["Gold"] * 10000 + df["Silver"] * 100 + df["Bronze"]).rank(method='first', ascending=False)
Out[12]:
0 1.0
1 2.0
2 3.0
3 4.0
4 6.0
5 5.0
6 7.0
7 8.0
dtype: float64
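As a self-contained sketch of the same idea, the snippet below rebuilds the medal columns from the corrected table and wraps the weighted score in a small helper. The name `medal_score` and the guard on counts of 100 or more are my own additions for illustration, not part of the original answer; the weights are the ones used above.

```python
import pandas as pd

# Medal counts copied from the corrected table in the question.
medals = pd.DataFrame({
    "Gold":   [46, 3, 2, 2, 2, 2, 1, 0],
    "Silver": [37, 1, 6, 6, 6, 6, 0, 4],
    "Bronze": [38, 0, 7, 3, 2, 3, 0, 1],
})


def medal_score(frame):
    """Hypothetical helper: weighted score with the weights 10000 / 100 / 1."""
    # Assumption: the weights only preserve gold > silver > bronze ordering
    # while every Silver and Bronze count is below 100.
    if (frame[["Silver", "Bronze"]] >= 100).any().any():
        raise ValueError("Silver/Bronze counts of 100 or more need larger weights")
    return frame["Gold"] * 10000 + frame["Silver"] * 100 + frame["Bronze"]


score = medal_score(medals)

# method="first" breaks ties by row order, so every row gets a distinct rank.
rank_first = score.rank(method="first", ascending=False).astype(int)

# method="min" gives tied rows the same (lowest) rank instead.
rank_min = score.rank(method="min", ascending=False).astype(int)

print(pd.DataFrame({"score": score, "first": rank_first, "min": rank_min}))
```

With `method="first"` the two identical Sweden rows get ranks 4 and 5; with `method="min"` they both get rank 4 and the next row gets 6. Which one you want depends on whether tied countries should share a rank.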
This is how I did it. It works, but I am not sure how well optimized it is. As long as the answer is correct, that is good enough for me.
import pandas as pd
# Step 1
# Flag the rows whose Total is missing. These are the rows whose values
# sit one column too far to the left.
df['Total2'] = df['Total'].isnull()
# Step 2
# For the flagged rows, move each value one column to the right, working
# from Total back to NOC so that no value is overwritten before it is copied.
# (.ix was removed in pandas 1.0; .loc is its replacement here.)
df.loc[df['Total2'], 'Total'] = df['Bronze']
df.loc[df['Total2'], 'Bronze'] = df['Silver']
df.loc[df['Total2'], 'Silver'] = df['Gold']
df.loc[df['Total2'], 'Gold'] = df['NOC']
df.loc[df['Total2'], 'NOC'] = df['Rank']
# Step 3
# Clean up the Rank column. Non-numeric entries become NaN and are then
# filled with the value from the row above.
df['Rank2'] = pd.to_numeric(df['Rank'], errors='coerce')
df['fill_forward'] = df['Rank2'].ffill()
# Drop the helper columns and keep the cleaned ranks under the name Rank.
del df['Rank']
del df['Rank2']
del df['Total2']
df = df.rename(columns={'fill_forward': 'Rank'})
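For comparison, here is a more compact sketch of the same three steps. It rebuilds the input from the table in the question (so the exact dtypes are an assumption), does the one-column shift on an object array so that strings and numbers can move between columns freely, and keeps Rank as the first column. The function name `shift_misaligned_rows` is made up for this example.

```python
import numpy as np
import pandas as pd

# Input rebuilt from the question: row 6 is shifted one column to the left.
df = pd.DataFrame({
    "Rank": [1, 2, 3, 4, 5, 6, "Tajikistan (TJK)", 7],
    "NOC": ["United States (USA)", "Argentina (ARG)", "Denmark (DEN)",
            "Sweden (SWE)", "South Africa (RSA)", "Sweden (SWE)", 1,
            "Malaysia (MAS)"],
    "Gold": [46, 3, 2, 2, 2, 2, 0, 0],
    "Silver": [37, 1, 6, 6, 6, 6, 0, 4],
    "Bronze": [38, 0, 7, 3, 2, 3, 1, 1],
    "Total": [121, 4, 15, 11, 10, 11, np.nan, 5],
})


def shift_misaligned_rows(frame):
    """Hypothetical helper: shift rows with a missing Total one column right."""
    bad = frame["Total"].isna().to_numpy()

    # Work on an object-dtype copy so any value can land in any column.
    values = frame.to_numpy(dtype=object, copy=True)
    values[bad, 1:] = values[bad, :-1]   # move Rank..Bronze into NOC..Total
    values[bad, 0] = np.nan              # Rank is now unknown for these rows

    out = pd.DataFrame(values, index=frame.index, columns=frame.columns)

    # Fill the missing ranks from the row above.
    # Assumption: the first row is never one of the misaligned rows.
    out["Rank"] = pd.to_numeric(out["Rank"], errors="coerce").ffill().astype(int)

    # Restore integer dtypes for the medal columns.
    for col in ["Gold", "Silver", "Bronze", "Total"]:
        out[col] = pd.to_numeric(out[col]).astype(int)
    return out


fixed = shift_misaligned_rows(df)
print(fixed)
```

The original frame is left untouched, and no helper columns need to be created and deleted afterwards.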