
How to change value of a pd.DataFrame based on a condition?

I have a FIFA dataset that includes information about football players. One of its features is each player's value, but it is stored as a string such as "€300K" or "€50M". How can I simply remove the euro sign and the "M"/"K" suffixes and express all the values in the same unit?

import numpy as np
import pandas as pd

location = r'C:\Users\bemrem\Desktop\Python\fifa\fifa_dataset.csv'

_dataframe = pd.read_csv(location)

_dataframe = _dataframe.dropna()
_dataframe = _dataframe.reset_index(drop=True)
_dataframe = _dataframe[['Name', 'Value', 'Nationality', 'Age', 'Wage', 
'Overall', 'Potential']]

_array = ['Belgium', 'France', 'Brazil', 'Croatia', 'England',' Portugal', 
'Uruguay', 'Switzerland', 'Spain', 'Denmark']

_dataframe = _dataframe.loc[_dataframe['Nationality'].isin(_array)]
_dataframe = _dataframe.reset_index(drop=True) 


print(_dataframe.head())
print()
print(_dataframe.tail())

I tried to convert this Value column but I failed. This is what I get:

           Name   Value Nationality  Age   Wage  Overall  Potential
0        Neymar   €123M      Brazil   25  €280K       92         94
1     L. Suárez    €97M     Uruguay   30  €510K       92         92
2     E. Hazard  €90.5M     Belgium   26  €295K       90         91
3  Sergio Ramos    €52M       Spain   31  €310K       90         90
4  K. De Bruyne    €83M     Belgium   26  €285K       89         92

              Name Value Nationality  Age Wage  Overall  Potential
4931    A. Kilgour  €40K     England   19  €1K       47         56
4932      R. White  €60K     England   18  €2K       47         65
4933     T. Sawyer  €50K     England   18  €1K       46         58
4934     J. Keeble  €40K     England   18  €1K       46         56
4935  J. Lundstram  €60K     England   18  €1K       46         64

But I want my output to look like this:

           Name   Value Nationality  Age   Wage  Overall  Potential
0        Neymar   123      Brazil   25  €280K       92         94
1     L. Suárez    97     Uruguay   30  €510K       92         92
2     E. Hazard  90.5     Belgium   26  €295K       90         91
3  Sergio Ramos    52       Spain   31  €310K       90         90
4  K. De Bruyne    83     Belgium   26  €285K       89         92

              Name Value Nationality  Age Wage  Overall  Potential
4931    A. Kilgour  0.04     England   19  €1K       47         56
4932      R. White  0.06     England   18  €2K       47         65
4933     T. Sawyer  0.05     England   18  €1K       46         58
4934     J. Keeble  0.04     England   18  €1K       46         56
4935  J. Lundstram  0.06     England   18  €1K       46         64

I do not have enough reputation to flag this question as a duplicate. However, I believe the question linked below will solve your particular problem, and it also covers the case where there is no "K" or "M" in your string.

You will also need to replace $ with € in the regex.

Convert the string 2.90K to 2900 or 5.2M to 5200000 in pandas dataframe
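
For this particular dataset, the idea can be sketched roughly as follows. This is only a minimal illustration, not the code from the linked question: the sample rows, the `to_millions` helper and the `DIVISOR_TO_MILLIONS` table are hypothetical names made up here, and it assumes every value looks like "€", then a number, then an optional "K" or "M".

```python
import pandas as pd

# Hypothetical sample mirroring the "Value" column from the question.
df = pd.DataFrame({
    "Name": ["Neymar", "E. Hazard", "A. Kilgour", "No suffix"],
    "Value": ["€123M", "€90.5M", "€40K", "€0"],
})

# Divisor that turns each unit into millions of euros.
# An empty suffix is assumed to mean the number is in plain euros.
DIVISOR_TO_MILLIONS = {"M": 1, "K": 1_000, "": 1_000_000}


def to_millions(values: pd.Series) -> pd.Series:
    """Convert strings such as '€90.5M' or '€40K' to floats in millions."""
    # Capture the numeric part and the optional K/M suffix after the euro sign.
    parts = values.str.extract(r"^€?(?P<number>\d+(?:\.\d+)?)(?P<unit>[KM]?)$")
    number = parts["number"].astype(float)
    divisor = parts["unit"].map(DIVISOR_TO_MILLIONS)
    # Rows that do not match the pattern end up as NaN instead of raising.
    return number / divisor


df["Value"] = to_millions(df["Value"])
print(df)
```

With the sample above, Neymar's value becomes 123.0 and A. Kilgour's becomes 0.04, which matches the units in the desired output. The same helper could be applied to the Wage column if it follows the same format.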
