Change specific values of dataframe columns, what is the most efficient way?
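I need to change the values of specific entries in a dataframe column. I did this manually with a for loop; is there a more efficient way, using an idiom or .where? I believe the code below is not the best approach...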
# change the names of the countries as requested
for index, row in energy.iterrows():  # change the name of specific countries
    if energy.loc[index, ['Country']].str.contains('United States of America').bool():
        energy.loc[index, ['Country']] = 'United States'
        print(energy.loc[index, ['Country']])
    if energy.loc[index, ['Country']].str.contains('Republic of Korea').bool():
        energy.loc[index, ['Country']] = 'South Korea'
        print(energy.loc[index, ['Country']])
    if energy.loc[index, ['Country']].str.contains('United Kingdom of Great Britain and Northern Ireland').bool():
        energy.loc[index, ['Country']] = 'United Kingdom'
        print(energy.loc[index, ['Country']])
    if energy.loc[index, ['Country']].str.contains('China, Hong Kong Special Administrative Region').bool():
        energy.loc[index, ['Country']] = 'Hong Kong'
        print(energy.loc[index, ['Country']])
You can use np.where:
import numpy as np
# keep the existing value wherever the condition is False
energy['Country'] = np.where(energy['Country'] == 'United States of America', 'United States', energy['Country'])
energy['Country'] = np.where(energy['Country'] == 'Republic of Korea', 'Korea', energy['Country'])
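Or, with a boolean mask and .loc (a single indexing step, which avoids the chained assignment energy['Country'][mask] = ... and its SettingWithCopyWarning):
energy.loc[energy['Country'] == 'United States of America', 'Country'] = 'United States'
energy.loc[energy['Country'] == 'Republic of Korea', 'Country'] = 'Korea'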
DF:
                    Country
0  United States of America
1                     Spain
2         Republic of Korea
3                    France
Output:
         Country
0  United States
1          Spain
2          Korea
3         France
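Since the question mentions .where, here is a minimal sketch of the same idea using Series.mask(), which replaces values where a condition is True and keeps the rest. The sample frame mirrors the DF above; the renames dictionary and the loop are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample frame mirroring the DF shown above
energy = pd.DataFrame({'Country': ['United States of America', 'Spain',
                                   'Republic of Korea', 'France']})

# Hypothetical mapping of old name -> new name (same targets as the answer)
renames = {'United States of America': 'United States',
           'Republic of Korea': 'Korea'}

country = energy['Country']
for old, new in renames.items():
    # mask() substitutes `new` where the condition holds, else keeps the value
    country = country.mask(country == old, new)

energy['Country'] = country
print(energy)
```

Each pass is vectorized over the whole column, so the Python-level loop runs once per replacement rather than once per row.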
You can use map.
Declare a dictionary, then use map.
For example:
import pandas as pd
mapVal = {'United States of America': 'United States', 'Republic of Korea': 'South Korea', 'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom', 'China': 'Hong Kong', 'Hong Kong Special Administrative Region': 'Hong Kong'} #Sample Mapping
df = pd.DataFrame({'Country': ['United States of America', 'Republic of Korea', 'United Kingdom of Great Britain and Northern Ireland', 'China', 'Hong Kong Special Administrative Region']})
df["newVal"] = df["Country"].map(mapVal)  # to overwrite in place instead: df["Country"] = df["Country"].map(mapVal)
print(df)
Output:
                                             Country          newVal
0                           United States of America   United States
1                                  Republic of Korea     South Korea
2  United Kingdom of Great Britain and Northern I...  United Kingdom
3                                              China       Hong Kong
4            Hong Kong Special Administrative Region       Hong Kong
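One caveat with map: any value that is not a key in the dictionary comes back as NaN. In the example above every country appears in the mapping, so this does not show. A minimal sketch of one way to keep unmapped names, assuming a frame that also contains countries outside the mapping (the sample data here is made up for illustration):

```python
import pandas as pd

mapVal = {'United States of America': 'United States',
          'Republic of Korea': 'South Korea'}

# 'Spain' and 'France' are deliberately absent from mapVal
df = pd.DataFrame({'Country': ['United States of America', 'Spain',
                               'Republic of Korea', 'France']})

mapped = df['Country'].map(mapVal)            # unmapped rows become NaN
df['newVal'] = mapped.fillna(df['Country'])   # fall back to the original name
print(df)
```

The fillna step restores the original value wherever the dictionary had no entry.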
You can use the pandas replace() method:
energy
                                             Country
0                           United States of America
1                                  Republic of Korea
2  United Kingdom of Great Britain and Northern I...
3     China, Hong Kong Special Administrative Region
energy.replace(rep_map)
          Country
0   United States
1     South Korea
2  United Kingdom
3       Hong Kong
Note that replace() called on the dataframe replaces every cell equal to one of these strings, in any column, not only in Country.
Data:
import pandas as pd

countries = ["United States of America",
             "Republic of Korea",
             "United Kingdom of Great Britain and Northern Ireland",
             "China, Hong Kong Special Administrative Region"]
replacements = ["United States", "South Korea", "United Kingdom", "Hong Kong"]
rep_map = dict(zip(countries, replacements))  # old name -> new name
energy = pd.DataFrame({"Country": countries})
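If the dataframe-wide behavior of replace() noted above is not wanted, the replacement can be limited to one column. The sketch below shows two ways that should be equivalent; the second column, Partner, is a hypothetical addition used only to show that it stays untouched.

```python
import pandas as pd

rep_map = {'Republic of Korea': 'South Korea'}
energy = pd.DataFrame({
    'Country': ['Republic of Korea', 'Spain'],
    'Partner': ['Republic of Korea', 'France'],  # hypothetical extra column
})

# Option 1: call replace() on the single column (a Series)
by_series = energy.copy()
by_series['Country'] = by_series['Country'].replace(rep_map)

# Option 2: nested dict of the form {column: {old: new}}
by_nested = energy.replace({'Country': rep_map})

# For comparison: the dataframe-wide call also changes 'Partner'
everywhere = energy.replace(rep_map)

print(by_series)
print(everywhere)
```

Both options return new objects, so the original energy frame is only modified if the result is assigned back.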