Python: Create custom pd.dataframe class
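I want to bundle some standard tasks for a pandas DataFrame, such as initializing it with data and processing that data, into a class. I currently perform the following example steps: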
import pandas as pd
import urllib.request

def __get_data():
    URL = r'https://en.wikipedia.org/wiki/List_of_sovereign_states_' \
          r'and_dependent_territories_by_continent_(data_file)#Data_file'
    HTML_STRING = urllib.request.urlopen(URL)
    return pd.read_html(HTML_STRING)[2]

def __prepare_data(df):
    df.iloc[:, -1] = df.iloc[:, -1].str.upper()
    return df

MyDataFrame = pd.DataFrame()
MyDataFrame = __get_data()
MyDataFrame = __prepare_data(MyDataFrame)
I would like something like this:
class MyDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super(MyDataFrame, self).__init__(*args, **kwargs)
        self = self.__get_data()
        self.__prepare_data()

    def __get_data(self):
        URL = r'https://en.wikipedia.org/wiki/List_of_sovereign_states_' \
              r'and_dependent_territories_by_continent_(data_file)#Data_file'
        HTML_STRING = urllib.request.urlopen(URL)
        return pd.read_html(HTML_STRING)[2]

    def __prepare_data(self):
        self.iloc[:, -1] = self.iloc[:, -1].str.upper()
Unfortunately, I don't understand what the pandas documentation says about this case.
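One likely reason the attempt has no visible effect is that assigning to `self` inside `__init__` only rebinds a local name; the object that Python is constructing is left untouched. The following minimal sketch illustrates this without any network access. The class name `Rebinding` and the sample column are hypothetical and chosen only for the demonstration.

```python
import pandas as pd


class Rebinding(pd.DataFrame):
    """Hypothetical subclass that tries to replace itself inside __init__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # This rebinds the local name `self` only. The instance returned to
        # the caller is still the (empty) frame built by super().__init__().
        self = pd.DataFrame({"name": ["a", "b"]})


frame = Rebinding()
print(type(frame).__name__)  # the subclass type is kept
print(frame.empty)           # the instance received none of the data
```

To get data into the instance itself, the data has to be passed to `super().__init__()` or the object has to be built by a separate factory function.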
Although I don't think this is a wise approach, the following modification works:
class MyDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super(MyDataFrame, self).__init__(*args, **kwargs)
        self.data = self.__get_data()
        self.__prepare_data()

    def __get_data(self):
        URL = r'https://en.wikipedia.org/wiki/List_of_sovereign_states_' \
              r'and_dependent_territories_by_continent_(data_file)#Data_file'
        HTML_STRING = urllib.request.urlopen(URL)
        return pd.read_html(HTML_STRING)[2]

    def __prepare_data(self):
        self.data.iloc[:, -1] = self.data.iloc[:, -1].str.upper()

d = MyDataFrame()
print(d.data)
Output:
     CC   a-2  a-3      #  Name
0    AS   AF   AFG    4.0  AFGHANISTAN, ISLAMIC REPUBLIC OF
1    EU   AL   ALB    8.0  ALBANIA, REPUBLIC OF
2    AN   AQ   ATA   10.0  ANTARCTICA (THE TERRITORY SOUTH OF 60 DEG S)
3    AF   DZ   DZA   12.0  ALGERIA, PEOPLES DEMOCRATIC REPUBLIC OF
4    OC   AS   ASM   16.0  AMERICAN SAMOA
...  ...  ...  ...    ...  ...
257  AF   ZM   ZMB  894.0  ZAMBIA, REPUBLIC OF
258  AS   XD   NaN    NaN  UNITED NATIONS NEUTRAL ZONE
259  AS   XE   NaN    NaN  IRAQ-SAUDI ARABIA NEUTRAL ZONE
260  AS   XS   NaN    NaN  SPRATLY ISLANDS
261  OC   XX   NaN    NaN  DISPUTED TERRITORY
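As an alternative to storing a second frame in a `data` attribute, one possible design is to keep the subclass itself as the data container and build it through a classmethod. The sketch below is not the answer's method. It assumes the `_constructor` override described in the pandas guide on subclassing, and it introduces hypothetical names (`from_loader`, `prepare`, `fake_loader`). A small in-memory frame stands in for the Wikipedia table so the example runs offline.

```python
import pandas as pd


class MyDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        # Makes pandas operations such as copy() return MyDataFrame
        # instead of a plain DataFrame.
        return MyDataFrame

    @classmethod
    def from_loader(cls, loader):
        """Build and prepare an instance from a callable returning a DataFrame.

        `loader` is a hypothetical hook; in the question it would wrap the
        urllib / pd.read_html call.
        """
        return cls(loader()).prepare()

    def prepare(self):
        # Work on a copy so the caller's frame is not modified.
        out = self.copy()
        last = out.columns[-1]
        out[last] = out[last].str.upper()
        return out


def fake_loader():
    # Stand-in for the downloaded table (assumed sample data).
    return pd.DataFrame({"CC": ["AS", "EU"], "Name": ["Afghanistan", "Albania"]})


d = MyDataFrame.from_loader(fake_loader)
print(type(d).__name__)
print(d)
```

With this layout `d` is the prepared table itself, so the usual DataFrame methods apply to it directly and no `d.data` indirection is needed.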