Pandas: Dataframe to Panel error: NotImplementedError: Only 2-level MultiIndex are supported
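I am having trouble sending a dataframe to to_panel. I do some preliminary manipulation of the data first, and I am worried that these steps may be causing the problem.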
merge.head()
date catcode type di cid feccandid amount disposition bills
0 2005-12-31 G1100 24K D N00004045 H2MI11042 1500 support 1
1 2005-12-31 L1100 24K D N00004045 H2MI11042 8000 support 1
2 2005-12-31 L1100 24K D N00004155 H2MI02066 1000 oppose 1
3 2005-12-31 T1200 24K D N00004166 H4MI03045 3000 support 1
Then I build a pivot_table:
mm = merge.pivot_table(index=['date', 'feccandid', 'disposition',
                              'bills', 'cid', 'di', 'type'],
                       columns='catcode', values='amount',
                       fill_value=0)
catcode A0000 A1000 A1100 A1200 A1300 A1400 A1500 A1600 A2000 A2300 ... T9100 T9400 X3700 X4000 X4100 X4110 X5000 X7000 Y0000 Z5200
date feccandid disposition bills cid di type
2005-12-31 H2MI02066 oppose 1 N00004155 D 24K 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H2MI11042 support 1 N00004045 D 24K 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H4MI03045 support 1 N00004166 D 24K 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
3 rows × 315 columns
Then I reset the index:
mm = mm.reset_index()
mm.head()
catcode date feccandid disposition bills cid di type A0000 A1000 A1100 ... T9100 T9400 X3700 X4000 X4100 X4110 X5000 X7000 Y0000 Z5200
0 2005-12-31 H2MI02066 oppose 1 N00004155 D 24K 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1 2005-12-31 H2MI11042 support 1 N00004045 D 24K 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2 2005-12-31 H4MI03045 support 1 N00004166 D 24K 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Then I write it to csv:
mm.to_csv('i.test', index=False)
Read it back from csv:
hh = pd.read_csv('i.test')
Set the index:
hh.set_index(['date', 'feccandid']).head(3)
disposition bills cid di type A0000 A1000 A1100 A1200 A1300 ... T9100 T9400 X3700 X4000 X4100 X4110 X5000 X7000 Y0000 Z5200
date feccandid
2005-12-31 H2MI02066 oppose 1 N00004155 D 24K 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H2MI11042 support 1 N00004045 D 24K 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H4MI03045 support 1 N00004166 D 24K 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Convert to a panel:
hh.to_panel()
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-86-9358192e71a3> in <module>()
----> 1 hh.to_panel()
/home/jayaramdas/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in to_panel(self)
1210 if (not isinstance(self.index, MultiIndex) or # pragma: no cover
1211 len(self.index.levels) != 2):
-> 1212 raise NotImplementedError('Only 2-level MultiIndex are supported.')
1213
1214 if not self.index.is_unique:
NotImplementedError: Only 2-level MultiIndex are supported.
Any ideas, questions, or criticism?
set_index does not operate in place, so your hh does not have a MultiIndex as its index.
>>> hh.to_panel()
Traceback (most recent call last):
File "<ipython-input-4-9358192e71a3>", line 1, in <module>
hh.to_panel()
File "/home/dsm/sys/pys/3.5.1/lib/python3.5/site-packages/pandas/core/frame.py", line 1224, in to_panel
raise NotImplementedError('Only 2-level MultiIndex are supported.')
NotImplementedError: Only 2-level MultiIndex are supported.
>>> hh.set_index(["date", "feccandid"]).to_panel()
<class 'pandas.core.panel.Panel'>
Dimensions: 20 (items) x 1 (major_axis) x 3 (minor_axis)
Items axis: catcode to Z5200
Major_axis axis: 2005-12-31 to 2005-12-31
Minor_axis axis: H2MI02066 to H4MI03045
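The same cause can be checked without calling to_panel at all. The sketch below is only an illustration: the three-row frame is a cut-down stand-in for the question's data, and has_two_level_index is a hypothetical helper that mirrors the guard visible in the traceback (a MultiIndex with exactly two levels).

```python
import pandas as pd

# Three rows modeled on the question's data; only one value column is kept.
hh = pd.DataFrame({
    "date": ["2005-12-31"] * 3,
    "feccandid": ["H2MI02066", "H2MI11042", "H4MI03045"],
    "bills": [1, 1, 1],
})


def has_two_level_index(df):
    """Hypothetical helper mirroring the check shown in the to_panel traceback."""
    return isinstance(df.index, pd.MultiIndex) and len(df.index.levels) == 2


# set_index returns a new frame; hh itself is left unchanged.
indexed = hh.set_index(["date", "feccandid"])

print(has_two_level_index(hh))       # hh still carries its default index
print(has_two_level_index(indexed))  # the returned frame carries the MultiIndex
```

Running a check like this before the conversion step shows whether the frame being converted is the one that actually carries the two-level index.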
You could add inplace=True to the set_index call, but simply changing it to hh = hh.set_index(...) is preferable.
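As a rough side-by-side of the two options, here is a minimal sketch. The make_frame helper and its three rows are made up for illustration and stand in for the frame read back from the csv.

```python
import pandas as pd


def make_frame():
    # Hypothetical stand-in for the frame returned by pd.read_csv.
    return pd.DataFrame({
        "date": ["2005-12-31"] * 3,
        "feccandid": ["H2MI02066", "H2MI11042", "H4MI03045"],
        "bills": [1, 1, 1],
    })


# Preferred: rebind the name to the frame that set_index returns.
hh = make_frame()
hh = hh.set_index(["date", "feccandid"])

# Alternative: modify the frame in place; the call itself then returns None.
hh_inplace = make_frame()
result = hh_inplace.set_index(["date", "feccandid"], inplace=True)

print(hh.index.nlevels)       # number of index levels after the fix
print(hh.index.is_unique)     # the traceback shows to_panel also checks uniqueness
print(hh.equals(hh_inplace))  # both routes end with the same frame
```

Rebinding keeps the call chainable (for example hh.set_index(...).sort_index()), whereas the inplace form returns None and cannot be chained.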
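A note on xarray: I think Panel is gradually being deprecated in favor of the more feature-rich xarray N-d objects, so you may want to consider installing xarray and then doing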
>>> hh.to_xarray()
<xarray.Dataset>
Dimensions: (date: 1, feccandid: 3)
Coordinates:
* date (date) object '2005-12-31'
* feccandid (feccandid) object 'H2MI02066' 'H2MI11042' 'H4MI03045'
Data variables:
catcode (date, feccandid) int64 0 1 2
disposition (date, feccandid) object 'oppose' 'support' 'support'
bills (date, feccandid) int64 1 1 1
cid (date, feccandid) object 'N00004155' 'N00004045' 'N00004166'
di (date, feccandid) object 'D' 'D' 'D'
type (date, feccandid) object '24K' '24K' '24K'
[...]
instead of building a Panel.