Convert a numpy float64 sparse matrix to a pandas data frame

I have an n x n numpy float64 sparse matrix, data (where n = 44), whose rows and columns are graph nodes and whose values are edge weights:

>>> data
<44x44 sparse matrix of type '<class 'numpy.float64'>'
    with 668 stored elements in Compressed Sparse Row format>

>>> type(data)
<class 'scipy.sparse.csr.csr_matrix'>

>>> print(data)
  (0, 7)    0.11793236293516568
  (0, 9)    0.10992000939300195
  (0, 21)   0.7422196678913772
  (0, 23)   0.0630039712667936
  (0, 24)   0.027037442463504143
  (0, 27)   0.16908845414214152
  (0, 28)   0.6109227233402952
  (0, 32)   0.0514765253537568
  (0, 33)   0.016341754080557713
  (1, 6)    0.015070325434709386
  (1, 10)   9.346673769086203e-05
  (1, 11)   0.2471018034781923
  (1, 14)   0.0020684269551621776
  (1, 18)   0.015258704502643251
  (1, 20)   0.021798149289490358
  (1, 22)   0.0087026831764125
  (1, 24)   0.1454235884185166
  (1, 25)   0.022060777594183015
  (1, 29)   0.9117391202819067
  (1, 30)   0.018557883854566116
  (1, 31)   0.001876070225734826
  (1, 32)   0.025841354399637764
  (1, 33)   0.014766488228364438
  (1, 39)   0.002791226433410351
  (1, 43)   1.0
  : :
  (41, 7)   0.8922099840113696
  (41, 10)  0.015776226631920767
  (41, 12)  1.0
  (41, 15)  0.1839408706622038
  (41, 18)  0.5151025641025642
  (41, 20)  0.4599130036630037
  (41, 22)  0.29378473237788827
  (41, 33)  0.47474890700697153
  (41, 39)  1.0
  (42, 2)   1.0
  (42, 10)  0.023305789342610222
  (42, 11)  0.011349136164776494
  (42, 12)  1.0
  (42, 17)  0.886081346522542
  (42, 18)  1.0
  (42, 30)  1.0
  (42, 40)  1.0
  (43, 1)   1.0
  (43, 6)   1.0
  (43, 11)  0.039948959300013256
  (43, 13)  1.0
  (43, 14)  0.02669811947637717
  (43, 29)  1.0
  (43, 30)  1.0
  (43, 36)  0.3381986531986532

I would like to convert it to a pandas data frame so that I can write it to a file, with the following columns: node1, node2, edge_weight. The result would look like this:

node1, node2, edge_weight
0, 7, 0.11793236293516568
0, 9, 0.10992000939300195
:, :, :
43, 36, 0.3381986531986532

Any idea how to do this?

Note:

>>> pandas.DataFrame(data)

gives:

                                                    0
0     (0, 7)\t0.11793236293516568\n  (0, 9)\t0.109...
1     (0, 6)\t0.015070325434709386\n  (0, 10)\t9.3...

>>> pandas.DataFrame(print(data))

gives:

  (0, 7)    0.11793236293516568
  (0, 9)    0.10992000939300195

So I guess pandas.DataFrame(print(data)) is close to what I am looking for.

You can try toarray:

pd.DataFrame(A.toarray())
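
Note that toarray produces a dense n x n frame (one column per node) rather than the three-column edge list asked for. The sketch below shows one way the dense array could be reshaped into that form; the small matrix A is a made-up stand-in for the real data, and the approach drops any explicitly stored zeros.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical 2x2 stand-in for the real sparse matrix.
A = sparse.csr_matrix(np.array([[0.0, 0.5],
                                [1.0, 0.0]]))

# toarray() gives a dense array, so this frame is n x n, not an edge list.
dense = A.toarray()
dense_df = pd.DataFrame(dense)

# Recover (row, column, value) triples from the non-zero entries.
rows, cols = np.nonzero(dense)
edge_df = pd.DataFrame({
    "node1": rows,
    "node2": cols,
    "edge_weight": dense[rows, cols],
})
```

For large matrices the dense intermediate can be expensive in memory, which is one reason to prefer a route that stays sparse.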

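This ipython session shows one way you could do it. The two steps are: convert the sparse matrix to COO format, then create the Pandas DataFrame using the .row, .col and .data attributes of the COO matrix.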

In [50]: data                                                                                                    
Out[50]: 
<15x15 sparse matrix of type '<class 'numpy.float64'>'
    with 11 stored elements in Compressed Sparse Row format>

In [51]: print(data)                                                                                             
  (1, 12)   0.8581958095588134
  (6, 12)   0.03828052946099181
  (6, 14)   0.7908634838351427
  (7, 1)    0.7995008873930302
  (7, 11)   0.48477191537121145
  (7, 13)   0.6226526443518743
  (9, 4)    0.37242576669669103
  (11, 1)   0.9604278557580955
  (11, 5)   0.13285436036287313
  (12, 11)  0.5631419223609928
  (13, 8)   0.16481624650723847

In [52]: import pandas as pd                                                                                     

In [53]: c = data.tocoo()                                                                                        

In [54]: df = pd.DataFrame({'node1': c.row, 'node2': c.col, 'edge_weight': c.data})

In [55]: df                                                                                                      
Out[55]: 
    node1  node2  edge_weight
0       1     12     0.858196
1       6     12     0.038281
2       6     14     0.790863
3       7      1     0.799501
4       7     11     0.484772
5       7     13     0.622653
6       9      4     0.372426
7      11      1     0.960428
8      11      5     0.132854
9      12     11     0.563142
10     13      8     0.164816
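
The same two steps can be put into a standalone script that also covers the "write it to a file" part of the question. This is a minimal sketch: the 3x3 matrix is invented for illustration, and the CSV is written to an in-memory buffer so that the example has no side effects. Passing a file path (for example a hypothetical "edges.csv") to to_csv would write to disk instead.

```python
import io

import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical stand-in for the 44x44 matrix in the question.
data = sparse.csr_matrix(np.array([
    [0.0,  0.5, 0.0],
    [0.0,  0.0, 1.0],
    [0.25, 0.0, 0.0],
]))

# Step 1: convert to COO, which stores parallel row/col/data arrays.
coo = data.tocoo()

# Step 2: build the three-column edge list from those arrays.
edges = pd.DataFrame({
    "node1": coo.row,
    "node2": coo.col,
    "edge_weight": coo.data,
})

# Write as CSV without the index column (here into a string buffer).
buffer = io.StringIO()
edges.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```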

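I ran into a similar problem when using OneHotEncoder, and I fixed it by changing sparse to False (in scikit-learn 1.2 and later this parameter is named sparse_output):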

enc = OneHotEncoder(sparse=False)
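
For the OneHotEncoder case, a sketch along these lines could produce a dense array that goes straight into a DataFrame. The colors array is made up for illustration, and the try/except is only an assumption about how one might support both the newer sparse_output name and the older sparse name.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single-feature input: one categorical value per row.
colors = np.array([["red"], ["green"], ["red"], ["blue"]])

try:
    # scikit-learn 1.2 and later use sparse_output.
    enc = OneHotEncoder(sparse_output=False)
except TypeError:
    # Older releases only know the sparse keyword.
    enc = OneHotEncoder(sparse=False)

# With dense output, fit_transform returns a plain numpy array.
encoded = enc.fit_transform(colors)

# Label the columns with the learned categories of the single feature.
onehot_df = pd.DataFrame(encoded, columns=enc.categories_[0])
```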

