I have an n x n numpy float64 sparse matrix (data, where n = 44), where the rows and columns are graph nodes and the values are edge weights:
>>> data
<44x44 sparse matrix of type '<class 'numpy.float64'>'
with 668 stored elements in Compressed Sparse Row format>
>>> type(data)
<class 'scipy.sparse.csr.csr_matrix'>
>>> print(data)
(0, 7) 0.11793236293516568
(0, 9) 0.10992000939300195
(0, 21) 0.7422196678913772
(0, 23) 0.0630039712667936
(0, 24) 0.027037442463504143
(0, 27) 0.16908845414214152
(0, 28) 0.6109227233402952
(0, 32) 0.0514765253537568
(0, 33) 0.016341754080557713
(1, 6) 0.015070325434709386
(1, 10) 9.346673769086203e-05
(1, 11) 0.2471018034781923
(1, 14) 0.0020684269551621776
(1, 18) 0.015258704502643251
(1, 20) 0.021798149289490358
(1, 22) 0.0087026831764125
(1, 24) 0.1454235884185166
(1, 25) 0.022060777594183015
(1, 29) 0.9117391202819067
(1, 30) 0.018557883854566116
(1, 31) 0.001876070225734826
(1, 32) 0.025841354399637764
(1, 33) 0.014766488228364438
(1, 39) 0.002791226433410351
(1, 43) 1.0
: :
(41, 7) 0.8922099840113696
(41, 10) 0.015776226631920767
(41, 12) 1.0
(41, 15) 0.1839408706622038
(41, 18) 0.5151025641025642
(41, 20) 0.4599130036630037
(41, 22) 0.29378473237788827
(41, 33) 0.47474890700697153
(41, 39) 1.0
(42, 2) 1.0
(42, 10) 0.023305789342610222
(42, 11) 0.011349136164776494
(42, 12) 1.0
(42, 17) 0.886081346522542
(42, 18) 1.0
(42, 30) 1.0
(42, 40) 1.0
(43, 1) 1.0
(43, 6) 1.0
(43, 11) 0.039948959300013256
(43, 13) 1.0
(43, 14) 0.02669811947637717
(43, 29) 1.0
(43, 30) 1.0
(43, 36) 0.3381986531986532
I'd like to convert it to a pandas DataFrame, in order to write it to a file, with the columns node1, node2, edge_weight, which would give:
node1, node2, edge_weight
0, 7, 0.11793236293516568
0, 9, 0.10992000939300195
:, :, :
43, 36, 0.3381986531986532
Any idea how to do that?
Note that:
>>> pandas.DataFrame(data)
gives:
0
0 (0, 7)\t0.11793236293516568\n (0, 9)\t0.109...
1 (0, 6)\t0.015070325434709386\n (0, 10)\t9.3...
And
>>> pandas.DataFrame(print(data))
Gives:
(0, 7) 0.11793236293516568
(0, 9) 0.10992000939300195
So I guess pandas.DataFrame(print(data)) is close to what I'm looking for.
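A side note on the last attempt, offered as a small sketch rather than a diagnosis of the exact session above: print() writes its text to the console and returns None, so the DataFrame constructor receives no data at all. The string printed here is just a placeholder taken from the output above.

```python
import pandas as pd

# print() sends the text to stdout and always returns None.
result = print("(0, 7) 0.11793236293516568")

# The constructor therefore receives None and builds an empty frame;
# the text seen on screen came from print(), not from pandas.
df = pd.DataFrame(result)
```

The text that appears on screen therefore never ends up inside the frame.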
You can try toarray:
pd.DataFrame(A.toarray())
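Note that toarray produces a dense n x n frame (one column per node, zeros included), not the three-column edge list asked for. The sketch below, using a hypothetical 2x2 matrix named A, shows one possible way to reshape that dense frame into node1, node2, edge_weight rows. It assumes that zero means "no edge", so any explicitly stored zeros would be dropped as well.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical 2x2 weighted adjacency matrix, for illustration only.
A = sparse.csr_matrix(np.array([[0.0, 0.5], [0.25, 0.0]]))

# Dense "wide" frame: one row and one column per node, zeros included.
wide = pd.DataFrame(A.toarray())

# Reshape to one row per (row, column) pair, then keep only non-zero weights.
edges = wide.stack().reset_index()
edges.columns = ["node1", "node2", "edge_weight"]
edges = edges[edges["edge_weight"] != 0].reset_index(drop=True)
```

For large matrices the dense intermediate can be costly in memory, which is worth keeping in mind before using this route.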
This ipython session shows one way you could do it. The two steps are: convert the sparse matrix to COO format, and then create the Pandas DataFrame using the .row, .col and .data attributes of the COO matrix.
In [50]: data
Out[50]:
<15x15 sparse matrix of type '<class 'numpy.float64'>'
with 11 stored elements in Compressed Sparse Row format>
In [51]: print(data)
(1, 12) 0.8581958095588134
(6, 12) 0.03828052946099181
(6, 14) 0.7908634838351427
(7, 1) 0.7995008873930302
(7, 11) 0.48477191537121145
(7, 13) 0.6226526443518743
(9, 4) 0.37242576669669103
(11, 1) 0.9604278557580955
(11, 5) 0.13285436036287313
(12, 11) 0.5631419223609928
(13, 8) 0.16481624650723847
In [52]: import pandas as pd
In [53]: c = data.tocoo()
In [54]: df = pd.DataFrame({'node1': c.row, 'node2': c.col, 'edge_weight': c.data})
In [55]: df
Out[55]:
node1 node2 edge_weight
0 1 12 0.858196
1 6 12 0.038281
2 6 14 0.790863
3 7 1 0.799501
4 7 11 0.484772
5 7 13 0.622653
6 9 4 0.372426
7 11 1 0.960428
8 11 5 0.132854
9 12 11 0.563142
10 13 8 0.164816
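Since the goal in the question was to write the result to a file, here is a self-contained sketch of the same two steps followed by a CSV export. The 4x4 matrix is made up for illustration, and an in-memory StringIO buffer stands in for a real file path; passing a filename to to_csv instead should work the same way.

```python
import io

import numpy as np
import pandas as pd
from scipy import sparse

# Small hypothetical 4x4 weighted adjacency matrix in CSR format.
dense = np.array([
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.25],
    [0.0, 0.0, 0.0, 0.0],
    [0.75, 0.0, 0.0, 0.0],
])
data = sparse.csr_matrix(dense)

# COO format exposes parallel arrays: row indices, column indices, values.
c = data.tocoo()
df = pd.DataFrame({"node1": c.row, "node2": c.col, "edge_weight": c.data})

# index=False leaves out the DataFrame's own row labels,
# so the output has exactly the three requested columns.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Only the stored entries of the sparse matrix appear in the DataFrame, so the output has one row per edge.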
I ran into a similar problem when using OneHotEncoder. I fixed it by setting sparse to False (in scikit-learn 1.2 and later this parameter is named sparse_output):
enc = OneHotEncoder(sparse=False)