
Convert a NumPy record array (from to_records()) into a pandas DataFrame

I have the following sample data:

rec.array([('FY20',  361.410592  ,  nan, 21.97, nan, 'Total', 'Fast'),
       ('FY21',  359.26952604,  -1., 22.99,  5., 'Total', 'Fast'),
       ('FY22',  362.4560529 ,   1., 22.77, -1., 'Total', 'Fast'),
       ('FY23',  371.53543252,   2., 21.92, -4., 'Total', 'Fast'),
       ('FY24',  374.48894494,   1., 21.88, -0., 'Total', 'Fast'),
       ('FY25',  377.09481613,   1., 21.85, -0., 'Total', 'Fast'),
       ('FY20',   67.043756  ,  nan, 21.  , nan, 'Homes', 'Fast'),
       ('FY21',  110.12145222,  63., 20.95, -0., 'Homes', 'Fast'),
       ('FY22',  117.46526727,   7., 20.73, -1., 'Homes', 'Fast'),
       ('FY23',  125.83482531,   7., 18.99, -8., 'Homes', 'Fast'),
       ('FY24',  126.16748411,   1., 18.95, -0., 'Homes', 'Fast'),
       ('FY25',  127.786528  ,   1., 18.96,  0., 'Homes', 'Fast'),
       ('FY20',  294.366836  ,  nan, 22.19, nan, 'Businesses', 'Fast'),
       ('FY21',  249.14807381, -15., 23.89,  8., 'Businesses', 'Fast'),
       ('FY22',  245.99078563,  -2., 23.74, -1., 'Businesses', 'Fast'),
       ('FY23',  245.70060721,   0., 23.42, -1., 'Businesses', 'Fast'),
       ('FY24',  247.32146083,   1., 23.37, -0., 'Businesses', 'Fast'),
       ('FY25',  250.30828813,   1., 23.33, -0., 'Businesses', 'Fast'),
       ('FY20',  184.631684  ,  nan, 15.47, nan, 'Total', 'Medium'),
       ('FY21',  274.25718084,  49., 15.53,  0., 'Total', 'Medium'),
       ('FY22',  333.23835913,  21., 15.33, -1., 'Total', 'Medium'),
       ('FY23',  357.33167549,   7., 15.52,  1., 'Total', 'Medium'),
       ('FY24',  367.84796426,   3., 15.53,  0., 'Total', 'Medium'),
       ('FY25',  370.1664439 ,   1., 15.53,  0., 'Total', 'Medium'),
       ('FY20',   46.522416  ,  nan, 17.89, nan, 'Homes', 'Medium'),
       ('FY21',   97.63428522, 112., 18.72,  5., 'Homes', 'Medium'),
       ('FY22',  141.25547499,  46., 17.86, -5., 'Homes', 'Medium'),
       ('FY23',  157.06766598,  11., 18.33,  3., 'Homes', 'Medium'),
       ('FY24',  163.02337094,   4., 18.29, -0., 'Homes', 'Medium'),
       ('FY25',  165.98360465,   1., 18.28, -0., 'Homes', 'Medium'),
       ('FY20',  138.109268  ,  nan, 14.66, nan, 'Businesses', 'Medium'),
       ('FY21',  177.62289562,  28., 13.77, -6., 'Businesses', 'Medium'),
       ('FY22',  191.98288414,   8., 13.46, -2., 'Businesses', 'Medium'),
       ('FY23',  200.26400951,   4., 13.31, -1., 'Businesses', 'Medium'),
       ('FY24',  203.82459332,   2., 13.31,  0., 'Businesses', 'Medium'),
       ('FY25',  205.18283926,   1., 13.31,  0., 'Businesses', 'Medium')],
      dtype=[('FY', 'O'), ('ADV', '<f8'), ('YoY_ADV', '<f8'), ('Yield', '<f8'), ('YoY_Yld', '<f8'), ('Cut', 'O'), ('Product', 'O')])

I am having a hard time finding examples where the output of DataFrame.to_records() is converted back into a DataFrame. How do I convert this record array into a DataFrame?

The printed repr refers to rec.array and nan, so import both names from numpy before pasting it:

from numpy import rec, nan
a = rec.array([('FY20', 361.410592, nan, 21.97, nan, 'Total', 'Fast'), ('FY21', 359.26952604, -1., 22.99, 5., 'Total', 'Fast'), ('FY22', 362.4560529, 1., 22.77, -1., 'Total', 'Fast'), ('FY23', 371.53543252, 2., 21.92, -4., 'Total', 'Fast'), ('FY24', 374.48894494, 1., 21.88, -0., 'Total', 'Fast'), ('FY25', 377.09481613, 1., 21.85, -0., 'Total', 'Fast'), ('FY20', 67.043756, nan, 21., nan, 'Homes', 'Fast'), ('FY21', 110.12145222, 63., 20.95, -0., 'Homes', 'Fast'), ('FY22', 117.46526727, 7., 20.73, -1., 'Homes', 'Fast'), ('FY23', 125.83482531, 7., 18.99, -8., 'Homes', 'Fast'), ('FY24', 126.16748411, 1., 18.95, -0., 'Homes', 'Fast'), ('FY25', 127.786528, 1., 18.96, 0., 'Homes', 'Fast'), ('FY20', 294.366836, nan, 22.19, nan, 'Businesses', 'Fast'), ('FY21', 249.14807381, -15., 23.89, 8., 'Businesses', 'Fast'), ('FY22', 245.99078563, -2., 23.74, -1., 'Businesses', 'Fast'), ('FY23', 245.70060721, 0., 23.42, -1., 'Businesses', 'Fast'), ('FY24', 247.32146083, 1., 23.37, -0., 'Businesses', 'Fast'), ('FY25', 250.30828813, 1., 23.33, -0., 'Businesses', 'Fast'), ('FY20', 184.631684, nan, 15.47, nan, 'Total', 'Medium'), ('FY21', 274.25718084, 49., 15.53, 0., 'Total', 'Medium'), ('FY22', 333.23835913, 21., 15.33, -1., 'Total', 'Medium'), ('FY23', 357.33167549, 7., 15.52, 1., 'Total', 'Medium'), ('FY24', 367.84796426, 3., 15.53, 0., 'Total', 'Medium'), ('FY25', 370.1664439, 1., 15.53, 0., 'Total', 'Medium'), ('FY20', 46.522416, nan, 17.89, nan, 'Homes', 'Medium'), ('FY21', 97.63428522, 112., 18.72, 5., 'Homes', 'Medium'), ('FY22', 141.25547499, 46., 17.86, -5., 'Homes', 'Medium'), ('FY23', 157.06766598, 11., 18.33, 3., 'Homes', 'Medium'), ('FY24', 163.02337094, 4., 18.29, -0., 'Homes', 'Medium'), ('FY25', 165.98360465, 1., 18.28, -0., 'Homes', 'Medium'), ('FY20', 138.109268, nan, 14.66, nan, 'Businesses', 'Medium'), ('FY21', 177.62289562, 28., 13.77, -6., 'Businesses', 'Medium'), ('FY22', 191.98288414, 8., 13.46, -2., 'Businesses', 'Medium'), ('FY23', 200.26400951, 4., 13.31, -1., 
'Businesses', 'Medium'), ('FY24', 203.82459332, 2., 13.31, 0., 'Businesses', 'Medium'), ('FY25', 205.18283926, 1., 13.31, 0., 'Businesses', 'Medium')], dtype=[('FY', 'O'), ('ADV', '<f8'), ('YoY_ADV', '<f8'), ('Yield', '<f8'), ('YoY_Yld', '<f8'), ('Cut', 'O'), ('Product', 'O')])

Then pass the record array to the pd.DataFrame constructor; the field names in its dtype become the column names:

import pandas as pd
df = pd.DataFrame(a)

#       FY         ADV  YoY_ADV  Yield  YoY_Yld         Cut Product
# 0   FY20  361.410592      NaN  21.97      NaN       Total    Fast
# 1   FY21  359.269526     -1.0  22.99      5.0       Total    Fast
# 2   FY22  362.456053      1.0  22.77     -1.0       Total    Fast
# 3   FY23  371.535433      2.0  21.92     -4.0       Total    Fast
# 4   FY24  374.488945      1.0  21.88     -0.0       Total    Fast
# 5   FY25  377.094816      1.0  21.85     -0.0       Total    Fast
# 6   FY20   67.043756      NaN  21.00      NaN       Homes    Fast
# 7   FY21  110.121452     63.0  20.95     -0.0       Homes    Fast
# 8   FY22  117.465267      7.0  20.73     -1.0       Homes    Fast
# 9   FY23  125.834825      7.0  18.99     -8.0       Homes    Fast
# 10  FY24  126.167484      1.0  18.95     -0.0       Homes    Fast
# 11  FY25  127.786528      1.0  18.96      0.0       Homes    Fast
# 12  FY20  294.366836      NaN  22.19      NaN  Businesses    Fast
# 13  FY21  249.148074    -15.0  23.89      8.0  Businesses    Fast
# 14  FY22  245.990786     -2.0  23.74     -1.0  Businesses    Fast
# 15  FY23  245.700607      0.0  23.42     -1.0  Businesses    Fast
# 16  FY24  247.321461      1.0  23.37     -0.0  Businesses    Fast
# 17  FY25  250.308288      1.0  23.33     -0.0  Businesses    Fast
# 18  FY20  184.631684      NaN  15.47      NaN       Total  Medium
# 19  FY21  274.257181     49.0  15.53      0.0       Total  Medium
# 20  FY22  333.238359     21.0  15.33     -1.0       Total  Medium
# 21  FY23  357.331675      7.0  15.52      1.0       Total  Medium
# 22  FY24  367.847964      3.0  15.53      0.0       Total  Medium
# 23  FY25  370.166444      1.0  15.53      0.0       Total  Medium
# 24  FY20   46.522416      NaN  17.89      NaN       Homes  Medium
# 25  FY21   97.634285    112.0  18.72      5.0       Homes  Medium
# 26  FY22  141.255475     46.0  17.86     -5.0       Homes  Medium
# 27  FY23  157.067666     11.0  18.33      3.0       Homes  Medium
# 28  FY24  163.023371      4.0  18.29     -0.0       Homes  Medium
# 29  FY25  165.983605      1.0  18.28     -0.0       Homes  Medium
# 30  FY20  138.109268      NaN  14.66      NaN  Businesses  Medium
# 31  FY21  177.622896     28.0  13.77     -6.0  Businesses  Medium
# 32  FY22  191.982884      8.0  13.46     -2.0  Businesses  Medium
# 33  FY23  200.264010      4.0  13.31     -1.0  Businesses  Medium
# 34  FY24  203.824593      2.0  13.31      0.0  Businesses  Medium
# 35  FY25  205.182839      1.0  13.31      0.0  Businesses  Medium
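
If the record array comes from DataFrame.to_records() in the first place, the full round trip can be sketched as below. This is a minimal illustration, not part of the original answer: the three-row frame reuses a few values from the question, and the variable names (original, records, restored, and so on) are made up for the example. It also shows pd.DataFrame.from_records, which accepts the same kind of input and can turn the 'index' field that to_records() adds by default back into the index.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values borrowed from the question's data).
original = pd.DataFrame({
    "FY": ["FY20", "FY21", "FY22"],
    "ADV": [361.410592, 359.26952604, 362.4560529],
    "YoY_ADV": [np.nan, -1.0, 1.0],
    "Cut": ["Total", "Total", "Total"],
})

# DataFrame -> NumPy record array, dropping the index.
records = original.to_records(index=False)

# Record array -> DataFrame; field names become column names.
restored = pd.DataFrame(records)

# By default to_records() stores the index in a field named 'index'.
records_with_index = original.to_records()

# from_records can use that field as the index instead of a column.
restored_with_index = pd.DataFrame.from_records(records_with_index, index="index")

print(restored)
print(restored_with_index)
```

Passing index=False to to_records() is usually the simpler choice when the index is just the default 0..n-1 range, since nothing then needs to be restored.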
