
How do I convert a randomly generated 2-column array to a DataFrame?

Using a NumPy random number generator, generate arrays of the heights and weights of the 88,000 people living in Utah. The average height is 1.75 metres and the average weight is 70 kg. Assume a standard deviation of 3. Combine these two arrays using the column_stack method and convert the result into a pandas DataFrame with the first column named 'height' and the second column named 'weight'.

I've generated the random data, but I can't seem to convert the array to a DataFrame:

import numpy as np
import pandas as pd

height = np.round(np.random.normal(1.75, 3, 88000), 2)
weight = np.round(np.random.normal(70, 3, 88000), 2)
np_height = np.array(height)
np_weight = np.array(weight)

Utah = np.round(np.column_stack((np_height, np_weight)), 2)
print(Utah)
df = pd.DataFrame(
        [[np_height],
         [np_weight]],
         index = [0, 1],
         columns = ['height', 'weight'])
print(df)

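You want two columns, but the data you passed, [[np_height],[np_weight]], has only one column: it is a list of two rows, each holding a single item (an entire array). You can pass the data as a dict instead, with one key per column: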

df = pd.DataFrame({'height': np_height,
                   'weight': np_weight},
                  columns=['height', 'weight'])
print(df)
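
To see why the shape matters, here is a minimal sketch that compares the nested-list layout from the question with the stacked array and the dict form. It uses only 5 samples and a seeded generator (`np.random.default_rng(0)`) to keep the output small; both choices are assumptions for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

# Small, seeded sample (illustrative assumption; the question uses 88000 values)
rng = np.random.default_rng(0)
np_height = np.round(rng.normal(1.75, 3, 5), 2)
np_weight = np.round(rng.normal(70, 3, 5), 2)

# Layout from the question: 2 rows, each holding one whole array
nested = np.array([[np_height], [np_weight]])
print(nested.shape)   # (2, 1, 5)

# column_stack puts one person per row and one variable per column
stacked = np.column_stack((np_height, np_weight))
print(stacked.shape)  # (5, 2)

# The dict form yields the same rows-by-columns shape, with named columns
df = pd.DataFrame({'height': np_height, 'weight': np_weight})
print(df.shape)       # (5, 2)
```

A DataFrame with two named columns needs data that is n rows by 2 columns, which is what both the stacked array and the dict provide.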

The data in Utah is already in a suitable shape. Why not use that?

import numpy as np
import pandas as pd

height = np.round(np.random.normal(1.75, 3, 88000), 2)
weight = np.round(np.random.normal(70, 3, 88000), 2)
np_height = np.array(height)
np_weight = np.array(weight)

Utah = np.round(np.column_stack((np_height, np_weight)), 2)

df = pd.DataFrame(
    data=Utah,
    columns=['height', 'weight']
)
print(df.head())
   height  weight
0    3.57   65.32
1   -0.15   66.22
2    5.65   73.11
3    2.00   69.59
4    2.67   64.95
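
As a variation on the same idea, the sketch below uses NumPy's newer Generator API (`np.random.default_rng`) so the result is reproducible. The seed value 42 and the names `N_PEOPLE`, `rng`, and `utah` are assumptions for illustration, not part of the original exercise. The intermediate `np.array(...)` calls are omitted because `normal` and `np.round` already return NumPy arrays.

```python
import numpy as np
import pandas as pd

N_PEOPLE = 88_000
rng = np.random.default_rng(42)  # arbitrary seed, assumed here for reproducibility

# Means and standard deviation as given in the exercise
height = np.round(rng.normal(loc=1.75, scale=3, size=N_PEOPLE), 2)
weight = np.round(rng.normal(loc=70, scale=3, size=N_PEOPLE), 2)

# One row per person, one column per variable
utah = np.column_stack((height, weight))

df = pd.DataFrame(utah, columns=['height', 'weight'])

print(df.shape)           # (88000, 2)
print(df.mean().round(2)) # sample means, close to 1.75 and 70
```

Note that with a standard deviation of 3 applied to a mean height of 1.75, some generated heights are negative, as the -0.15 in the output above shows. That follows from the exercise's parameters, not from the conversion to a DataFrame.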
