
NumPy: different data types in a multidimensional array

I want to put 2D data into a NumPy array and then turn it into a pandas DataFrame. The data has 10 rows and 5 columns, and the column data types are, in order: int, string, int, string, string.

import random

import numpy as np

_MaxRows = 10
_MaxColumns = 5

person = np.zeros([_MaxRows, _MaxColumns])
print(person)

def personGen():
    for i in range(0, _MaxRows):
        # add person dynamically (this is the part I cannot get to work)
        # person[i][0] = i
        # person[i][1] = names[i]
        # person[i][2] = random.randint(10,50)
        # person[i][3] = random.choice(['M','F'])
        # person[i][4] = 'Desc'
        pass  # placeholder so the loop body is valid Python

personGen()

Required output as a DataFrame:

Id  Name     Age  Gender  Desc
1   Sumeet   12   'M'     'HELLO'
2   Sumeet2  13   'M'     'HELLO2'

A plain NumPy array has a single data type shared by all of its elements, so it cannot hold int and string columns side by side. Instead, you could keep one 1-D array (or list) per column, each with its own data type, and collect them in a dict.

For example:

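import numpy as np
import pandas as pd

names = ["asd", "shd", "wdf"]
ages = np.array([12, 35, 23])

# each dict key becomes a DataFrame column with its own dtype
d = {'name': names, 'age': ages}
df = pd.DataFrame(data=d)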
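
Applied to the five columns in the question, the same idea might look like the sketch below. It first shows why the np.zeros approach fails (the array is float64, so assigning a name raises ValueError), then builds one column per field. The placeholder names, the seeded random.Random instance, and the variable names max_rows and columns are assumptions made for illustration; they are not from the question.

```python
import random

import numpy as np
import pandas as pd

# Why the original attempt fails: np.zeros gives a float64 array,
# and a string cannot be converted to a float on assignment.
person = np.zeros([2, 5])
try:
    person[0][1] = "Sumeet"
    assignment_failed = False
except ValueError:
    assignment_failed = True
print("string assignment failed:", assignment_failed)

# One list or 1-D array per column keeps each column's own type.
max_rows = 10
names = [f"Person{i}" for i in range(1, max_rows + 1)]  # hypothetical placeholder names
rng = random.Random(0)  # seeded only so the sketch is reproducible

columns = {
    "Id": np.arange(1, max_rows + 1),
    "Name": names,
    "Age": np.array([rng.randint(10, 50) for _ in range(max_rows)]),
    "Gender": [rng.choice(["M", "F"]) for _ in range(max_rows)],
    "Desc": ["Desc"] * max_rows,
}
df = pd.DataFrame(columns)
print(df.dtypes)
print(df.head())
```

The dict keys become the column names, and pandas infers a separate dtype for each column, so the integer columns stay numeric.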

Building from yar's answer:

A regular NumPy array, whether one-dimensional or multidimensional, cannot mix data types. A NumPy structured array can: each named field has its own data type.

import random

import numpy
import pandas as pd

_MaxRows = 2
_MaxNameSize = 7
_MaxDescSize = 4

names = ['Sumeet', 'Sumeet2']

# one tuple per row: (id, name, age, gender, description)
data = [(i, names[i], random.randint(10, 50), random.choice(['M', 'F']), 'Desc') for i in range(_MaxRows)]

# 'U<n>' is a fixed-width Unicode string of n characters; 'S<n>' would store bytes, which display as b'...' in Python 3
nparray = numpy.array(data, dtype=[('id', int), ('name', f'U{_MaxNameSize}'), ('age', int), ('Gender', 'U1'), ('Desc', f'U{_MaxDescSize}')])

# If you cannot specify the string sizes in advance, you can use object fields as below. However, I have read that this slows down NumPy operations, because the array then stores references to Python objects and loses the benefit of contiguous memory.
# nparray = numpy.array(data, dtype=[('id', int), ('name', object), ('age', int), ('Gender', 'U1'), ('Desc', object)])

print('Numpy structured array:\n', nparray)

# the field names of the structured array become the DataFrame columns
pddataframe = pd.DataFrame(nparray)

print('\nPandas dataframe:\n', pddataframe)
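
To illustrate what the per-field types buy you, here is a minimal sketch that reads individual fields back out of a structured array. The two sample rows come from the output requested in the question; the variable names people, clipped, and frame are hypothetical. It also shows one caveat of fixed-width string fields: a value longer than the declared width is silently truncated.

```python
import numpy as np
import pandas as pd

# Sample rows taken from the output requested in the question.
people = np.array(
    [(1, "Sumeet", 12, "M", "HELLO"), (2, "Sumeet2", 13, "M", "HELLO2")],
    dtype=[("id", int), ("name", "U7"), ("age", int), ("Gender", "U1"), ("Desc", "U6")],
)

# Each field is a 1-D view with its own dtype, so numeric operations still work.
print(people["age"].dtype.kind)  # integer kind
print(people["age"].mean())      # 12.5

# Boolean masks built from one field can select whole rows.
older = people[people["age"] > 12]
print(older["name"])

# Caveat: fixed-width string fields truncate longer values without warning.
clipped = np.array([("Sumeet2345",)], dtype=[("name", "U7")])
print(clipped["name"][0])  # only the first 7 characters are kept

frame = pd.DataFrame(people)
print(frame)
```

If truncation is a concern, sizing the field from the longest value in the data, or falling back to an object field, are two possible ways around it.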

A pandas DataFrame already creates a default index (0, 1, ...) for your incoming data, so you may choose to omit the 'id' column.
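
As a rough sketch of the two options, the snippet below either relies on the index pandas creates (shifted to start at 1 to match the requested output) or keeps an explicit 'id' column and promotes it to the index. The sample rows and the names frame and with_id are assumptions made for illustration.

```python
import pandas as pd

# Option 1: no 'id' column; pandas supplies a default 0-based index.
frame = pd.DataFrame({"name": ["Sumeet", "Sumeet2"], "age": [12, 13]})
default_index = list(frame.index)
print(default_index)

# Shift the index to start at 1 and give it a name.
frame.index = pd.RangeIndex(start=1, stop=len(frame) + 1, name="Id")
print(frame)

# Option 2: keep an explicit 'id' column and use it as the index.
with_id = pd.DataFrame(
    {"id": [1, 2], "name": ["Sumeet", "Sumeet2"], "age": [12, 13]}
).set_index("id")
print(with_id)
```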


 