How to create a datatable Frame from a matrix of values (list of lists) and a list of features, using the Python datatable library

Given a list of n features:

lf = ['f1','f2',...,'fn']

Given a list of m lists, each nested list containing n values (a matrix of m rows and n columns):

matrix = [
    [r0_v1, r0_v2, ..., r0_vn],
    [r1_v1, r1_v2, ..., r1_vn],
    ...
    [rm_v1, rm_v2, ..., rm_vn],
]

What is the correct way to create a datatable Frame using the Python datatable library?

I tried something similar to the pandas DataFrame constructor in the following source code:

import pandas as pd
import datatable as dt

# create a pandas DataFrame
pd_df = pd.DataFrame(matrix, columns=lf)  # works fine

# create a datatable Frame
dt_df = dt.Frame(matrix, names=lf)  # raises an error: the rows are treated as columns

But I get an error: ValueError: The names argument contains n elements, which is more than the number of columns being created (m)
which means that the rows are treated as columns.

Thanks for your help.

To create a datatable Frame from a matrix and a list of features, use dt.Frame(matrix_values, names=list_features).
dt.Frame treats a plain list of lists as a list of columns, so first convert the list of lists to a 2D array with np.array, which is read row by row: matrix = np.array(matrix)

import datatable as dt
import numpy as np

lf = ['f1','f2','f3','f4','f5']

matrix = [
    [0,0,0,0,0],
    [1,1,1,1,1],
    [2,2,2,2,2],
]

# convert to a 2D NumPy array so that each inner list becomes a row
matrix = np.array(matrix)

dt_df = dt.Frame(matrix,names=lf)

print(dt_df)
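
If you would rather not go through NumPy, another option is to transpose the rows yourself and pass dt.Frame a dict that maps each feature name to its column. This is only a sketch and not part of the original answer: the helper name rows_to_columns is made up for illustration, and the datatable import is guarded so the transposition still runs where the library is not installed.

```python
def rows_to_columns(rows, names):
    """Transpose a row-major list of lists into a {name: column} dict.

    Hypothetical helper, not part of datatable.
    """
    if any(len(row) != len(names) for row in rows):
        raise ValueError("every row must have exactly one value per feature name")
    # zip(*rows) yields one tuple per column; pair each with its name
    return {name: list(col) for name, col in zip(names, zip(*rows))}


lf = ['f1', 'f2', 'f3', 'f4', 'f5']

matrix = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2],
]

columns = rows_to_columns(matrix, lf)

try:
    import datatable as dt
except ImportError:
    dt = None  # datatable is not installed; only the transposed dict is built

if dt is not None:
    # a dict is read as {column name: column values}, so no names= is needed
    dt_df = dt.Frame(columns)
    print(dt_df)
```

The length check makes a ragged matrix fail early with a clear message, since zip would otherwise silently truncate to the shortest row.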
