What is the fastest way to make a DataFrame from a list?
So basically I'm trying to transform a list into a DataFrame.
Here are the two ways of doing it that I am trying, but I cannot come up with a good performance benchmark.
import pandas as pd
mylist = [1,2,3,4,5,6]
names = ["name","name","name","name","name","name"]
# Way 1
pd.DataFrame([mylist], columns=names)
# Way 2
pd.DataFrame.from_records([mylist], columns=names)
I also tried dask, but I did not find anything that worked for me.
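One way the two approaches above could be benchmarked is with the standard-library `timeit` module. This is only a minimal sketch: the helper `best_time`, the distinct column names, and the repeat counts are assumptions made for illustration, and the numbers will depend on your machine and pandas version.

```python
import timeit

import pandas as pd

mylist = [1, 2, 3, 4, 5, 6]
# Distinct column names are an assumption made here for readability.
names = ["a", "b", "c", "d", "e", "f"]


def way_1():
    # Plain constructor: one row, six columns
    return pd.DataFrame([mylist], columns=names)


def way_2():
    # Same data through from_records
    return pd.DataFrame.from_records([mylist], columns=names)


def best_time(func, number=200, repeat=5):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number


# Time both ways and report microseconds per call
results = {func.__name__: best_time(func) for func in (way_1, way_2)}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Taking the minimum over several repeats tends to reduce noise from other processes. For a single short row like this, the constructor overhead probably dominates, so any difference between the two ways may be small.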
I made up an example with 10 arrays of 1 million random values each (one dask partition per array), and I got the maximum very quickly. Maybe this gives you a start for working with dask?
The approach was proposed here, and it is also related to this question.
import dask.dataframe as dd
from dask.delayed import delayed
import pandas as pd
import numpy as np
# Create a list of 10 arrays, each holding 1e6 random floats scaled by i
list_large = [np.random.random_sample(int(1e6))*i for i in range(10)]
# Convert it to dask dataframe
dfs = [delayed(pd.DataFrame)(i) for i in list_large]
df = dd.from_delayed(dfs)
# Calculate Maximum
max_values = df.max().compute()  # renamed so the built-in max() is not shadowed
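Note that `dd.from_delayed` treats each delayed frame as one partition and stacks them row-wise, so the dask code above yields a single column split into 10 partitions, not 10 columns. If one column per array is what you are after and the data fits in memory, plain pandas may already be enough. A minimal sketch, assuming hypothetical column names `col_0` to `col_9`, a fixed seed, and a smaller row count to keep it quick:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen here only for reproducibility
n_rows = 100_000  # smaller than the 1e6 used above, to keep the sketch quick

# One array per column, scaled by its index as in the example above
list_large = [rng.random(n_rows) * i for i in range(10)]

# Build the frame from a dict so each array becomes its own column.
# The names col_0 ... col_9 are hypothetical.
df_wide = pd.DataFrame({f"col_{i}": arr for i, arr in enumerate(list_large)})

# Maximum of each column
max_per_column = df_wide.max()
print(df_wide.shape)
print(max_per_column)
```

Building from a dict of arrays lets pandas take each column as it is, which is usually cheaper than building row by row from nested lists.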