Create a pandas DataFrame from nested lists of unequal lengths
I have the following lists:
aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
Then I combine them into a nested list:
nest = [aa, bb, cc]
I want to create a DataFrame that looks like this:
aa bb cc
aa1 bb1 cc1
aa2 bb2 cc2
aa3 bb3 cc3
aa4 bb4 nan
aa5 nan nan
I tried:
pd.DataFrame(nest, columns=['aa', 'bb', 'cc'])
But the result is that each list is written as a row (rather than as a column).
The zip_longest function from itertools does this:
>>> import itertools, pandas
>>> pandas.DataFrame((_ for _ in itertools.zip_longest(*nest)), columns=['aa', 'bb', 'cc'])
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 None
4 aa5 None None
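If you have an older version of pandas, you may need to wrap zip_longest in the list constructor. On older Python versions (Python 2), you may need to call izip_longest instead of zip_longest.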
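The two compatibility notes above can be combined into one sketch. It assumes the `aa`, `bb`, `cc` lists from the question; the `try`/`except` import and the `rows` variable are illustrative choices, not part of the original answer.

```python
import pandas as pd

try:
    from itertools import zip_longest  # Python 3 name
except ImportError:
    # Python 2 exposes the same function as izip_longest
    from itertools import izip_longest as zip_longest

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# Materialize the iterator as a list of tuples, which older pandas
# versions may require; shorter lists are padded with None.
rows = list(zip_longest(*nest))

df = pd.DataFrame(rows, columns=['aa', 'bb', 'cc'])
```

zip_longest also accepts a `fillvalue` keyword if a padding value other than None is wanted.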
Option 1
pd.DataFrame(nest, ['aa', 'bb', 'cc']).T
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 None
4 aa5 None None
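Option 1 is terse, so here is a slightly more explicit sketch of what it appears to be doing: the second positional argument of the constructor is the row index, each inner list becomes one padded row, and the transpose turns those rows into columns. The intermediate name `wide` is only for illustration.

```python
import pandas as pd

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# Each inner list becomes one row labelled 'aa', 'bb', 'cc';
# shorter rows are padded with missing values on the right.
wide = pd.DataFrame(nest, index=['aa', 'bb', 'cc'])

# Transposing swaps rows and columns, so the labels become column names.
df = wide.T
```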
Option 2
A homemade zip_longest:
f = lambda x, n: x[n] if n < len(x) else None
n = max(map(len, nest))
pd.DataFrame(
    [[f(j, i) for j in nest] for i in range(n)],
    columns=['aa', 'bb', 'cc']
)
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 None
4 aa5 None None
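For readers who find the lambda hard to follow, the same idea can be written as a small named function. This is a sketch only; `pad_columns` and its `fill` parameter are hypothetical names introduced here, not part of the original answer.

```python
import pandas as pd

def pad_columns(lists, fill=None):
    """Return rows built from `lists`, padding shorter lists with `fill`."""
    longest = max(len(lst) for lst in lists)
    return [
        [lst[i] if i < len(lst) else fill for lst in lists]
        for i in range(longest)
    ]

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# Row i holds the i-th element of every list, or `fill` when a list is too short.
df = pd.DataFrame(pad_columns(nest), columns=['aa', 'bb', 'cc'])
```

The rows it builds should match what itertools.zip_longest produces, just as lists instead of tuples.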
Or perhaps:
pd.DataFrame(data={'value':nest},index=['aa', 'bb', 'cc']).value.apply(pd.Series).T
Out[1297]:
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 NaN
4 aa5 NaN NaN