Create a DataFrame from a list of lists (Pandas)
I'm having trouble creating a DataFrame from my list.
Each row in the list contains four values, but pandas reports that the data has only one column:
ValueError: 4 columns passed, passed data had 1 columns.
Printed inside the loop, the list looks like this:
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]
I know it has something to do with the double [], but I can't figure it out. Can someone help me?
Here is the code so far:
for i in range(6):
    excel_file = pd.read_excel(input_file, sheet_name=sheet[i])
    excel_file = excel_file.values.tolist()
    filtered = [x for x in excel_file
                if 'TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)' in x
                or 'TOTAL DAS DESPESAS DE CUSTEIO (A)' in x]
    sheet_file = sheet[i]
    sheet_variable.append(sheet_file)
    wb_name.append(file_name)
    conab_data.append(filtered)
    print(filtered)

df_conab = pd.DataFrame(conab_data, columns=['Descrição', 'Preço/ha', 'Scs/ha', 'Part. %'])
df_conab['Local/UF/Ano'] = sheet_variable
df_conab['Fonte'] = wb_name
print(df_conab)
You could fix this with a for loop that replaces each one-element sub-list with the row inside it:
import pandas as pd

overly_nested = [[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]]

# Replace each single-row sub-list with the row it contains
for i, sub_list in enumerate(overly_nested):
    overly_nested[i] = sub_list[0]

df = pd.DataFrame(overly_nested)
print(df)
I'm sure there's a way to do this with zip(); let me experiment and I'll edit if I find it.
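zip() is probably not the most natural tool for this. As a rough sketch of an alternative, assuming every element of the outer list is itself a list of rows, itertools.chain.from_iterable removes exactly one level of nesting. Unlike sub_list[0], it also copes with sub-lists that hold zero rows or several rows. The name nested and the arrangement of the rows (including the empty sub-list) are illustrative assumptions; the values are taken from the question.

```python
from itertools import chain

import pandas as pd

LABEL = 'TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)'

# Hypothetical arrangement of the question's rows: one sub-list per sheet,
# each holding whatever rows matched on that sheet.
nested = [
    [[LABEL, 559.64, 8.01, 0.5520765512479038]],
    [[LABEL, 520.34, 7.44, 0.5393857093988743]],
    [],  # a sheet with no matching row (assumed case)
    [[LABEL, 556.72, 7.96, 0.5410827096899603],
     [LABEL, 688.67, 9.84, 0.5845350761787548]],
]

# As is, pandas sees rows with a single cell each (the cell is a list),
# which is why passing four column names raises the ValueError.
one_column = pd.DataFrame(nested[:2])
print(one_column.shape)

# Remove exactly one level of nesting, leaving a flat list of 4-item rows.
rows = list(chain.from_iterable(nested))
df = pd.DataFrame(rows, columns=['Descrição', 'Preço/ha', 'Scs/ha', 'Part. %'])
print(df)
```

If every sub-list is guaranteed to hold exactly one row, the for loop above gives the same result.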
You can try:你可以试试:
import pandas as pd

data = [
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]
]

# x[0] unwraps the single row held by each sub-list
df = pd.DataFrame([x[0] for x in data], columns=['A', 'B', 'C', 'D'])
print(df)
Output:
                                              A       B     C         D
0  TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)  559.64  8.01  0.552077
1  TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)  520.34  7.44  0.539386
2  TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)  556.72  7.96  0.541083
3  TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)  688.67  9.84  0.584535
4  TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)  625.30  8.94  0.561295
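Another option is to avoid the extra nesting at the source. In the question's loop, filtered is already a list of rows, so conab_data.append(filtered) stores that whole list as a single element, while conab_data.extend(filtered) stores the rows themselves. The sketch below assumes that diagnosis. To run without the workbook it swaps pd.read_excel for a small in-memory dict; the sheets dict, its sheet names, the non-matching row and the file name are all made-up placeholders. Recording the sheet and file name once per matched row keeps the extra columns the same length as the data, even if a sheet yields zero or several matches.

```python
import pandas as pd

LABELS = ('TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)',
          'TOTAL DAS DESPESAS DE CUSTEIO (A)')

# Hypothetical stand-in for excel_file.values.tolist() on each sheet.
# Sheet names and the non-matching row are invented for illustration.
sheets = {
    'sheet_1': [['OUTRA LINHA', 1.0, 2.0, 3.0],
                [LABELS[0], 559.64, 8.01, 0.5520765512479038]],
    'sheet_2': [[LABELS[0], 520.34, 7.44, 0.5393857093988743]],
}
file_name = 'example.xlsx'  # placeholder for the question's file_name

conab_data, sheet_variable, wb_name = [], [], []
for sheet_name, sheet_rows in sheets.items():
    # Keep the rows that contain either label as one of their cells
    filtered = [x for x in sheet_rows if any(label in x for label in LABELS)]
    conab_data.extend(filtered)  # add the rows, not a list of rows
    sheet_variable.extend([sheet_name] * len(filtered))
    wb_name.extend([file_name] * len(filtered))

df_conab = pd.DataFrame(conab_data,
                        columns=['Descrição', 'Preço/ha', 'Scs/ha', 'Part. %'])
df_conab['Local/UF/Ano'] = sheet_variable
df_conab['Fonte'] = wb_name
print(df_conab)
```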