我写了这段代码并收到了这个错误。我应该如何解决这个问题？

Question

import pandas as pd
import numpy as np
import sklearn as preprocessing


country ={'data source':['data','country name','brazil','switzerland','germany','denmark','spain','france','japan','greece','iran','kuwait','morocco','nigeria','qatar','sweden','india','world'],
'unnamed1':['nan','country code','BRA','CHE','DEU','DNK','ESP','FRA','JPN','GRC','IRN','KWT','MAR','NGA','QAT','SWE','IND','WLD'],
'unnamed2':[2016,'population growth',0.817555711,1.077221168,1.193866758,0.834637611,-0.008048086,0.407491036,-0.115284177,-0.687542545,1.1487886,2.924206194,'nan',1.148214693,1.18167997],
'unnamed3':['nan','total population',207652865,8372098,82667685,'nan',46443959,66896109,126994511,10746740,80277428,4052584,35276786,185989640,2569804,9903122,1324171354,7442135578],
'unnamed4':['area(sq.km)',8358140,39516,348900,42262,500210,547557,394560,128900,16287601,'nan',446300,910770,11610,407310,2973190,129733172.7]}
my_df = pd.DataFrame(country, index=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17], columns=['data source','unnamed1','unnamed2','unnamed3','unnamed4'])
print(my_df)

and this is the error:这是错误：

Traceback (most recent call last):
  File "c:/Users/se7en/Desktop/AI/skl.py", line 11, in <module>
    my_df = pd.DataFrame(country, index=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17], columns=['data source','unnamed1','unnamed2','unnamed3','unnamed4'])
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\frame.py", line 614, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\construction.py", line 465, in dict_to_mgr
    arrays, data_names, index, columns, dtype=dtype, typ=typ, consolidate=copy
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\construction.py", line 136, in arrays_to_mgr
    arrays, arr_names, axes, consolidate=consolidate
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\managers.py", line 1776, in create_block_manager_from_arrays
    raise construction_error(len(arrays), arrays[0].shape, axes, e)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\managers.py", line 1773, in create_block_manager_from_arrays
    blocks = _form_blocks(arrays, names, axes, consolidate)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\managers.py", line 1863, in _form_blocks
    items_dict["ObjectBlock"], np.object_, consolidate=consolidate
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\managers.py", line 1903, in _simple_blockify
    values, placement = _stack_arrays(tuples, dtype)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\internals\managers.py", line 1959, in _stack_arrays
    stacked[i] = arr
ValueError: could not broadcast input array from shape (15,) into shape (18,)

Answer 1

All the lists/arrays in dictionary must have the same length for the DataFrame constructor to accept the input.字典中的所有列表/数组必须具有相同的长度，以便 DataFrame 构造函数接受输入。

This is not the case with your data:您的数据不是这种情况：

{k:len(v) for k,v in country.items()}

output: output：

{'data source': 18,
 'unnamed1': 18,
 'unnamed2': 15,
 'unnamed3': 18,
 'unnamed4': 17}

Either trim the elements to the min length, or pad the shortest ones to the max length.要么将元素修剪到最小长度，要么将最短的元素填充到最大长度。

Another option to circumvent this might be to use a dictionary of Series, which will do the padding job automatically:另一种避免这种情况的方法可能是使用 Series 的字典，它将自动完成填充工作：

df = pd.DataFrame({k:pd.Series(v) for k,v in country.items()})

output: output：

     data source      unnamed1           unnamed2          unnamed3     unnamed4
0           data           nan               2016               nan  area(sq.km)
1   country name  country code  population growth  total population      8358140
2         brazil           BRA           0.817556         207652865        39516
3    switzerland           CHE           1.077221           8372098       348900
4        germany           DEU           1.193867          82667685        42262
5        denmark           DNK           0.834638               nan       500210
6          spain           ESP          -0.008048          46443959       547557
7         france           FRA           0.407491          66896109       394560
8          japan           JPN          -0.115284         126994511       128900
9         greece           GRC          -0.687543          10746740     16287601
10          iran           IRN           1.148789          80277428          nan
11        kuwait           KWT           2.924206           4052584       446300
12       morocco           MAR                nan          35276786       910770
13       nigeria           NGA           1.148215         185989640        11610
14         qatar           QAT            1.18168           2569804       407310
15        sweden           SWE                NaN           9903122      2973190
16         india           IND                NaN        1324171354  129733172.7
17         world           WLD                NaN        7442135578          NaN

NB.注意。 you should clarify the output you expect as it seems here that your lists are mixing labels and data您应该澄清您期望的 output，因为在这里您的列表似乎混合了标签和数据

我写了这段代码并收到了这个错误。我应该如何解决这个问题？

问题描述

1 个解决方案

解决方案1
1 已采纳 2022-03-28 08:56:34

我写了这段代码并收到了这个错误。 我应该如何解决这个问题？

问题描述

1 个解决方案

解决方案1 1 已采纳 2022-03-28 08:56:34

我写了这段代码并收到了这个错误。我应该如何解决这个问题？

解决方案1
1 已采纳 2022-03-28 08:56:34