
Must have equal len keys and value when setting with an iterable

I have two dataframes as follows:

leader:
    0 11
    1 8
    2 5
    3 9
    4 8
    5 6
    [6065 rows x 2 columns]

DatasetLabel:    
    Unnamed: 0      0    1  ....    7     8    9  10  11  12  
    0               A    J  ....    1     2    5 NaN NaN NaN  
    1               B    K  ....    3     4   NaN  NaN NaN NaN  

    [4095 rows x 14 columns]

In DatasetLabel, columns 0 to 6 hold descriptive data, and columns 7 to 12 hold indexes that refer to the first column of the leader DataFrame.

I want to create a dataset in which each index in DatasetLabel is replaced by the corresponding value from leader, that is, leader.iloc[index, 1].

How can I do this in Python?

The output should look like:

 DatasetLabel:    
        Unnamed: 0      0    1  ....    7     8    9  10  11  12  
        0               A    J  ....    8     5    6 NaN NaN NaN  
        1               B    K  ....    9     8   NaN  NaN NaN NaN  

I have come up with the following, but I get an error:

    for column in DatasetLabel.ix[:,8:13]:
        DatasetLabel[DatasetLabel[column].notnull ()]=leader.iloc[DatasetLabel[DatasetLabel[column].notnull ()][column].values,1]

Error:

ValueError: Must have equal len keys and value when setting with an iterable

You can use apply to index into leader and exchange values with DatasetLabel, although it's not very pretty.

One issue is that Pandas won't let us index with NaN. Converting to str provides a workaround. But that creates a second issue, namely, column 9 is of type float (because NaN is float), so 5 becomes 5.0. Once it's a string, that's "5.0", which will fail to match the index values in leader. We can remove the .0, and then this solution will work - but it's a bit of a hack.

With DatasetLabel as:

   Unnamed:0  0  1  7  8    9  10  11  12
0          0  A  J  1  2  5.0 NaN NaN NaN
1          1  B  K  3  4  NaN NaN NaN NaN

And leader as:

   0   1
0  0  11
1  1   8
2  2   5
3  3   9
4  4   8
5  5   6

Then:

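# Columns of DatasetLabel that hold indexes into leader
cols = ["7","8","9","10","11","12"]
# Row by row: cast to str, drop any trailing ".0", then look the labels up in leader
updated = DatasetLabel[cols].apply(
    lambda x: leader.loc[x.astype(str).str.split(".").str[0], 1].values, axis=1)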

updated
     7    8    9  10  11  12
0  8.0  5.0  6.0 NaN NaN NaN
1  9.0  8.0  NaN NaN NaN NaN

Now we can concat the unmodified columns (which we'll call original) with updated:

original_cols = DatasetLabel.columns[~DatasetLabel.columns.isin(cols)]
original = DatasetLabel[original_cols]
pd.concat([original, updated], axis=1)

Output:

   Unnamed:0  0  1    7    8    9  10  11  12
0          0  A  J  8.0  5.0  6.0 NaN NaN NaN
1          1  B  K  9.0  8.0  NaN NaN NaN NaN

Note: It may be clearer to use concat here, but here's another, more compact way of merging original and updated, using assign:

DatasetLabel.assign(**updated)
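
As an alternative to the string conversion above, here is a minimal sketch that uses Series.map, which leaves NaN as NaN without any extra handling. It assumes the sample frames shown earlier, with string column labels in DatasetLabel and leader's first column used as the lookup key; the names lookup and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the frames shown above (an assumption about their exact layout)
leader = pd.DataFrame({0: [0, 1, 2, 3, 4, 5], 1: [11, 8, 5, 9, 8, 6]})
DatasetLabel = pd.DataFrame({
    "Unnamed:0": [0, 1],
    "0": ["A", "B"],
    "1": ["J", "K"],
    "7": [1, 3],
    "8": [2, 4],
    "9": [5.0, np.nan],
    "10": [np.nan, np.nan],
    "11": [np.nan, np.nan],
    "12": [np.nan, np.nan],
})

cols = ["7", "8", "9", "10", "11", "12"]

# Series keyed by leader's first column, holding the values of its second column
lookup = leader.set_index(0)[1]

# Map each index column through the lookup; entries with no match (including NaN) stay NaN
result = DatasetLabel.copy()
result[cols] = DatasetLabel[cols].apply(lambda s: s.map(lookup))
```

Because the lookup is done per column with map, there is no need to split strings or to concatenate the untouched columns back afterwards.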

Here is a special case: when I ran a Python script in the following environment, this error occurred: (stderr) ValueError: Must have equal len keys and value when setting with an iterable

OS: Linux Redhat 8.2

Python version 3.8.5

I changed the following two things, which resolved the problem:

  1. Check the access permissions on the Linux folder and the Python source files; do not rely on root permissions when you are logged in as a normal user. You can use chown and chgrp to change ownership.
  2. Change pandas from a newer version to an older one. After I uninstalled the latest pandas package and installed version 1.1.3, the issue was fixed. You can use the commands below:
# uninstall pandas package
pip uninstall pandas

# re-install the pandas package
pip install pandas==1.1.3
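
Independently of the environment, it may be worth checking the assignment itself. In the question's loop the left-hand side selects whole rows of DatasetLabel while the right-hand side is a single run of values, which is the kind of shape mismatch this message usually points to. The following is a minimal sketch, assuming a reduced version of the sample data and positional lookups into leader as in the question; the names fixed, mask, and positions are illustrative.

```python
import numpy as np
import pandas as pd

# Reduced sample data (an assumption; only the index columns are kept)
leader = pd.DataFrame({0: range(6), 1: [11, 8, 5, 9, 8, 6]})
DatasetLabel = pd.DataFrame({
    "7": [1, 3],
    "8": [2, 4],
    "9": [5.0, np.nan],
    "10": [np.nan, np.nan],
})

fixed = DatasetLabel.copy()
for column in ["7", "8", "9", "10"]:
    mask = fixed[column].notnull()
    if not mask.any():
        continue  # nothing to replace in an all-NaN column
    # Positions to look up, taken only from the non-null cells of this column
    positions = fixed.loc[mask, column].astype(int).to_numpy()
    # Target exactly those cells, so both sides of the assignment have the same length
    fixed.loc[mask, column] = leader.iloc[positions, 1].to_numpy()
```

Selecting with .loc[mask, column] on the left keeps the number of target cells equal to the number of values on the right.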
