ValueError: Length of values does not match length of index

Question

import pandas as pd
dict1 = {'id_game': [112, 113, 114], 'game_name' : ['x','z','y'],'id_category':[1,2,3], 'id_players':[[588,589,590],[589],[588,589]]}
dict2 = {'id_player': [588, 589, 590],'player_name' : ['fff','aaa','ccc'] ,'indication':['mmm x ggg sdg y', 'uuu x fdb y kfnkjq z', 'fffre x']}
game_df = pd.DataFrame(dict1)
player_df = pd.DataFrame(dict2)

Here is a sample of my data. I am looking for a way to add a categories_id column to the second dataframe, player_df, based on the relation between game_df['id_players'] and player_df['id_player'], or between game_df['game_name'] and player_df['indication'].

In the following script I used the game_name and indication values:

new_list = []
for i in range(len(game_df)):
    for j in range(len(player_df)):
        if game_df['game_name'][i] in player_df['indication'][j]:
            new_list.append(game_df['id_category'][i])
            print(new_list)
            
player_df['categories_id'] = new_list

ERROR:

--> 747         raise ValueError(
    748             "Length of values "
    749             f"({len(data)}) "

ValueError: Length of values (6) does not match length of index (3)

Answer 1

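Your code can be fixed by adding break after print(new_list), with the same indentation. Without it, the loop appends once for every game/player pair that matches, which gives 6 values for a 3-row dataframe; with it, the loop appends at most once per game.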

...
if game_df['game_name'][i] in player_df['indication'][j]:
    new_list.append(game_df['id_category'][i])
    print(new_list)
    break
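To see where the six values come from, here is a small self-contained sketch that runs the same nested loop with and without the early exit. The helper name collect and its stop_at_first_match flag are made up for illustration; the data is copied from the question.

```python
import pandas as pd

game_df = pd.DataFrame({
    "id_game": [112, 113, 114],
    "game_name": ["x", "z", "y"],
    "id_category": [1, 2, 3],
    "id_players": [[588, 589, 590], [589], [588, 589]],
})
player_df = pd.DataFrame({
    "id_player": [588, 589, 590],
    "player_name": ["fff", "aaa", "ccc"],
    "indication": ["mmm x ggg sdg y", "uuu x fdb y kfnkjq z", "fffre x"],
})


def collect(stop_at_first_match):
    """Hypothetical helper: the question's loop, with an optional break."""
    values = []
    for name, category in zip(game_df["game_name"], game_df["id_category"]):
        for indication in player_df["indication"]:
            if name in indication:
                values.append(category)
                if stop_at_first_match:
                    break  # stop scanning players once this game has matched
    return values


without_break = collect(stop_at_first_match=False)
with_break = collect(stop_at_first_match=True)

# One entry per matching (game, player) pair versus at most one per game
print(len(without_break), len(with_break))
```

Note that with the break the list has one entry per game, not per player. It fits player_df here only because both frames have three rows and every game name occurs in at least one indication.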

That being said, iterating over dataframes is discouraged because it is slow and gets unwieldy very quickly. The canonical way to approach problems like this is to merge the dataframes on the player ids, i.e. explode the lists in id_players into individual rows,

>>> game_df = game_df.explode("id_players").rename(columns={"id_players": "id_player"})
>>> game_df
   id_game game_name  id_category id_player
0      112         x            1       588
0      112         x            1       589
0      112         x            1       590
1      113         z            2       589
2      114         y            3       588
2      114         y            3       589

so you can .merge it with player_df,

>>> df = game_df.merge(player_df, on="id_player")
>>> df
   id_game game_name  id_category id_player player_name            indication
0      112         x            1       588         fff       mmm x ggg sdg y
1      114         y            3       588         fff       mmm x ggg sdg y
2      112         x            1       589         aaa  uuu x fdb y kfnkjq z
3      113         z            2       589         aaa  uuu x fdb y kfnkjq z
4      114         y            3       589         aaa  uuu x fdb y kfnkjq z
5      112         x            1       590         ccc               fffre x

That makes analyses rather straightforward. For example, checking whether game_name is in indication becomes

df.apply(lambda row: row.game_name in row.indication, axis=1)

though it returns True for every row, so I'm not sure that's actually what you want.
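If the goal is a categories_id column on player_df, one possible way to get there from the merged frame is to group by player and collect the category ids. This is only a sketch: it assumes you want one list of categories per player (the question does not spell that out), and the names exploded, merged and categories are picked for illustration. It starts from fresh copies of the sample data so it runs on its own.

```python
import pandas as pd

game_df = pd.DataFrame({
    "id_game": [112, 113, 114],
    "game_name": ["x", "z", "y"],
    "id_category": [1, 2, 3],
    "id_players": [[588, 589, 590], [589], [588, 589]],
})
player_df = pd.DataFrame({
    "id_player": [588, 589, 590],
    "player_name": ["fff", "aaa", "ccc"],
    "indication": ["mmm x ggg sdg y", "uuu x fdb y kfnkjq z", "fffre x"],
})

# One row per (game, player) pair
exploded = game_df.explode("id_players").rename(columns={"id_players": "id_player"})
# explode leaves an object-dtype column; cast it so both merge keys are int64
# (assumes no empty player lists, which would turn into NaN)
exploded["id_player"] = exploded["id_player"].astype("int64")

merged = exploded.merge(player_df, on="id_player")

# Collect every category a player appears in, then align it back by id_player
categories = merged.groupby("id_player")["id_category"].agg(list)
player_df["categories_id"] = player_df["id_player"].map(categories)

print(player_df[["id_player", "categories_id"]])
```

Because the new column is built with map on id_player, it always has exactly one value per row of player_df, so the length mismatch cannot occur. A player who appears in no game would get NaN.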

1 answers

solution1
0 2022-08-04 14:30:25
