My code runs properly in pandas, but not in Modin

Question

When I use pandas, the code works perfectly (but very slowly). When I use Modin and concatenate the DataFrames, I get an error.

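# Here pd is Modin's pandas API (import modin.pandas as pd);
# files and mainpath are defined earlier.
contador = 0
datos = {}  # maps each file name to its DataFrame

for file in range(len(files)):
    usefile = files[file]
    # Progress message (Spanish): "file number X of a total of N files"
    print("Valor Numero :" + str(contador) + " de un total de " + str((len(files))) + " archivos")
    # "<file> exists, adding it to the DataFrame"
    print("Existe " + str(usefile) + " añadiendolo al DataFrame" )
    contador = contador + 1
    ruta = mainpath + "/" + str(usefile)
    df = pd.read_csv(ruta)
    datos[usefile] = df
data = pd.concat(datos.values(), keys=datos.keys() , sort='True')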

I expect a single DataFrame with all the files concatenated from the dict, but I receive the following (in pandas, everything works perfectly):

<ipython-input-4-e5a361476e76> in <module>
     12     df = pd.read_csv(ruta)
     13     datos[usefile] = df
---> 14 data = pd.concat(datos.values(), keys=datos.keys() , sort='True')
     15 

~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
     98         new_idx_labels = {
     99             keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100             for i in range(len(objs))
    101         }
    102         print(new_idx_labels)

~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in <dictcomp>(.0)
     98         new_idx_labels = {
     99             keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100             for i in range(len(objs))
    101         }
    102         print(new_idx_labels)

TypeError: 'dict_keys' object is not subscriptable

Answer 1

This behavior is unintentionally not yet supported in Modin (version 0.4). The implementation assumes that the keys and objs parameters are subscriptable, and a dict_keys view is not.
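To see the root cause without Modin, here is a minimal sketch in plain Python. The build_labels helper is hypothetical: it only imitates the shape of the dict comprehension shown in the traceback, and the file names and lists are made-up stand-ins for the DataFrames and their indexes.

```python
# Made-up stand-ins: each list plays the role of one DataFrame's index.
datos = {"a.csv": [0, 1], "b.csv": [0]}


def build_labels(objs, keys):
    # Hypothetical helper with the same shape as the comprehension in the
    # traceback: it indexes both keys and objs by position.
    return {keys[i]: objs[i] for i in range(len(objs))}


# A dict_keys view supports iteration and len(), but not keys[i].
try:
    build_labels(list(datos.values()), datos.keys())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Converting the view to a list makes positional indexing work.
labels = build_labels(list(datos.values()), list(datos.keys()))

print(error_message)
print(labels)
```

The first call fails because the keys are still a view; the second succeeds once both arguments are lists.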

As a workaround until this is fixed in Modin, change the last line of your code to:

data = pd.concat(list(datos.values()), keys=list(datos.keys()) , sort='True')
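A small end-to-end sketch of the same workaround, under a few assumptions: it imports plain pandas so it runs without Modin installed (swapping the import for modin.pandas should exercise the Modin code path), the file names and column values are invented, and it passes the boolean sort=True rather than the string 'True'.

```python
import pandas as pd  # assumption: replace with `import modin.pandas as pd` to test Modin

# Invented stand-ins for the DataFrames that read_csv would return.
datos = {
    "a.csv": pd.DataFrame({"x": [1, 2]}),
    "b.csv": pd.DataFrame({"x": [3]}),
}

# Pass lists, not dict views, so both arguments can be indexed by position.
data = pd.concat(list(datos.values()), keys=list(datos.keys()), sort=True)

# The outer index level carries the dict keys (here, the file names).
print(data.index.get_level_values(0).tolist())
print(data)
```

The result has a two-level index, so the rows from one file can be selected with data.loc["a.csv"].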

I created an issue on the Modin repo to track this: https://github.com/modin-project/modin/issues/557

1 answers

solution1
1 2019-04-16 07:40:07
