When I use pandas, the code works perfectly (but very slowly). When I use Modin instead and concatenate the DataFrames, I get an error:
import modin.pandas as pd

# files and mainpath are defined earlier in the notebook
datos = {}  # maps each file name to its DataFrame
for contador, usefile in enumerate(files):
    print("File number " + str(contador) + " of a total of " + str(len(files)) + " files")
    print("Found " + str(usefile) + ", adding it to the DataFrame")
    ruta = mainpath + "/" + str(usefile)
    df = pd.read_csv(ruta)
    datos[usefile] = df
# This is the call that raises the TypeError under Modin
data = pd.concat(datos.values(), keys=datos.keys() , sort='True')
<ipython-input-4-e5a361476e76> in <module>
12 df = pd.read_csv(ruta)
13 datos[usefile] = df
---> 14 data = pd.concat(datos.values(), keys=datos.keys() , sort='True')
15
~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
98 new_idx_labels = {
99 keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100 for i in range(len(objs))
101 }
102 print(new_idx_labels)
~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in <dictcomp>(.0)
98 new_idx_labels = {
99 keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100 for i in range(len(objs))
101 }
102 print(new_idx_labels)
TypeError: 'dict_keys' object is not subscriptable
This behavior is unintentionally not yet supported in Modin (version 0.4): its concat implementation assumes that the keys and objs parameters are subscriptable, and dict views such as datos.keys() and datos.values() are not.
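The underlying Python behavior can be reproduced without Modin. The following is a minimal sketch, assuming a small stand-in dictionary (its values are placeholders, not real DataFrames), that indexes a dict view the same way the dict comprehension in the traceback does:

```python
# Stand-in for the dictionary of DataFrames; the values are placeholders.
datos = {"a.csv": 1, "b.csv": 2}

keys = datos.keys()  # a dict view: iterable and sized, but not indexable

try:
    keys[0]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Converting the views to lists makes positional access work.
keys_list = list(datos.keys())
objs_list = list(datos.values())
labels = {keys_list[i]: objs_list[i] for i in range(len(objs_list))}
print(labels)
```

Dict views support iteration and len(), which is presumably why the same call works in plain pandas, but they do not support indexing by position.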
As a workaround until this is fixed in Modin, change the last line of your code so that it passes lists (sort is also written here as the boolean True rather than the string 'True'):
data = pd.concat(list(datos.values()), keys=list(datos.keys()), sort=True)
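To see what the list-based call produces, here is a small sketch using plain pandas and two hypothetical in-memory frames in place of the CSV files (the file names and column values are made up for illustration). The same call shape is what the workaround passes to Modin:

```python
import pandas as pd

# Hypothetical frames standing in for the files read with pd.read_csv.
datos = {
    "a.csv": pd.DataFrame({"x": [1, 2]}),
    "b.csv": pd.DataFrame({"x": [3], "y": [4]}),
}

# Lists are subscriptable, so concat can index into both arguments.
data = pd.concat(list(datos.values()), keys=list(datos.keys()), sort=True)
print(data)
```

The keys become the outer level of the resulting index, so the rows that came from one file can be selected with data.loc["b.csv"].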
I created an issue on the Modin repo to track this: https://github.com/modin-project/modin/issues/557