
My code runs properly in pandas, but not in Modin

When I use pandas, the code works perfectly (but very slowly). When I use Modin and concatenate the DataFrames, I get an error:

contador = 0
df = pd.DataFrame()
data = pd.DataFrame()
datos = {}  # maps each file name to its DataFrame

for file in range(len(files)):
    usefile = files[file]
    print("Valor Numero :" + str(contador) + " de un total de " + str(len(files)) + " archivos")
    print("Existe " + str(usefile) + " añadiendolo al DataFrame")
    contador = contador + 1
    ruta = mainpath + "/" + str(usefile)
    df = pd.read_csv(ruta)
    datos[usefile] = df
data = pd.concat(datos.values(), keys=datos.keys(), sort='True')

I expect a single DataFrame with all the files concatenated from the dict, but instead I get the following error (in pandas, everything works perfectly):

<ipython-input-4-e5a361476e76> in <module>
     12     df = pd.read_csv(ruta)
     13     datos[usefile] = df
---> 14 data = pd.concat(datos.values(), keys=datos.keys() , sort='True')
     15 

~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
     98         new_idx_labels = {
     99             keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100             for i in range(len(objs))
    101         }
    102         print(new_idx_labels)

~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in <dictcomp>(.0)
     98         new_idx_labels = {
     99             keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100             for i in range(len(objs))
    101         }
    102         print(new_idx_labels)

TypeError: 'dict_keys' object is not subscriptable

This behavior is unintentionally not yet supported in Modin (version 0.4), because the implementation assumes that the keys and objs parameters are subscriptable. The dict views returned by datos.keys() and datos.values() can be iterated but not indexed by position, hence the TypeError.
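
To see why this fails, here is a minimal sketch in plain Python (no Modin required) that mimics the indexing pattern shown in the traceback. The helper name build_index_labels and the sample dict are hypothetical, made up for illustration; this is not Modin's actual code.

```python
def build_index_labels(objs, keys):
    # Same indexing pattern as the dict comprehension in the traceback:
    # both arguments are accessed by position, so both must support [i].
    return {keys[i]: objs[i] for i in range(len(objs))}


# Hypothetical stand-in for the dict of DataFrames (file name -> data).
datos = {"a.csv": [1, 2], "b.csv": [3, 4]}

try:
    build_index_labels(datos.values(), datos.keys())
    error_message = None
except TypeError as exc:
    # Dict views support len() and iteration, but not positional indexing.
    error_message = str(exc)

print(error_message)

# Lists are subscriptable, so the same call succeeds.
labels = build_index_labels(list(datos.values()), list(datos.keys()))
print(labels)
```

The first call raises a TypeError of the same kind as the one in the question, while the second one returns a regular dict.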

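Until this is fixed in Modin, you can work around it by changing the last line of your code so that it passes lists instead of dict views. The sort parameter is documented as a boolean, so True is passed here instead of the string 'True' used in the question:

data = pd.concat(list(datos.values()), keys=list(datos.keys()), sort=True)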
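
As a rough illustration of what the workaround produces, the sketch below builds two small in-memory DataFrames instead of reading CSV files (the file names and columns are invented for the example). It uses plain pandas so that it runs without Modin installed; assuming Modin mirrors pandas here, swapping the import for modin.pandas should behave the same way.

```python
import pandas as pd

# Hypothetical stand-ins for the CSV files loaded in the question.
datos = {
    "a.csv": pd.DataFrame({"x": [1, 2], "y": [3, 4]}),
    "b.csv": pd.DataFrame({"y": [5], "z": [6]}),
}

# Pass real lists rather than dict views; sort=True orders the columns
# that are not aligned across the frames.
data = pd.concat(list(datos.values()), keys=list(datos.keys()), sort=True)

# The outer index level holds the dict keys, so the rows that came from
# one file can be selected by its name.
print(data.index.get_level_values(0).unique().tolist())
print(data.loc["a.csv"])
```

Because the dict keys become the outer level of the index, each row of the combined DataFrame stays traceable to the file it came from.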

I created an issue on the Modin repo to track this: https://github.com/modin-project/modin/issues/557
