DataFrame.loc returns a dictionary or a DataFrame [Solved] (Cannot handle a non-unique multi-index!)
I have two DataFrames that are read from two almost identical .csv files using pd.read_csv().
When I use .loc[index1] on one of them, it returns a dictionary-like object (a pandas Series), such as:
col1    val1
col2    val2
col3    val3
Name: (index1), dtype: object
But with the other one, I realized it actually returns a DataFrame. Some operations, such as
df1[col1] = df2[col2] + constant
throw errors.
To make it even harder, I'm actually using a MultiIndex. I'm getting this error:
Cannot handle a non-unique multi-index!
I've figured out that .loc returns a DataFrame or a dictionary-like object (a Series) depending on whether the index contains duplicates: a label that matches exactly one row gives a Series, and a label that matches several rows gives a DataFrame. This condition is not explained in the pandas documentation, or I have not found it.
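To see the behaviour in isolation, here is a minimal sketch. The frames, the column names, and the single-level index are made up for illustration (the original data uses a MultiIndex read from .csv files), so treat it as a demonstration under those assumptions rather than a reproduction of the original case.

```python
import pandas as pd

# Hypothetical data: "index1" is an assumed name for the index column.
unique_df = pd.DataFrame(
    {"index1": ["a", "b"], "col1": [1, 2], "col2": [3, 4]}
).set_index("index1")

# Same layout, but the label "a" appears twice.
dup_df = pd.DataFrame(
    {"index1": ["a", "a", "b"], "col1": [1, 10, 2], "col2": [3, 30, 4]}
).set_index("index1")

row_unique = unique_df.loc["a"]  # label matches exactly one row
row_dup = dup_df.loc["a"]        # label matches two rows

print(type(row_unique).__name__)  # Series
print(type(row_dup).__name__)     # DataFrame

# Quick checks for duplicated index labels.
print(unique_df.index.is_unique)
print(dup_df.index.is_unique)
print(dup_df.index.duplicated(keep=False))  # marks every row whose label repeats
```

Checking df.index.is_unique right after setting the index is a cheap way to find out which of the two frames carries the duplicates.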
If the index values are supposed to be unique, try something along these lines:
df.reset_index().drop_duplicates(subset=["index1"]).set_index(["index1"])
or just
df.drop_duplicates(subset=["index1"])
after reading the csv but before setting the index.
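The fix above can be sketched with a two-level index as follows. The frame stands in for the result of pd.read_csv(), and "index1", "index2", and "col1" are assumed names. The second variant, which filters on index.duplicated() after the index is already set, is an alternative I am adding and is not part of the original answer. Both variants keep the first of each group of duplicated rows and silently discard the rest, so they are only appropriate when the extra rows really are redundant.

```python
import pandas as pd

# Hypothetical stand-in for a frame read with pd.read_csv();
# the key ("a", 1) appears twice.
raw = pd.DataFrame(
    {
        "index1": ["a", "a", "b"],
        "index2": [1, 1, 2],
        "col1": [10, 11, 20],
    }
)

keys = ["index1", "index2"]

# Variant 1: drop duplicated keys before setting the index.
deduped = raw.drop_duplicates(subset=keys).set_index(keys)

# Variant 2 (alternative, not from the original answer):
# the index is already set, so filter on duplicated index entries.
indexed = raw.set_index(keys)
deduped_after = indexed[~indexed.index.duplicated(keep="first")]

print(deduped.index.is_unique)
print(deduped_after.index.is_unique)

# With a unique MultiIndex, a full key selects a single row as a Series.
row = deduped.loc[("a", 1)]
print(type(row).__name__)  # Series
```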