Not able to compare two dataframes
I have two DataFrames, df1 and df2, with the same structure. I want to find the common rows between them using df1.merge(df2), but there is one row I am having an issue with:
>>> df2
reference_period analyzed_domain account is_misc total_estimated_visits total_estimated_monthly_unique_visitors total_estimated_visit_duration total_estimated_pageviews estimated_deduplicated_audience
0 2017-11-01 abc xyz 0 1000 278.0 5788.0 80159.0 0.0
>>> df1=df1.head(1)
>>> df1
reference_period analyzed_domain account is_misc total_estimated_visits total_estimated_monthly_unique_visitors total_estimated_visit_duration total_estimated_pageviews estimated_deduplicated_audience
0 2017-11-01 abc xyz 0 1000 278.0 5788.0 80159.0 0.0
>>> df1==df2
reference_period analyzed_domain account is_misc total_estimated_visits total_estimated_monthly_unique_visitors total_estimated_visit_duration total_estimated_pageviews estimated_deduplicated_audience
0 True True True False True True True True True
>>> df1.dtypes
reference_period datetime64[ns]
analyzed_domain object
account object
is_misc object
total_estimated_visits object
total_estimated_monthly_unique_visitors float64
total_estimated_visit_duration float64
total_estimated_pageviews float64
estimated_deduplicated_audience float64
dtype: object
>>> df2.dtypes
reference_period datetime64[ns]
analyzed_domain object
account object
is_misc object
total_estimated_visits object
total_estimated_monthly_unique_visitors float64
total_estimated_visit_duration float64
total_estimated_pageviews float64
estimated_deduplicated_audience float64
dtype: object
I am not sure why Python does not treat the is_misc column as equal. Could someone please help? Thanks.
A pandas column with dtype object holds either strings or mixed values, so it can contain text or a mix of numeric and non-numeric values. In either df1 or df2, the 0 value in the is_misc column is most likely a string. You can convert both columns to either string or int and then run the comparison again, which should then return True. Try this:
df1['is_misc'] = df1['is_misc'].astype(str).astype(int)
df2['is_misc'] = df2['is_misc'].astype(str).astype(int)
And then compare again:
print(df1 == df2)
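To confirm which frame holds the string, it may help to look at the Python type of each element, since dtype object hides that. Below is a minimal sketch using made-up one-row frames. The values, and the assumption that df2 is the one holding the string '0', are illustrative and not taken from the question:

```python
import pandas as pd

# Hypothetical data: both frames print identically, but the element types differ.
df1 = pd.DataFrame({"account": ["xyz"], "is_misc": pd.Series([0], dtype=object)})
df2 = pd.DataFrame({"account": ["xyz"], "is_misc": pd.Series(["0"], dtype=object)})

# Both columns report dtype object, so dtypes alone cannot reveal the mismatch.
same_dtype = df1["is_misc"].dtype == df2["is_misc"].dtype

# Inspect the actual Python type of each element.
types1 = df1["is_misc"].map(type).tolist()
types2 = df2["is_misc"].map(type).tolist()

# Element-wise comparison before normalizing: 0 == "0" is False.
before = bool((df1["is_misc"] == df2["is_misc"]).all())

# Normalize both columns to int, then compare and merge.
for df in (df1, df2):
    df["is_misc"] = df["is_misc"].astype(str).astype(int)

after = bool((df1["is_misc"] == df2["is_misc"]).all())
common = df1.merge(df2)  # inner join on all shared columns

print(types1, types2, before, after, len(common))
```

Once both columns hold real integers, df1.merge(df2) should also pick up the row, because merge matches on values in the same way.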
Gustav Rasmussen's answer will work.
I had the same problem, but I had a string with decimals (e.g. '5.0') in the first DataFrame and an integer (e.g. 5) in the second DataFrame.
I solved it the following way:
df1['column'] = df1['column'].astype(float).astype(int)
df2['column'] = df2['column'].astype(float).astype(int)
and then compared:
df1==df2
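The intermediate float step matters because int('5.0') raises ValueError in Python, and casting such strings directly with astype(int) is expected to fail the same way. Here is a small sketch with made-up values. pd.to_numeric is shown as an alternative and is not part of the answer above:

```python
import pandas as pd

# Hypothetical data mirroring the case described above.
df1 = pd.DataFrame({"column": pd.Series(["5.0", "7.0"], dtype=object)})  # decimal strings
df2 = pd.DataFrame({"column": [5, 7]})                                   # plain integers

# Converting "5.0" straight to int is expected to raise ValueError.
try:
    df1["column"].astype(int)
    direct_cast_failed = False
except ValueError:
    direct_cast_failed = True

# Going through float first works; note that it truncates any fractional part.
via_float = df1["column"].astype(float).astype(int)

# Alternative: pd.to_numeric parses the strings into floats first.
via_to_numeric = pd.to_numeric(df1["column"]).astype(int)

matches = bool((via_float == df2["column"]).all())
print(direct_cast_failed, via_float.tolist(), matches)
```

Keep in mind that astype(int) after astype(float) silently drops fractional parts, so this approach only suits columns whose values are whole numbers.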