I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({"true_key": ["Astral", "Blob", "Blob", "Cat", "Astral"], "true_key2": ["Japan", "Astral", "Blob", "quics", "Cat"]})
How do I calculate the percentage of values present in true_key that are present in true_key2 and vice versa?
As we can see, 100% of the unique true_key values are present in true_key2, and 60% of the unique true_key2 values are present in true_key.
Is there a way to compute this in Python?
Thanks in advance.
One way is to take the set intersection of the two columns and divide its length by the number of unique values in each column:
mutual_len = len(set(df['true_key']).intersection(set(df['true_key2'])))
mutual_len / df['true_key'].nunique(), mutual_len / df['true_key2'].nunique()
(1.0, 0.6)
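The set-based approach above counts distinct values. If the percentage should instead be taken over rows, so that duplicates count each time they appear, a possible alternative is `Series.isin` followed by `mean`. This is only a sketch built on the sample data from the question; the names `share_key_in_key2` and `share_key2_in_key` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "true_key": ["Astral", "Blob", "Blob", "Cat", "Astral"],
    "true_key2": ["Japan", "Astral", "Blob", "quics", "Cat"],
})

# isin() returns a boolean Series marking rows whose value appears
# anywhere in the other column; the mean of booleans is the fraction
# of True entries, i.e. the share of matching rows.
share_key_in_key2 = df["true_key"].isin(df["true_key2"]).mean()
share_key2_in_key = df["true_key2"].isin(df["true_key"]).mean()

print(share_key_in_key2, share_key2_in_key)
```

On this sample both approaches give 1.0 and 0.6, but they can differ when a column contains repeated values, since the set version counts each distinct value once while the `isin` version counts every row.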