I have 10 features in my dataframe. I applied a chi-square test and generated the p-values for all the column pairs in the dataframe. I want to represent the p-values as a cross-grid (a feature-by-feature matrix).
Example: A, B, C are my features, and the p-values are (A,B) = 0.0001, (A,C) = 0.5, (B,C) = 0.0.
So I want to see the result as:
     A       B       C
A    1       0.0001  0.5
B    0.0001  1       0.0
C    0.5     0.0     1
If any other details are needed, please let me know.
Assuming you have the list of features as features = ['A','B','C',...]
and the p-values as p_values = {('A','B'):0.0001,('A','C'):0.5,...}, you can build the grid row by row:
import pandas as pd

p_values = {('A','B'):0.0001,('A','C'):0.5}
features = ['A','B','C']

df = pd.DataFrame(columns=features)
for row in features:
    rowdf = []  # prepare a row for df
    for col in features:
        if row == col:
            rowdf.append(1)  # (A,A) taken as 1
            continue
        try:
            rowdf.append(p_values[(row,col)])  # add the value from dictionary
        except KeyError:
            try:
                rowdf.append(p_values[(col, row)])  # look for pair like (B,A) if (A,B) not found
            except KeyError:  # still not found, append None
                rowdf.append(None)
    df.loc[len(df)] = rowdf  # write row in df

df.index = features  # to make row names as A,B,C ...
print(df)
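As an alternative, here is a shorter sketch of the same idea that fills a square DataFrame directly instead of appending rows. The helper name p_value_grid and its diagonal parameter are hypothetical (not part of pandas), and the sketch assumes every feature named in the dictionary keys also appears in features. Pairs missing from the dictionary are left as NaN.

```python
import pandas as pd

def p_value_grid(features, p_values, diagonal=1.0):
    """Build a symmetric feature-by-feature grid of p-values (hypothetical helper)."""
    # Start from an all-NaN square frame so missing pairs stay visibly empty.
    grid = pd.DataFrame(float("nan"), index=features, columns=features)
    for (a, b), p in p_values.items():
        # Write each pair in both orientations to keep the grid symmetric.
        grid.loc[a, b] = p
        grid.loc[b, a] = p
    for f in features:
        grid.loc[f, f] = diagonal  # (A,A) taken as 1, as in the loop above
    return grid

features = ["A", "B", "C"]
p_values = {("A", "B"): 0.0001, ("A", "C"): 0.5, ("B", "C"): 0.0}
grid = p_value_grid(features, p_values)
print(grid)
```

Because each value is written to both (a, b) and (b, a), there is no need for the nested try/except lookup, and the dictionary only has to hold each pair once.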