Plotly annotations too close to each other (not readable)

I have the following code to create a plot of the loadings after PCA:

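import numpy as np
import plotly.express as px
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# X (features) and customer_data_raw (raw data with 'Churn') are defined elsewhere

# Creating pipeline objects
## PCA
pca = PCA(n_components=2)
## Create ColumnTransformer to encode categorical features and scale only a selected set of features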
categorical_ix = X.select_dtypes(exclude=np.number).columns

features = X.columns

ct = ColumnTransformer([
        ('encoder', OneHotEncoder(), categorical_ix),
        ('scaler', StandardScaler(), ['tenure', 'MonthlyCharges', 'TotalCharges'])
    ], remainder='passthrough')

# Create pipeline
pca_pipe = make_pipeline(ct,
                         pca)

# Fit data to pipeline
pca_result = pca_pipe.fit_transform(X)

loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

fig = px.scatter(pca_result, x=0, y=1, color=customer_data_raw['Churn'])

for i, feature in enumerate(features):
    fig.add_shape(
        type='line',
        x0=0, y0=0,
        x1=loadings[i, 0],
        y1=loadings[i, 1]
    )
    fig.add_annotation(
        x=loadings[i, 0],
        y=loadings[i, 1],
        ax=0, ay=0,
        xanchor="center",
        yanchor="bottom",
        text=feature,
    )
fig.show()

This produces the following output:

[image]

How can I make the labels of the loadings readable?

Edit: There are 19 features in X.

    gender  SeniorCitizen   Partner Dependents  tenure  PhoneService    MultipleLines   InternetService OnlineSecurity  OnlineBackup    DeviceProtection    TechSupport StreamingTV StreamingMovies Contract    PaperlessBilling    PaymentMethod   MonthlyCharges  TotalCharges
customerID                                                                          
7590-VHVEG  Female  0   Yes No  1   No  No phone service    DSL No  Yes No  No  No  No  Month-to-month  Yes Electronic check    29.85   29.85
5575-GNVDE  Male    0   No  No  34  Yes No  DSL Yes No  Yes No  No  No  One year    No  Mailed check    56.95   1889.50
3668-QPYBK  Male    0   No  No  2   Yes No  DSL Yes Yes No  No  No  No  Month-to-month  Yes Mailed check    53.85   108.15
7795-CFOCW  Male    0   No  No  45  No  No phone service    DSL Yes No  Yes Yes No  No  One year    No  Bank transfer (automatic)   42.30   1840.75
9237-HQITU  Female  0   No  No  2   Yes No  Fiber optic No  No  No  No  No  No  Month-to-month  Yes Electronic check    70.70   151.65

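Based on your DataFrame, you have 19 features, and each annotation is placed at the same location as the end of its line because ax and ay are both set to 0.

We can change ax and ay as you loop through the features so that the label positions rotate around the origin, which will hopefully make your annotations easier to tell apart. This is based on converting from polar to Cartesian coordinates using x = r*cos(theta) and y = r*sin(theta), where theta goes through the values 0*360/19, 1*360/19, ..., 18*360/19 degrees. (The code below swaps sin and cos, which only changes the starting angle and the direction of rotation.) We want axref and ayref to refer to the x and y axes, so that ax and ay are read in axis coordinates rather than as the default pixel offsets, and then set r=2 or some value comparable to the scale of your plot. With axis references, ax and ay are absolute positions, so the labels sit on a circle of radius r around the origin.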

from math import sin, cos, pi
r = 2 # this can be modified as needed, and is in units of the axis
theta = 2*pi/len(features)

for i, feature in enumerate(features):
    fig.add_shape(
        type='line',
        x0=0, y0=0,
        x1=loadings[i, 0],
        y1=loadings[i, 1]
    )
    fig.add_annotation(
        x=loadings[i, 0],
        y=loadings[i, 1],
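        ax=r*sin(i*theta),  # label position, in x-axis units
        ay=r*cos(i*theta),  # label position, in y-axis units
        axref="x",          # read ax in x-axis coordinates instead of pixels
        ayref="y",          # read ay in y-axis coordinates instead of pixels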
        xanchor="center",
        yanchor="bottom",
        text=feature,
    )
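
To check the spacing before touching the figure, the label positions can be computed on their own. This is only a minimal sketch of the polar-to-Cartesian step above: `label_positions` is a hypothetical helper name, not part of Plotly, and the count of 19 and r=2.0 are taken from the question and the answer's suggestion.

```python
from math import sin, cos, pi, hypot

def label_positions(n_labels, r=2.0):
    """Return n_labels (ax, ay) points evenly spaced on a circle of radius r.

    Hypothetical helper: it mirrors the ax/ay expressions used in the loop
    above (sin for x, cos for y), so the first label sits at the top of the
    circle and later labels move clockwise.
    """
    theta = 2 * pi / n_labels  # angular step between neighbouring labels
    return [(r * sin(i * theta), r * cos(i * theta)) for i in range(n_labels)]

# 19 features, as in the question; r is in axis units
positions = label_positions(19, r=2.0)

# Print the first few positions to see how far apart neighbouring labels are
for i, (ax, ay) in enumerate(positions[:3]):
    print(i, round(ax, 3), round(ay, 3))
```

If neighbouring labels still overlap, increasing r spreads them further apart, since the distance between adjacent points on the circle grows in proportion to r.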

