Graphical representation of relationships between categorical variables

Question

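I am looking for ideas on how to better illustrate the relationships between categorical variables.

For reproducible data, I have the following: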

t1 <- data.frame(A = c("Apple", "Rose, Apple", "Country"), 
                 B = c("Fruit", "Plant", "Peru, Japan"))

Output

            A           B
1       Apple       Fruit
2 Rose, Apple       Plant
3     Country Peru, Japan

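As you can see, Apple is related to both Fruit and Plant. Is there a good graphical solution that shows the individual variables in color, in a heatmap-like format?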

Answer 1

I would think of something like this:

library(data.table)
library(ggplot2)  # needed for the plot further down

dt <- data.table(type = as.factor(c("Apple", "Rose", "Apple", "Rose", "Apple")),
                 type2 = as.factor(c("Fruit", "Plant", "Plant", "Tree", "Tree")))

First, we have a table with the different combinations:

dt 
    type type2
1: Apple Fruit
2:  Rose Plant
3: Apple Plant
4:  Rose  Tree
5: Apple  Tree
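
The table above is typed in by hand. A long table of the same shape could perhaps be derived from the question's t1 by splitting the comma-separated cells. This is only a sketch: it assumes every item in A on a row relates to every item in B on that row, and split_items is a hypothetical helper name, not something from the original post.

```r
# The question's data: some cells hold several comma-separated items.
t1 <- data.frame(A = c("Apple", "Rose, Apple", "Country"),
                 B = c("Fruit", "Plant", "Peru, Japan"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split one cell on commas and trim whitespace.
split_items <- function(x) trimws(strsplit(x, ",", fixed = TRUE)[[1]])

# For each row, pair every item in A with every item in B (assumption).
pairs_list <- lapply(seq_len(nrow(t1)), function(i) {
  expand.grid(type  = split_items(t1$A[i]),
              type2 = split_items(t1$B[i]),
              stringsAsFactors = FALSE)
})

# Stack the per-row pairs into one long data frame.
long <- do.call(rbind, pairs_list)
long
```

The result has one row per (type, type2) pair, so it can be passed to as.data.table() and summarised the same way as dt.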

Then we compute some statistics (counts and relative percentages):

dt2 <- dt[ , .(count = .N), by = .(type, type2)]

dt2[ , percentage.count := count / sum(count) * 100 , by = "type"]

dt2

    type type2 count percentage.count
1: Apple Fruit     1         33.33333
2:  Rose Plant     1         50.00000
3: Apple Plant     1         33.33333
4:  Rose  Tree     1         50.00000
5: Apple  Tree     1         33.33333

Here we can see that Apple is associated with Fruit 1/3 of the time, with Plant 1/3 of the time, and with Tree 1/3 of the time.

This can be plotted like so:
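
As a cross-check on the percentages above, the same row-wise shares can be computed in base R with table() and prop.table(). This is a minimal sketch that re-enters the answer's example data as plain vectors; the names counts and row_pct are illustrative.

```r
# Same example data as in dt, as plain character vectors.
type  <- c("Apple", "Rose", "Apple", "Rose", "Apple")
type2 <- c("Fruit", "Plant", "Plant", "Tree", "Tree")

# Contingency table: how often each type occurs with each type2.
counts <- table(type, type2)

# Row-wise percentages: each row (each type) sums to 100.
row_pct <- prop.table(counts, margin = 1) * 100
round(row_pct, 1)
```

Unlike dt2, this table also lists combinations that never occur (for example Rose with Fruit) as 0.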

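# dt2 is already aggregated, so map count to y and use geom_col();
# geom_bar() would count the rows of dt2 and ignore the count column.
ggplot(data = dt2,
       aes(x = type, y = count, fill = type2)) +
  geom_col(position = "fill")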

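This is like having a "pie" for each type, showing how many rows share the same type and type2 combination, but it at least gives an idea of which types are more closely related than others.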
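
Since the question asks for a heatmap-style display, the same summary table could also be drawn as coloured tiles. The following is a sketch rather than part of the original answer: it rebuilds dt2 so that it runs on its own, and the colour choices and legend title are arbitrary assumptions.

```r
library(data.table)
library(ggplot2)

# Rebuild the example data and the summary table from the answer.
dt <- data.table(type  = c("Apple", "Rose", "Apple", "Rose", "Apple"),
                 type2 = c("Fruit", "Plant", "Plant", "Tree", "Tree"))
dt2 <- dt[, .(count = .N), by = .(type, type2)]
dt2[, percentage.count := count / sum(count) * 100, by = "type"]

# One tile per (type, type2) pair, shaded by its share within type.
p <- ggplot(dt2, aes(x = type2, y = type, fill = percentage.count)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "lightyellow", high = "darkred",
                      name = "% within type") +
  labs(x = "type2", y = "type")
p
```

Combinations that never occur (such as Rose with Fruit) have no row in dt2, so their tiles are simply left blank.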

Answer 1, dated 2019-01-18 16:25:30
