r - ggplot multiple line plots for each unique instance over time

Question

Problem

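I am plotting a bunch of line graphs on top of one another, but I only want to color about 10 of them specifically and have them all plotted amongst each other (to visualize how my "targets" move over time while still being able to see the mass of other objects behind them). So there might be 100 line graphs over time, but I want to color 5 or 10 of them so I can discuss them in relation to the trend of the other 90 grayscale ones.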

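The following post has a good image of what I want to replicate, with a bit more meat on the bones, except that I want lots of lines in grayscale behind those 3, with those 3 being the highlighted cities I want to see in the foreground.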

My original data is in the following format:

# The unique identifier is a City-State combo, 
# there can be the same cities in 1 state or many. 
# Each state's year ranges from 1:35, but may not have
# all of the values available to us, but some are complete.

r1 <- c("city1" , "state1" , "year" , "population" , rnorm(11) , "2")
r2 <- c("city1" , "state2" , "year" , "population" , rnorm(11) , "3")
r3 <- c("city2" , "state1" , "year" , "population" , rnorm(11) , "2")
r4 <- c("city3" , "state2" , "year" , "population" , rnorm(11) , "1")
r5 <- c("city3" , "state2" , "year" , "population" , rnorm(11) , "7")

df <- data.frame(matrix(nrow = 5, ncol = 16))
df[1,] <- r1
df[2,] <- r2
df[3,] <- r3
df[4,] <- r4
df[5,] <- r5

names(df) <- c("City", "State", "Year", "Population", 1:11, "Cluster")

head(df)


#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# City | State | Year | Population  | ... 11 Variables ... | Cluster    #
# ----------------------------------------------------------------------#
# Each row is a city instance with these features ...                   #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

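But I thought it might be better to look at the data a different way, so I also have it in the following format. I'm not sure which is better for this problem.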

cols <- 1:35  # 35 year columns, matching ncol = 35 below
rows <- c("unique_city1", "unique_city2","unique_city3","unique_city4","unique_city5")
r1 <- rnorm(35)
r2 <- rnorm(35)
r3 <- rnorm(35)
r4 <- rnorm(35)
r5 <- rnorm(35)

df <- data.frame(matrix(nrow = 5, ncol = 35))
df[1,] <- r1
df[2,] <- r2
df[3,] <- r3
df[4,] <- r4
df[5,] <- r5

names(df) <- cols
row.names(df) <- rows

head(df)


#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
#                       Year1 Year2 .......... Year 35  #
# UniqueCityState1       VAL    NA  ..........  VAL     #
# UniqueCityState2       VAL    VAL ..........  NA      #
#         .                                             #
#         .                                             #
#         .                                             #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

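Prior Attempts

I have tried using melt to get the data into a format that ggplot might accept so I can plot each of these cities over time, but nothing has seemed to work. I have also tried writing my own function to loop through each of my unique city-state combinations and stack ggplots, a topic I found a fair amount of material on, but still nothing. I'm not sure how to find each of these unique city-state pairs and plot them over time using their cluster value, or any numeric value for that matter. I'm really not sure.

Thoughts?

EDIT: More info on data structure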

> head(df)
            city state year population     stat1 stat2      stat3     stat4      stat5
1       BESSEMER     1    1      31509 0.3808436     0 0.63473928 2.8563268  9.5528262
2     BIRMINGHAM     1    1     282081 0.3119671     0 0.97489728 6.0266377  9.1321287
3 MOUNTAIN BROOK     1    1      18221 0.0000000     0 0.05488173 0.2744086  0.4390538
4      FAIRFIELD     1    1      12978 0.1541069     0 0.46232085 3.0050855  9.8628448
5     GARDENDALE     1    1       7828 0.2554931     0 0.00000000 0.7664793  1.2774655
6          LEEDS     1    1       7865 0.2542912     0 0.12714558 1.5257470 13.3502861
      stat6    stat6     stat7 stat8     stat9 cluster
1 26.976419 53.54026  5.712654     0 0.2856327       9
2 35.670605 65.49183 11.982374     0 0.4963113       9
3  6.311399 21.40387  1.426925     0 0.1097635       3
4 21.266759 68.11527 11.480968     0 1.0787487       9
5  6.770567 23.24987  3.960143     0 0.0000000       3
6 24.157661 39.79657  4.450095     0 1.5257470      15
        agg
1  99.93970
2 130.08675
3  30.02031
4 115.42611
5  36.28002
6  85.18754

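Ultimately I need this in a form with the unique cities as row.names and 1:35 as col.names, where the value in each cell is agg if that year is present and NA otherwise. Once again, I'm sure this is possible; I just can't get to a good solution, and the way I am doing it now is unstable.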
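The wide layout described above can be sketched in base R. This is a minimal sketch, not the asker's actual pipeline: the data frame `long`, its values, and the `key` column are hypothetical stand-ins that only mirror the column names shown in the `head(df)` output.

```r
# Hypothetical long-format data with the same column names as head(df) above
long <- data.frame(
  city  = c("BESSEMER", "BESSEMER", "LEEDS", "LEEDS", "LEEDS"),
  state = c(1, 1, 1, 1, 1),
  year  = c(1, 3, 1, 2, 3),
  agg   = c(99.9, 101.2, 85.2, 80.1, 79.5)
)

# One key per city-state pair, since a city name can repeat across states
long$key <- paste(long$city, long$state, sep = "_")

# Cross-tabulate agg by key and year; forcing the year levels to 1:35
# creates a column for every year, and empty cells are filled with NA
wide <- tapply(
  long$agg,
  list(long$key, factor(long$year, levels = 1:35)),
  function(v) v[1]  # assumes at most one row per key and year
)
wide <- as.data.frame(wide)
```

Here `wide` has one row per city-state key and 35 year columns. If a key can have several rows for the same year, the `function(v) v[1]` step would need to be replaced by an explicit aggregate such as `mean`.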

Answer 1

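If I understand your question correctly, you want to plot all lines in one color and then plot a few lines in several different colors. You can use ggplot2 and call geom_line twice on two data frames. The first call plots all the city data without mapping the lines to color. The second call plots only the subset of target cities and maps those lines to color. You will need to reorganize your original data frame and subset it for the target cities. In the code below, I used tidyr and dplyr to process the data frames.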

### Set.seed to improve reproducibility
set.seed(123)

### Load package
library(tidyr)
library(dplyr)
library(ggplot2)

### Prepare example data frame 
r1 <- rnorm(35)
r2 <- rnorm(35)
r3 <- rnorm(35)
r4 <- rnorm(35)
r5 <- rnorm(35)

df <- data.frame(matrix(nrow = 5, ncol = 35))
df[1,] <- r1
df[2,] <- r2
df[3,] <- r3
df[4,] <- r4
df[5,] <- r5 

names(df) <- 1:35

df <- df %>% mutate(City = 1:5)

### Reorganize the data for plotting
df2 <- df %>%
  gather(Year, Value, -City) %>%
  mutate(Year = as.numeric(Year))

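The gather function takes df as its first argument. It creates a key column named Year, which stores the year numbers. The year numbers are the column names of every column in the df data frame except the City column. gather also creates a column named Value, which stores all the numeric values from every column in df except City. Finally, the City column is not involved in this process, so -City tells gather "do not transform the City column".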
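In tidyr 1.0.0 and later, gather is superseded by pivot_longer. The same reshaping can be sketched as follows, assuming a recent tidyr is installed; the tiny `wide` data frame and its values are made up for illustration.

```r
library(tidyr)

# Hypothetical wide data: one row per city, one column per year
wide <- data.frame(
  City = 1:2,
  `1`  = c(0.5, 1.5),
  `2`  = c(0.7, 1.1),
  check.names = FALSE  # keep "1" and "2" as column names
)

# Every column except City becomes a (Year, Value) pair
long <- pivot_longer(wide, cols = -City,
                     names_to = "Year", values_to = "Value")

# Column names are character, so convert Year back to a number
long$Year <- as.numeric(long$Year)
```

The result has one row per city and year, which is the shape that geom_line expects when lines are grouped by City.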

### Subset df2, select the city of interest
df3 <- df2 %>%
  # In this example, assuming that City 2 and City 3 are of interest
  filter(City %in% c(2, 3))

### Plot the data
ggplot(data = df2, aes(x = Year, y = Value, group = factor(City))) +
  # Plot all city data here in gray lines
  geom_line(size = 1, color = "gray") +
  # Plot target city data with colors
  geom_line(data = df3, 
            aes(x = Year, y = Value, group = City, color = factor(City)),
            size = 2)

The resulting plot can be seen here: https://dl.dropboxusercontent.com/u/23652366/example_plot.png
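The same two-layer idea can be applied to data where a line is identified by a city-state pair rather than by a single City column. The sketch below is only an illustration: the data frame `long`, the `key` column, and the `targets` vector are hypothetical, and `linewidth` assumes ggplot2 3.4.0 or later (older versions use `size` for line thickness, as in the code above).

```r
library(ggplot2)

# Hypothetical long-format data: one row per city, state and year
set.seed(1)
long <- expand.grid(city = c("A", "B", "C"), state = c(1, 2), year = 1:10)
long$agg <- rnorm(nrow(long))

# A city name can repeat across states, so build one key per city-state pair
long$key <- interaction(long$city, long$state, sep = "_")

# Hypothetical targets to highlight in the foreground
targets <- c("A_1", "C_2")
highlight <- long[long$key %in% targets, ]

p <- ggplot(long, aes(x = year, y = agg, group = key)) +
  # Background layer: every city-state pair in gray
  geom_line(color = "gray") +
  # Foreground layer: only the targets, mapped to color
  geom_line(data = highlight, aes(color = key), linewidth = 1.5)
```

Printing `p` draws the plot. Because the colored layer is added last, the highlighted lines are drawn on top of the gray ones.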
