
In R, find data frame indices across multiple columns using custom sort vectors to control ggplot2 plotting order

I would like to control the order in which the values in v3 (in df below) are plotted with ggplot2. That order is determined by the factor levels of v4.

df <- 
  data.frame(
    v1=c("a","b","c","a"),
    v2=c("z", "x", "x", "y"),
    v3=c(1,2,3,4),
    v4=factor(c("id1", "id2", "id3", "id4")))

require(ggplot2)
ggplot(df, aes(x=v4,y=v3))+
  geom_bar(stat="identity",position="dodge")


Changing the order of v4 in the plot comes down to specifying the levels of v4. If I would like to plot according to v1, it is straightforward to change the levels:

df$v4 <- with(df,factor(v4, levels= v4[order(df$v1)]))
ggplot(df, aes(x=v4,y=v3))+
  geom_bar(stat="identity",position="dodge")


And since both "id1" and "id4" have the v1 value "a" we could choose to resolve this tie by a second vector, say v2 , in the argument to order() :

df$v4 <- with(df,factor(v4, levels= v4[order(df$v1, df$v2)]))
ggplot(df, aes(x=v4,y=v3))+
  geom_bar(stat="identity",position="dodge")


You can set the decreasing argument of order() to reverse the sort order, but how can you specify a custom order to sort by? For example, what if you do not want the alphabetical order of v1 above but rather c > b > a (first c, then b, then a)? Subsetting using match (along the lines of df[match(c("c","b","a"),df$v1),"v4"]) only works if the values in v1 are unique. What I seem to be missing is a "by" argument to order(), like order(df$v1, df$v2, by=c(s1,s2)), where s1 and s2 are vectors giving the order to sort v1 and v2 by (in our case s1 <- c("c", "b", "a")). Basically, I need to find the row indices of a data frame using more than one variable/column (in our df: v1, with ties resolved by v2) according to custom sort vectors (in our df, s1 and s2). How can this be done?
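To make the limitation of the match() approach concrete, here is a minimal sketch using only the v1 and v4 columns from df. The name hits is just an illustrative variable, not part of the original code.

```r
# Reduced copy of the example data (only the columns needed here)
df <- data.frame(
  v1 = c("a", "b", "c", "a"),
  v4 = factor(c("id1", "id2", "id3", "id4")))

# match() returns only the FIRST position of each value in df$v1,
# so the second "a" (row 4) is never picked up
hits <- match(c("c", "b", "a"), df$v1)
hits                          # 3 2 1
as.character(df[hits, "v4"])  # "id3" "id2" "id1"; "id4" is missing
```

With duplicated values in v1, the result therefore has fewer rows than df, which is why it cannot serve as a full ordering.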

I just tried specifying the levels for v1 using indices and successfully got c>b>a as the order in ggplot2. I put this above your df$v4 <- with(df... and then ran the rest of the code unchanged.

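# factor(df$v1) makes this work whether v1 is stored as character or factor;
# the default levels are "a", "b", "c", so indices 3, 2, 1 give c > b > a
df$v1 <- factor(df$v1,
                levels = levels(factor(df$v1))[c(3, 2, 1)])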
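The same idea can probably be extended to both columns without changing v1 or v2 themselves. The following is only a sketch under some assumptions: s1 is the vector from the question, while s2 is a made-up custom order for v2 (the question never spells it out), and idx is an illustrative name. Note that match() is used the other way round here: each value of the column is looked up in the sort vector, so duplicated values are not a problem.

```r
# Data from the question
df <- data.frame(
  v1 = c("a", "b", "c", "a"),
  v2 = c("z", "x", "x", "y"),
  v3 = c(1, 2, 3, 4),
  v4 = factor(c("id1", "id2", "id3", "id4")))

# Custom sort vectors: s1 is from the question, s2 is an assumed example
s1 <- c("c", "b", "a")
s2 <- c("z", "y", "x")

# match(column, sort_vector) gives each row's rank in the custom order;
# order() then sorts by the v1 ranks and breaks ties with the v2 ranks
idx <- order(match(df$v1, s1), match(df$v2, s2))
idx                # 3 2 1 4

# Use the resulting row order as the levels of v4
df$v4 <- factor(df$v4, levels = as.character(df$v4[idx]))
levels(df$v4)      # "id3" "id2" "id1" "id4"
```

After this, the ggplot() call from the question can be rerun unchanged. Values of v1 or v2 that do not appear in the sort vectors get NA from match(), and order() places those rows last by default.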
