简体   繁体   中英

print data.frame column and color by type

In my Knitted document, I am trying to print a dataframe's column. Just to aid the visualization, I would like to change the output color to change based on the value of another column. I have a simple example as follows.

date_vector <- rep(NA, 10)
type_vector <- rep(NA, 10)
types <- c("A", "B")
CDate <- Sys.Date()
date_vector[1] <- as.character(CDate)
type_vector[1] <- sample(types, size = 1)
for (i in 2:10) {
  CDate <- as.Date(CDate) + rexp(n = 1, rate = 1/5)
  date_vector[i] <- as.character(CDate)
  type_vector[i] <- sample(types, size = 1)
}

test_df <- data.frame(Date=date_vector, Type=type_vector)

When I print test_df$Date , I see the following

date_vector
[1] "2016-01-06" "2016-01-07" "2016-01-22" "2016-01-28" "2016-01-29" "2016-02-01" "2016-02-04"
[8] "2016-02-12" "2016-02-13" "2016-02-15"

Instead, would like to see the following

在此输入图像描述

Since the type of the entries was as follows

type_vector
[1] "A" "A" "B" "B" "A" "A" "B" "A" "B" "A"

So blue represents dates with type A and green represents dates with type B .

This answer is more general than the question. The question asks for a way to color one column of a data frame depending on another column. This answer addresses the more general case of highlighting elements in a vector depending on a second, logical vector indicating which elements to highlight.


In principle, this is quite trivial: Print a vector, highlighting the elements indicated by another, logical vector. Highlighting x can be as simple as wrapping it in \\\\textcolor{blue}{x} or \\\\emph{x} .

In practice, it's not that simple … print(x) does a lot of useful things: It arranges the data nicely in columns, adds quotes around character data, wraps the output to respect getOption("width) , adds the index of the first element to each line of output, and so on. The problem is, we cannot use print to print the highlighted data because print escapes the backslashes in \\\\textcolor . The standard solution to this issue is to use cat instead of print . However, cat does not apply any of the nice formattings listed above.

So the challenge is to write a function that reproduces some/the required features of print . This is a quite involved task, so I limit myself to the following main features:

  • Total line width <= getOption("width") .
  • Automatically add quotes around non-numeric and non-logical values (if quote is not set).
  • Add the index of the first element to each line of output (if printIndex = TRUE ).
  • Apply rounding to numeric input ( digits ).

Plus, these two highlighting-features:

  • Wrap elements of x indicated by condition in a "highlighting pattern"
  • Do not take the highlighting pattern into account when calculating the line width. This assumes that highlighting adds only markup but no visible output.

Note that this function lacks important features of print like handling missing values. Besides, it converts the input x to character (via as.character ). The result of this might be different than with print because the S3 methods ( print.* ) corresponding to the input class are not used at all.

printHighlighted <- function(x, condition = rep(FALSE, length(x)), highlight = "\\emph{%s}", printIndex = TRUE, width = getOption("width"), digits = getOption("digits"), quote = NULL) {

  stopifnot(length(x) == length(condition))
  stopifnot(missing(digits) || (!missing(digits) && is.numeric(x))) # Raise error when input is non-numeric but "digits" supplied.

  if (missing(quote)) {
    if (is.numeric(x) || is.logical(x)) {
      quote <- FALSE
    } else {
      quote <- TRUE
    }
  }

  nquotes <- 0

  if (!printIndex) {
    currentLineIndex <- ""
  }

  if (is.numeric(x)) {
    x <- round(x, digits = digits)
  }

  fitsInLine <- function(x, elementsCurrentLine, currentLineIndex, nquotes, width) {
    return(sum(nchar(x[elementsCurrentLine])) + # total width of elements in current line
             nchar(currentLineIndex) + # width of the index of the first element (if shown)
             sum(elementsCurrentLine) - 1 + # width of spaces between elements
             nquotes <= # width of quotes added around elements
             width)
  }

  x <- as.character(x)
  elementsCurrentLine <- rep(FALSE, times = length(x))


  for (i in seq_along(x)) {

    if (!any(elementsCurrentLine) && printIndex) { # this is a new line AND show index
      currentLineIndex <- sprintf("[%s] ", i)
    }

    elementsCurrentLine[i] <- TRUE # Add element i to current line. Each line holds at least one element. Therefore, if i is the first element of this line, add it regardless of line width. If there already are elements in the line, the previous loop iteration checked that this element will fit.

    if (i < length(x)) { # not the last element

      # check whether next element will fit in this line
      elementsCurrentLineTest <- elementsCurrentLine
      elementsCurrentLineTest[i + 1] <- TRUE

      if (quote) {
        nquotes <- sum(elementsCurrentLineTest) * 2
      }

      if (fitsInLine(x, elementsCurrentLineTest, currentLineIndex, nquotes, width)) {
        next # Next element will fit; do not print yet.
      }
    }

    # Next element won't fit in current line. Print and start a new line.

    # print
    toPrint <- x[elementsCurrentLine]
    toMarkup <- condition[elementsCurrentLine]

    toPrint[toMarkup] <- sprintf(fmt = highlight, toPrint[toMarkup]) # add highlighting

    if (quote) {
      toPrint <- sprintf('"%s"', toPrint)
    }

    cat(currentLineIndex)
    cat(toPrint)
    cat("\n")

    # clear line
    elementsCurrentLine <- rep(FALSE, times = length(x))
  }
}

To use this function with knitr , the chunk option results = "asis" must be used because otherwise the output is wrapped in a verbatim environment where the markup responsible for the highlighting is displayed instead of used .

Finally, to reproduce the look of normal chunks, wrap the whole chunk in

\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}
\begin{kframe}
\begin{alltt}
<<your-chunk>>=
printHighlighted(...)
@
\end{alltt}
\end{kframe}
\end{knitrout}

Example

To save some space, the example assumes that the function definition of printHighlighted is available in a file printHighlighted.R .

\documentclass{article}
\begin{document}

Some text ....

\begin{knitrout}\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}\begin{alltt}
<<results = "asis", echo = FALSE>>=
source("printHighlighted.R")
data <- seq(from = as.Date("2015-01-15"), by = "day", length.out = 100)
cond <- rep(FALSE, 100)
cond[c(3, 55)] <- TRUE

printHighlighted(x = data, condition = cond, highlight = "\\textcolor{blue}{%s}", width = 60)
@
\end{alltt}\end{kframe}\end{knitrout}

Some text ....

\end{document}

产量


This turned out to be quite lenghty ... if someone thinks this was an overkill for such a simple question, I'd love to see shorter solutions.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM