Is there a simple function (preferably user-written, or found in base R) that takes any number of vectors and produces a dataframe whilst retaining the vectors' data types and using the vector variables' names as the column names?
Inputs (vectors)
> var_a # character
[1] "a" "b" "c"
> var_b # numeric
[1] 1 3 4
> var_c # factor
[1] red black black
Levels: black red
Desired output
var_a var_b var_c
1 a 1 red
2 b 3 black
3 c 4 black
where the classes are
sapply(my_dataframe, class)
# var_a var_b var_c
#"character" "numeric" "factor"
cbind
Using cbind will produce a matrix (with a single data type), so this method does not maintain the vectors' original data types (it changes all columns to character)
first_method <- cbind(var_a, var_b, var_c)
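To make the coercion concrete, here is a small sketch that rebuilds the three vectors from the question (the values are copied from the printed inputs above) and inspects what cbind returns. Note that the factor is reduced to its internal integer codes before everything is stored as text, so the labels "red" and "black" are lost as well.

```r
# Example inputs matching the vectors printed in the question
var_a <- c("a", "b", "c")
var_b <- c(1, 3, 4)
var_c <- factor(c("red", "black", "black"))

m <- cbind(var_a, var_b, var_c)

is.matrix(m)   # TRUE: the result is a matrix, not a data frame
typeof(m)      # "character": a matrix holds a single type
m[, "var_c"]   # "2" "1" "1": the factor's integer codes, stored as text
```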
do.call
(similar to here) In this case the data types are lost, and so are the names of the vector variables
library(magrittr) # provides the %>% pipe used below
ls <- list(var_a, var_b, var_c)
second_method <- data.frame(do.call(cbind, ls))
second_method %>% sapply(class)
# X1 X2 X3
# "factor" "factor" "factor"
data.frame
This method gets close (it retains the vector names as column names in the dataframe), but unfortunately it converts character data types into factors
third_method <- data.frame(var_a, var_b, var_c)
third_method %>% sapply(class)
# var_a var_b var_c
# "factor" "numeric" "factor"
This returns the desired output; however, it is not elegant: it takes a lot of manual coding for large numbers of vectors, and it is prone to user error because the user must specify the data type manually for each column
fourth_method <- data.frame("var_a"=as.character(var_a), "var_b"=as.numeric(var_b), "var_c"=as.factor(var_c), stringsAsFactors = FALSE)
fourth_method %>% sapply(class)
# var_a var_b var_c
#"character" "numeric" "factor"
Note: this, this, and this solution are unsuitable as they result in loss of data type
Also note: The vectors in this question are not named vectors as referred to in this question
At this point, I am running low on ideas and am unsure what to try next.
This works fine with data.frame. You just need to add the argument stringsAsFactors = FALSE.
df = data.frame(var_a, var_b, var_c, stringsAsFactors = FALSE)
sapply(df, class)
# var_a var_b var_c
#"character" "numeric" "factor"
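For a large number of vectors, typing every variable into the data.frame call may still be tedious. One possible sketch, assuming the vector names are available as a character vector, looks them up with mget (which returns a named list) and converts that list to a data frame. The helper name vectors_to_df and its arguments are hypothetical and are not part of base R.

```r
# Hypothetical helper: build a data frame from vectors given by name.
# `vector_names` is a character vector of variable names; `envir` is the
# environment in which to look them up (the caller's by default).
vectors_to_df <- function(vector_names, envir = parent.frame()) {
  columns <- mget(vector_names, envir = envir)       # named list of vectors
  as.data.frame(columns, stringsAsFactors = FALSE)   # keep characters as-is
}

# Example inputs matching the vectors printed in the question
var_a <- c("a", "b", "c")
var_b <- c(1, 3, 4)
var_c <- factor(c("red", "black", "black"))

my_dataframe <- vectors_to_df(c("var_a", "var_b", "var_c"))
sapply(my_dataframe, class)
```

Because mget names each list element after the variable it fetched, the column names follow the variable names without being typed twice, and existing factors stay factors.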
We can use tibble to preserve the column types
library(tibble)
tibble(var_a, var_b, var_c)
# A tibble: 3 x 3
# var_a var_b var_c
# <chr> <dbl> <fct>
#1 a 1 red
#2 b 3 black
#3 c 4 black
NOTE: tibble can be used with tidyverse operations, but if we really require a data.frame, converting it to data.frame would still preserve the data types
tibble(var_a, var_b, var_c) %>%
as.data.frame %>%
str
#'data.frame': 3 obs. of 3 variables:
# $ var_a: chr "a" "b" "c"
# $ var_b: num 1 3 4
# $ var_c: Factor w/ 2 levels "black","red": 2 1 1