I would like to split a dataframe into a list of dataframes and keep the classes of the variables.
# create sample data
df <- data.frame(
id=c("1","2"),
site_name = c("Zero Hedge", "Free Software Foundation"),
site_url = c("https://www.zerohedge.com", "https://www.fsf.org")
)
# specify class for site_url
class(df$site_url) <- "formula"
# split
dataframes <- split(df, df$id)
Now I wonder why the class of the column changed after splitting:
class(dataframes[[1]]$site_url)
[1] "character"
My questions:
Thank you for your help.
Additional info:
I came across this problem when I wanted to automatically write hyperlinks to Excel files with R and openxlsx,
following this very helpful post: Openxlsx hyperlink output display in Excel
We can restore the attributes on each piece after splitting:
dataframes2 <- lapply(dataframes, function(x) {
  # copy the attributes (including the class) of site_url back from the original column
  attributes(x$site_url) <- attributes(df$site_url)
  x
})
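The same workaround can be wrapped in a small helper that restores the class of every column rather than only site_url. This is a minimal sketch: split_keep_class is a hypothetical name, not a function from base R or any package, and it copies back only the class attribute, on the assumption that no other attributes need to survive.

```r
# Sample data from the question
df <- data.frame(
  id = c("1", "2"),
  site_name = c("Zero Hedge", "Free Software Foundation"),
  site_url = c("https://www.zerohedge.com", "https://www.fsf.org")
)
class(df$site_url) <- "formula"

# Hypothetical helper: split, then copy each column's class back from the original
split_keep_class <- function(data, f) {
  lapply(split(data, f), function(piece) {
    for (col in names(piece)) {
      class(piece[[col]]) <- class(data[[col]])
    }
    piece
  })
}

dataframes2 <- split_keep_class(df, df$id)
class(dataframes2[[1]]$site_url)
# [1] "formula"
```

Copying only the class avoids length-dependent attributes such as names, which would no longer match the shorter pieces.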
The issue is not caused by split itself or by its methods (in this case, split.data.frame). If we look at the source code, it splits the sequence of row indices by the grouping 'f' and then extracts those rows with [. It is this extraction step that drops the class.
split.data.frame
function (x, f, drop = FALSE, ...)
lapply(split(x = seq_len(nrow(x)), f = f, drop = drop, ...),
function(ind) x[ind, , drop = FALSE])
By contrast, split.data.table keeps the class:
library(data.table)  # provides as.data.table and split.data.table
library(magrittr)    # provides %>%
split(as.data.table(df), df$id) %>% str
#List of 2
# $ 1:Classes ‘data.table’ and 'data.frame': 1 obs. of 3 variables:
# ..$ id : chr "1"
# ..$ site_name: chr "Zero Hedge"
# ..$ site_url : 'formula' chr "https://www.zerohedge.com"
# ..- attr(*, ".internal.selfref")=<externalptr>
# $ 2:Classes ‘data.table’ and 'data.frame': 1 obs. of 3 variables:
# ..$ id : chr "2"
# ..$ site_name: chr "Free Software Foundation"
# ..$ site_url : 'formula' chr "https://www.fsf.org"
Comparing the structure of the original data with that of the extracted rows:
str(df)
'data.frame': 2 obs. of 3 variables:
$ id : chr "1" "2"
$ site_name: chr "Zero Hedge" "Free Software Foundation"
$ site_url : 'formula' chr "https://www.zerohedge.com" "https://www.fsf.org"
str(df[1,]) # with one row selected
'data.frame': 1 obs. of 3 variables:
$ id : chr "1"
$ site_name: chr "Zero Hedge"
$ site_url : chr "https://www.zerohedge.com" # lost attribute
str(df[1:2,]) # with more than one row
'data.frame': 2 obs. of 3 variables:
$ id : chr "1" "2"
$ site_name: chr "Zero Hedge" "Free Software Foundation"
$ site_url : chr "https://www.zerohedge.com" "https://www.fsf.org" # lost attribute
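Because the data frame row extraction subsets each column with [, a class survives only if it has its own [ method, which is how factors and Dates keep their class. The sketch below illustrates this with a made-up class. The name "url_chr" is hypothetical and is not something openxlsx recognises, so this shows the mechanism rather than a fix for the "formula" case; redefining [ for "formula" would also affect genuine formula objects, so that is not attempted here.

```r
# Hypothetical class "url_chr" with a subsetting method that re-applies the class
`[.url_chr` <- function(x, ...) {
  structure(NextMethod(), class = oldClass(x))
}

df2 <- data.frame(
  id = c("1", "2"),
  site_url = c("https://www.zerohedge.com", "https://www.fsf.org")
)
class(df2$site_url) <- "url_chr"

# split.data.frame extracts rows with [, which now dispatches to `[.url_chr`
parts <- split(df2, df2$id)
class(parts[[1]]$site_url)
```

With the method in place, the class of the column in each piece should match the original, with no post-processing after the split.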