
How do I set column names to lower case for multiple dataframes?

I have a set of dataframes with the same column headings, except that some of the column names are in upper case and some are in lower case. I want to convert all the column names to lowercase so that I can make one big dataframe of everything.

I can't seem to get colnames() to work in any loop or apply I write. With:

#create dfs
df1<-data.frame("A" = 1:10, "B" = 2:11)
df2<-data.frame("a" = 3:12, "b" = 4:13)
df3<-data.frame("a" = 5:14, "b" = 6:15)
#I have many more dfs in my actual data

#make list of dfs, define lowercasing function, apply across df list
dfs<-ls(pattern = "df")
lowercols<-function(df){colnames(get(df))<-tolower(colnames(get(df)))}
lapply(dfs, lowercols)

I get the following error:

Error in colnames(get(df)) <- tolower(colnames(get(df))) : 
  could not find function "get<-"

How do I change all my dataframes to have lowercase column names?

The following should work:

dfList <- lapply(lapply(dfs, get), function(x) {
    colnames(x) <- tolower(colnames(x))
    x  ## return the modified data frame
})

Problems like this generally stem from the fact that you haven't placed all your data frames in a single data structure, and then are forced to use something awkward, like get.

Note that in my code, I use lapply and get to actually create a single list of data frames first, and then alter their colnames.

You should also be aware that your lowercols function is rather un-R-like. R functions are generally written to return a value, not to return nothing and do their work through side effects. If you try to write functions that way (which is possible) you will probably make your life difficult and run into scoping issues. Note that in my second lapply I explicitly return the modified data frame.
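For what it's worth, the error in the question comes from how R handles replacement calls: colnames(x) <- value is evaluated roughly as x <- `colnames<-`(x, value), so with get(df) on the left-hand side R goes looking for a replacement function named get<-, which base R doesn't provide. A minimal sketch of that rewriting, with df1 recreated here purely for illustration:

```r
# Recreate one of the example data frames from the question
df1 <- data.frame("A" = 1:10, "B" = 2:11)

# colnames(df1) <- tolower(colnames(df1)) is shorthand for roughly this:
# call the replacement function `colnames<-` and assign its result back to df1
df1 <- `colnames<-`(df1, tolower(colnames(df1)))

colnames(df1)
```

The assignment back to a plain variable name is the step that has no equivalent when the left-hand side is get(df), which is why pulling the data frame into a local variable (or a list) first avoids the problem.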

@joran's answer overlaps mine heavily, both in style and in "you probably want to do this differently" message. However, in the spirit of "give a man a fish and you feed him for a day; give him a sharp stick, and he can poke himself in the eye" ...

Here's a function that does what you want in the way that (you think) you want to do it:

dfnames <- ls(pattern = "df[0-9]+")  ## avoid 'dfnames' itself
lowercolnames <- function(df) {
    x <- get(df)
    colnames(x) <- tolower(colnames(x))
    ## normally I would use parent.frame(), but here we
    ##  have to go back TWO frames if this is used within lapply()
    assign(df,x,sys.frame(-2))
    ## OR (maybe simpler)
    ## assign(df,x,envir=.GlobalEnv)

    NULL
}
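If counting frames feels fragile, a plain for loop is one possible alternative, since get() and assign() then both work in the environment the loop runs in. This is only a sketch under the assumption that the data frames are named df1, df2, ... as in the question; the loop variable nm is just an illustrative name:

```r
# Example data frames from the question
df1 <- data.frame("A" = 1:10, "B" = 2:11)
df2 <- data.frame("a" = 3:12, "b" = 4:13)
df3 <- data.frame("a" = 5:14, "b" = 6:15)

# Anchored pattern so that only df1, df2, ... are picked up
dfnames <- ls(pattern = "^df[0-9]+$")

for (nm in dfnames) {
    x <- get(nm)                          # fetch the data frame by name
    colnames(x) <- tolower(colnames(x))   # lowercase its column names
    assign(nm, x)                         # overwrite the original object
}

colnames(df1)
```

This still modifies objects by name as a side effect, so the caveats about that style apply here too.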

Here are two alternate functions that lowercase column names and return the result:

lowerCN2 <- function(x) {
    colnames(x) <- tolower(colnames(x))
    x
}

I include plyr::rename here for completeness, although in this case it's actually more trouble than it's worth.

lowerCN3 <- function(x) {
    plyr::rename(x,structure(tolower(colnames(x)),
                             names=colnames(x)))
}

dflist <- lapply(dfnames,get)
dflist <- lapply(dflist,lowerCN2)
dflist <- lapply(dflist,lowerCN3)
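Once everything is in a list with consistent names, making the "one big dataframe" from the question is a single call. Here is a rough sketch using only base R; the names dflist and big are illustrative, and it assumes every data frame has the same set of columns after lowercasing, since rbind matches data frame columns by name:

```r
# Example data frames from the question
df1 <- data.frame("A" = 1:10, "B" = 2:11)
df2 <- data.frame("a" = 3:12, "b" = 4:13)
df3 <- data.frame("a" = 5:14, "b" = 6:15)

# Keep the data frames together in a list
dflist <- list(df1, df2, df3)

# Lowercase the column names of each element and return the modified copy
dflist <- lapply(dflist, function(x) {
    colnames(x) <- tolower(colnames(x))
    x
})

# Stack all data frames row-wise into one
big <- do.call(rbind, dflist)

dim(big)
```

Without the lowercasing step, rbind would refuse to combine df1 with the others because the column names "A"/"B" and "a"/"b" don't match.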

This doesn't directly answer your question, but it may solve the problem you're trying to solve; you can merge data frames whose key columns have different names via something like:

df1 <- data.frame("A" = 1:10, "B" = 2:11, x=letters[1:10])
df2 <- data.frame("a" = 3:12, "b" = 4:13, y=LETTERS[1:10])
merge(df1, df2, by.x=c("A","B"), by.y=c("a","b"), all=TRUE)
