
Combine CSV files of different formats into one Excel file with separate sheets

I have four CSV files with different formats and variables. I am combining them into one Excel file, one sheet per file, using the code below:

library(rJava)
library(xlsx)
rm(list = ls())

# getting the path of all reports (they are in csv format)
files <- list.files(pattern = "\\.csv$")

# creating work book
wb <- createWorkbook()

# going through each csv file
for (item in files)
{
    # create a sheet in the workbook
    sheet <- createSheet(wb, sheetName=strsplit(item,"[.]")[[1]][1])

    # add the data to the new sheet
    addDataFrame(read.csv(item), sheet,row.names=FALSE)
}


# saving the workbook
saveWorkbook(wb, "crosstabs of data.xlsx")

In one of the CSV files a variable is named source / Medium, but in the output Excel file it appears as Source...Medium. Likewise, % New Sessions appears as X..New.Sessions, and in every variable name the spaces are replaced with dots. How can I avoid this? I need the variable names in the output Excel file to be exactly the same as in the CSV files.

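This happens because read.csv rewrites the column headers. By default (check.names = TRUE) it passes them through make.names(), so a header like gi/joe becomes gi.joe when the file is read with header=T. One option, which works when every dot in the rewritten name stands for a slash, is to convert the header names back afterwards: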

names(df) <- gsub("\\.","/",names(df))
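
To see where this substitution holds and where it does not, here is a minimal sketch that runs the header names mentioned in this thread through make.names() (the function read.csv applies by default) and then through the gsub call above. The variable names mangled and restored are made up for illustration.

```r
# Headers taken from the question and the answer.
original <- c("gi/joe", "Source / Medium", "% New Sessions")

# What read.csv produces with its default check.names = TRUE.
mangled <- make.names(original)

# Turn every dot back into a slash, as in the line above.
restored <- gsub("\\.", "/", mangled)

print(data.frame(original, mangled, restored))
```

Only gi/joe comes back unchanged. The spaces and the percent sign were also turned into dots, so they come back as slashes too.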

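Alternatively, if it is acceptable to read the header row as data, pass header=F and suppress the generated V1, V2, ... column names when writing. Note that the columns are then no longer read as numeric, because the header strings share the columns with the values:

addDataFrame(read.csv(item,header=F), sheet,row.names=FALSE,col.names=FALSE)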
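
Another option that avoids the renaming altogether is read.csv's check.names argument. The sketch below writes a made-up two-column CSV to a temporary file (the data row is an assumption, only the header names come from the question) and compares the default behaviour with check.names = FALSE.

```r
# Write a small sample CSV whose headers contain spaces and symbols.
csv_path <- tempfile(fileext = ".csv")
writeLines(c('"Source / Medium","% New Sessions"',
             '"google / organic",45.2'),
           csv_path)

# Default: check.names = TRUE passes the headers through make.names().
default_read <- read.csv(csv_path)
print(names(default_read))

# check.names = FALSE keeps the headers exactly as written in the file.
intact_read <- read.csv(csv_path, check.names = FALSE)
print(names(intact_read))
```

In the loop from the question this would presumably mean passing read.csv(item, check.names = FALSE) to addDataFrame, which also keeps numeric columns numeric.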

On a separate note, names like gi/joe are not allowed as Excel sheet names. To verify this limitation on the Excel side, open Excel and try to name a sheet hi/5. You should get an error saying: The sheet name contains invalid characters: : \ / ? * [ ]. [I tested this on Excel for Mac 15.19.1.]
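
If sheet names are built from arbitrary strings, one way to stay within this restriction is to replace the listed characters before calling createSheet. The helper below, sanitize_sheet_name, is hypothetical and not part of the xlsx package. It is a rough sketch that also truncates to 31 characters, which is Excel's limit on sheet name length.

```r
# Hypothetical helper: make a string acceptable as an Excel sheet name.
sanitize_sheet_name <- function(name, replacement = "_") {
  # Replace the characters Excel rejects in sheet names: : \ / ? * [ ]
  cleaned <- gsub("[\\[\\]:\\\\/?*]", replacement, name, perl = TRUE)
  # Excel limits sheet names to 31 characters.
  substr(cleaned, 1, 31)
}

print(sanitize_sheet_name("hi/5"))
```

In the loop from the question, the result could be passed as sheetName, for example sanitize_sheet_name(strsplit(item,"[.]")[[1]][1]).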
