
R Hadoop Header = TRUE

Is it possible to use the option "header = TRUE" when reading a CSV file stored in Hadoop from R? The first row of the CSV contains the column headers. I have used this R code:

predictor <- from.dfs("hdfs://3.48.34.16:8020/user/lg337358/Predictor.csv",make.input.format(format="csv",sep=","))

The file is read fine, but the column headers come through as the first row of "predictor", whereas I want them in "colnames(predictor)". I tried this option:

predictor <- from.dfs("hdfs://3.48.34.16:8020/user/lg337358/Predictor.csv",make.input.format(format="csv",header = TRUE,sep=","))

But that gives an error.

I got the same error. I used the code below to get the header instead. After reading the HDFS file, the first row will probably hold the column names.

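df <- read.hdfs("/usr/hadoop/df.csv")  # read the HDFS file; the header arrives as row 1
# extract the header as a plain character vector (safe even if columns are factors)
df_names <- vapply(df[1, , drop = FALSE], as.character, character(1), USE.NAMES = FALSE)
df <- df[-1, , drop = FALSE]  # delete the row that contains the header from the data
colnames(df) <- df_names  # set the column names (header) of the data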
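
One side effect of this workaround is that every column is read as text, because the header strings share a column with the data. Below is a minimal base R sketch of the same idea that also re-infers the column types. The helper name promote_header and the sample data are made up for illustration, and read.csv(text = ...) only stands in for the data frame that the HDFS read would return.

```r
# Hypothetical helper (not part of rmr2 or rhdfs): promote row 1 to column names.
promote_header <- function(df) {
  # Take the first row as a plain character vector; works for factor columns too.
  header <- vapply(df[1, , drop = FALSE], as.character, character(1),
                   USE.NAMES = FALSE)
  body <- df[-1, , drop = FALSE]
  colnames(body) <- header
  rownames(body) <- NULL
  # The header row forced every column to text, so re-infer each column's type.
  body[] <- lapply(body, function(col) type.convert(as.character(col), as.is = TRUE))
  body
}

# Stand-in for a header-less read (made-up data, no Hadoop required).
raw <- read.csv(text = "id,score\n1,2.5\n2,3.5", header = FALSE,
                stringsAsFactors = FALSE)
predictor <- promote_header(raw)
str(predictor)
```

Whether the data frame comes from read.hdfs or from values(from.dfs(...)) should not matter to the helper, as long as the header really is the first row. That assumption can fail if the file is read in several chunks.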

