R - combining lines from multiple CSV into a data frame

Question

I have a folder with hundreds of CSV files each containing data for a particular postal code.

Each CSV file contains two columns and thousands of rows. Descriptors are in Column A, values are in Column B.

I need to extract two pieces of information from each file and create a new table or dataframe using the values in [Column A, Row 2] (which is the postal code) and [Column B, Row 1585] (which is the median income).

The end result should be a table/dataframe with two columns: one for postal code, the other for median income.

Any help or advice would be appreciated.

Answer 1

You can use the list.files function to get the paths of all your files, and then use read.csv and rbind in a for loop to build a single data.frame.

Something like this:

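# full.names=TRUE returns complete paths, so read.csv works from any working directory
direct<-list.files("directory_to_your_files", full.names=TRUE)
df<-NULL
# seq_along(direct) visits every file; length(direct) alone would visit only the last one
for(i in seq_along(direct)){
  df<-rbind(df,read.csv(direct[i]))
}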
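
Growing a data frame with rbind inside a loop can get slow with hundreds of files. A minimal sketch of the same idea using lapply and do.call follows. The temporary folder and the two sample files are made up purely so the snippet runs on its own; they are not the real data.

```r
# Hypothetical setup: two tiny CSV files in a temporary folder (not the real data)
demo_dir <- file.path(tempdir(), "rbind_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(A = c("x", "y"), B = 1:2),
          file.path(demo_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(A = c("z", "w"), B = 3:4),
          file.path(demo_dir, "two.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv can open from any working directory
paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list, then bind all the pieces in a single call
pieces <- lapply(paths, read.csv)
combined <- do.call(rbind, pieces)
```

Note that this stacks whole files. For the question as asked, only two cells per file are needed, so reading everything is more work than necessary.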

Answer 2

Disclaimer: this question is pretty vague. Next time, be sure to add a reproducible example that we can run on our machines. It will help you, the people answering your questions, and future users.

You might try something like:

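files = list.files("~/Directory", full.names = TRUE)

my_df = data.frame(matrix(ncol = 2, nrow = length(files)))
colnames(my_df) = c("A", "B")

for(i in seq_along(files)){
    # header = FALSE makes skip count physical lines: skip = 1 reads row 2, skip = 1584 reads row 1585
    # colClasses = "character" keeps leading zeros in the postal code
    row2 = read.csv(files[i], header = FALSE, skip = 1, nrows = 1, colClasses = "character")
    row1585 = read.csv(files[i], header = FALSE, skip = 1584, nrows = 1)
    my_df[i, "A"] = row2[1, 1]
    my_df[i, "B"] = row1585[1, 2]
}

my_df = my_df[,c("A","B")]
# Note on interpreting indexing syntax:
#   Read this as "my_df is now (=) my_df such that ([) the columns (,)
#   are only A and B (c("A", "B"))"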
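
To check that skip and nrows land on the intended line, it can help to try them on a small file whose contents are known. In the sketch below, the file, its 1600 lines, and the names target, hit, and miss are invented for the test. The point it illustrates is that header = FALSE is what makes skip = n - 1 read physical line n.

```r
# Hypothetical file: line k holds "desc_k,k", so the expected value is easy to verify
demo_file <- tempfile(fileext = ".csv")
writeLines(paste0("desc_", 1:1600, ",", 1:1600), demo_file)

target <- 1585

# Skip the first target - 1 lines, then read exactly one line without a header
hit <- read.csv(demo_file, header = FALSE, skip = target - 1, nrows = 1)

# With the default header = TRUE, the first unskipped line is consumed as column
# names and the data row comes from the line after it
miss <- read.csv(demo_file, skip = target - 1, nrows = 1)
```

Here hit[1, 2] holds the value from line 1585, while miss[1, 2] holds the value from line 1586, which is the off-by-one to watch for.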

Answer 3

So here is the code which does what I want it to do. If there are more elegant solutions, please feel free to point them out.

# set the working directory to where the data files are stored
setwd("/foo")

# list the files
files = list.files("/foo")

# create an empty dataframe and name the columns

dataMatrix=data.frame(matrix(c(rep(NA,times=2*length(files))),nrow=length(files)))
colnames(dataMatrix)=c("Postal Code", "Median Income")

# create a for loop to get the information in R2/C1 and R1585/C2 of each data file
# Data in R2/C1 is a string, but is interpreted as a number unless specifically declared a string

for(i in seq_along(files)) {
  getData = read.csv(files[i],header=FALSE)
  dataMatrix[i,1]=toString(getData[2,1])
  dataMatrix[i,2]=(getData[1585,2])
}
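
One possibly tidier variant wraps the per-file step in a small function and lets lapply do the looping. This is only a sketch: extract_pair, the temporary folder, and the sample files (with the income on row 5 rather than 1585) are hypothetical, made up so the example runs anywhere. It also assumes the income cell holds a plain number that as.numeric can parse.

```r
# Hypothetical helper: pull [row_code, column 1] and [row_income, column 2] from one file
extract_pair <- function(path, row_code = 2, row_income = 1585) {
  # colClasses = "character" keeps postal codes such as "01234" from losing leading zeros
  d <- read.csv(path, header = FALSE, colClasses = "character")
  data.frame(postal_code = d[row_code, 1],
             # Assumption: the income cell is a plain number with no commas or symbols
             median_income = as.numeric(d[row_income, 2]),
             stringsAsFactors = FALSE)
}

# Made-up sample data: two short files with the income stored on row 5
demo_dir <- file.path(tempdir(), "postal_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Geography,", "01234,", "Population,1000",
             "Households,400", "Median income,52000"),
           file.path(demo_dir, "a.csv"))
writeLines(c("Geography,", "98765,", "Population,2000",
             "Households,900", "Median income,61000"),
           file.path(demo_dir, "b.csv"))

# Apply the helper to every file and stack the one-row results
paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
result <- do.call(rbind, lapply(paths, extract_pair, row_income = 5))
```

With the real files, the call would drop the row_income = 5 argument so the default of 1585 applies.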

Thank you to all those who helped me figure this out, especially Nancy.


3 answers

solution1
0 2015-12-15 16:53:19

solution2
0 ACCEPTED 2015-12-15 17:01:59

solution3
0 2015-12-16 18:16:55
