
extract a single column

I have a list of 701 CSV files. Each one has the same number of columns (7) but a different number of rows (between 25000 and 28000).

Here is an extract of the first file:

Date,Week,Week Day,Hour,Price,Volume,Sale/Purchase
18/03/2011,11,5,1,-3000.00,17416,Sell
18/03/2011,11,5,1,-1001.10,17427,Sell
18/03/2011,11,5,1,-1000.00,18055,Sell
18/03/2011,11,5,1,-500.10,18057,Sell
18/03/2011,11,5,1,-500.00,18064,Sell
18/03/2011,11,5,1,-400.10,18066,Sell
18/03/2011,11,5,1,-400.00,18066,Sell
18/03/2011,11,5,1,-300.10,18068,Sell
18/03/2011,11,5,1,-300.00,18118,Sell

I fitted a nonlinear regression to the supply curve of the ninth hour for each day of 2012. The data for 2012 are in CSV files 290 to 654.

library(minpack.lm)  # provides nlsLM

allenamen <- dir(pattern="*.csv")
alledat <- lapply(allenamen, read.csv, header = TRUE, sep = ",", stringsAsFactors = FALSE)
h <- list()
for(i in 290:654) {
  g <- function(a, b, c, d, p) {a*atan(b*p+c)+d}
  f <- nlsLM(Volume ~ g(a,b,c,d,Price), data=subset(alledat[[i-289]], (Hour==9) & (Sale.Purchase == "Sell") & (!Price %in% as.character(-50:150))), start = list(a=4000, b=0.1, c=-5, d=32000))
  h[[i-289]] <- coef(f)
}

This works and I get the coefficients a, b, c and d for every day in 2012.

This is head(h):

[[1]]
        a             b             c             d 
2.513378e+03  4.668218e-02 -3.181322e+00  2.637142e+04 

[[2]]
        a             b             c             d 
2.803172e+03  6.696201e-02 -4.576432e+00  2.574454e+04 

[[3]]
        a             b             c             d 
 3.298991e+03  5.817949e-02 -3.425728e+00  2.393888e+04 

[[4]]
        a             b             c             d 
 2.150487e+03  3.810406e-02 -2.658772e+00  2.675609e+04 

[[5]]
        a             b             c             d 
2.326199e+03  3.044967e-02 -1.780965e+00  2.604374e+04 

[[6]]
        a             b             c             d 
2934.0193270     0.0302937    -1.9912913 26283.0300823

And this is dput(head(h)):

list(structure(c(2513.37818972349, 0.0466821822063123, -3.18132213466142, 
26371.4241646124), .Names = c("a", "b", "c", "d")), structure(c(2803.17230054557, 
0.0669620116294894, -4.57643230249848, 25744.5376725213), .Names = c("a", 
"b", "c", "d")), structure(c(3298.99066895304, 0.0581794881246528, 
-3.42572804902504, 23938.8754575156), .Names = c("a", "b", "c", 
"d")), structure(c(2150.48734655237, 0.0381040636898022, -2.65877160023262, 
26756.0907073567), .Names = c("a", "b", "c", "d")), structure(c(2326.19873555633, 
0.0304496684589379, -1.7809654498454, 26043.735374657), .Names = c("a", 
"b", "c", "d")), structure(c(2934.01932702805, 0.0302937043170001, 
-1.99129130343521, 26283.0300823458), .Names = c("a", "b", "c", 
"d")))

Now I am trying to get just a column with h$a but I get NULL. How can I get just the a column?

In addition, I want to plot each coefficient against Date. I tried this code:

koeffreihe <- function(x) {
  files <- list.files(pattern="*.csv")
  df <- data.frame()
  for(i in 1:length(files)){
    xx <- read.csv(as.character(files[i]))
    xx <- subset(xx, Sale.Purchase == "Sell" & Hour == 3)
    df <- rbind(df, xx)
    g <- function(a, b, c, d, p) {a*atan(b*p+c)+d}
    f <- nlsLM(Volume ~ g(a,b,c,d,Price), data=subset(alledat[[i]], (Hour==9) & (Sale.Purchase == "Sell") & (!Price %in% as.character(-50:150))), start = list(a=4000, b=0.1, c=-5, d=32000))
    h[[i]] <- coef(f)
  }
  df$Date <- as.Date(as.character(df$Date), format="%d/%m/%Y")
  plot(h$x ~ Date, df, xlim = as.Date(c("2012-01-01", "2012-12-31")))
}

koeffreihe(a)

But I get this error:

invalid type (NULL) for variable 'h$x'

So the problem is that h$a is NULL. If someone can fix this problem I guess the code will work too.

Thank you for your help!

First transform your list into a data.frame:

h.df <- setNames(do.call(rbind.data.frame, h), names(h[[1]]))
#         a          b         c        d
#1 2513.378 0.04668218 -3.181322 26371.42
#2 2803.172 0.06696201 -4.576432 25744.54
#3 3298.991 0.05817949 -3.425728 23938.88
#4 2150.487 0.03810406 -2.658772 26756.09
#5 2326.199 0.03044967 -1.780965 26043.74
#6 2934.019 0.03029370 -1.991291 26283.03

Then you can extract variables easily:

h.df$a
#[1] 2513.378 2803.172 3298.991 2150.487 2326.199 2934.019
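
As for why h$a returned NULL in the first place: h is an unnamed list whose elements are named vectors, so $ finds no list element called a. The following is a minimal sketch using two of the coefficient vectors from the question (rounded), with a matrix as a possible alternative to the data frame. The names h_demo and h_mat are illustrative only:

```r
# Two of the coefficient vectors from the question, rounded for brevity
h_demo <- list(
  c(a = 2513.378, b = 0.04668218, c = -3.181322, d = 26371.42),
  c(a = 2803.172, b = 0.06696201, c = -4.576432, d = 25744.54)
)

# The list itself has no names, so `$` finds no element called "a"
is.null(h_demo$a)      # TRUE

# The name "a" lives one level down, inside each vector
h_demo[[1]][["a"]]

# Row-binding the vectors gives a numeric matrix with columns a, b, c, d
h_mat <- do.call(rbind, h_demo)
h_mat[, "a"]
```

A matrix is enough when all coefficients are numeric; the data frame is more convenient if a Date column is to be added later.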

Alternatively you can iterate over the list to extract the variable:

sapply(h, "[", "a")
#       a        a        a        a        a        a 
#2513.378 2803.172 3298.991 2150.487 2326.199 2934.019 
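
The sapply result keeps a name on every element. If a plain numeric vector is preferred, vapply is one option; it also checks that each list element yields exactly one number. This is a sketch only, with h_demo as a small stand-in for h and a_vals as an illustrative name:

```r
# Small stand-in for h, using rounded values from the question
h_demo <- list(
  c(a = 2513.378, b = 0.04668218, c = -3.181322, d = 26371.42),
  c(a = 2803.172, b = 0.06696201, c = -4.576432, d = 25744.54)
)

# `[[` drops the element name, so the result is an unnamed numeric vector;
# numeric(1) makes vapply fail loudly if an element returns anything else
a_vals <- vapply(h_demo, function(v) v[["a"]], numeric(1))
a_vals
```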

In this line, although x is a variable, h$x looks for a column named x in h:

plot(h$x ~ Date, df, xlim = as.Date(c("2012-01-01", "2012-12-31")))

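You probably want h[[x]] instead, with x passed as a character string such as "a". Note that this only works once h actually has a column or element of that name, for example after converting the list of coefficient vectors to a data frame.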

From ?'[[':

x$name is equivalent to x[["name", exact = FALSE]].

That is, you are looking for a column literally named x.
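
A small sketch of the difference, assuming the coefficients are already in a data frame and the coefficient name is passed as a string. The names h_df, dates and get_coef are illustrative, and the dates are placeholders rather than values read from the question's files:

```r
# Illustrative data frame of coefficients (rounded values from the question)
h_df <- data.frame(a = c(2513.378, 2803.172, 3298.991),
                   b = c(0.04668218, 0.06696201, 0.05817949))
dates <- as.Date(c("2012-01-01", "2012-01-02", "2012-01-03"))  # placeholder dates

get_coef <- function(df, x) {
  # df$x would look for a column literally named "x";
  # df[[x]] uses the value stored in x as the column name
  df[[x]]
}

is.null(h_df$x)        # TRUE: there is no column named "x"
get_coef(h_df, "a")    # the values of column a

# One coefficient value per date
plot(dates, get_coef(h_df, "a"), type = "l", xlab = "Date", ylab = "a")
```

Passing the name as a string ("a" rather than a) matters: an unquoted a would be looked up as an object, which does not exist in the question's code.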
