
Reaching memory limit in R

I am having a strange issue in R.

I have a large data.table, dataTs1:

Classes ‘data.table’ and 'data.frame':  419172 obs. of  5 variables:
 $ TimeStamp: chr  "01MAR13:07:15:00" "01MAR13:07:16:00" "01MAR13:07:18:00" ...
 $ col1     : chr  "ALL1" "ALL1" "ALL1" "ALL1" ...
 $ col2     : int  NA NA NA NA NA NA NA NA NA NA ...
 $ col3     : int  4 4 4 4 4 4 4 4 4 4 ...
 $ col4     : int  621 810 4 4 8 1 3 1 1 1 ...

I loaded this table using the fread function.

The memory allocation seems ok.

> memory.size(max=TRUE)
[1] 82.94

I tried to convert the first column to a POSIX date-time class, so I wrote:

dataTs1$TimeStamp <- strptime(dataTs1$TimeStamp,"%d%b%y:%H:%M:%S")

With this line I reach my memory limit of 16 GB. But when I write:

test <- 1:length(dataTs1$TimeStamp)
dataTs1$TimeStamp <- test

it works perfectly without any memory overload.

I am pretty new to R and would be grateful if you could help me figure out what I'm doing wrong here.

Thanks


EDIT:

Actually, when I don't hit the memory limit, I sometimes get a strange warning instead:

> dataTs1[, TimeStamp := strptime(TimeStamp, "%d%b%y:%H:%M:%S")]
Warning messages:
1: In `[<-.data.table`(x, j = name, value = value) :
  Supplied 9 items to be assigned to 419172 items of column 'TimeStamp' (recycled leaving remainder of 6 items).
2: In `[<-.data.table`(x, j = name, value = value) :
  Coerced 'list' RHS to 'character' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 419172 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'character' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please.
> str(dataTs1)
Classes ‘data.table’ and 'data.frame':  419172 obs. of  5 variables:
 $ TimeStamp: chr  "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ ...
 $ V6FCDSB  : chr  "ALL1" "ALL1" "ALL1" "ALL1" ...
 $ V6FCDTD  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ _TYPE_   : int  4 4 4 4 4 4 4 4 4 4 ...
 $ N        : int  621 810 4 4 8 1 3 1 1 1 ...
 - attr(*, ".internal.selfref")=<externalptr> 
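
For context on the first warning: strptime returns a POSIXlt object, which base R stores as a list of component vectors (seconds, minutes, hours and so on) rather than one value per row. That is consistent with "Supplied 9 items" being recycled over the column. A minimal base R sketch, using made-up sample timestamps in the same format; the exact set of components can differ between R versions, and tz = "UTC" is an assumption:

```r
# Hypothetical sample values in the same format as the TimeStamp column.
ts_chr <- c("01MAR13:07:15:00", "01MAR13:07:16:00", "01MAR13:07:18:00")

# strptime() returns POSIXlt: a list of component vectors, not one value per row.
# Note: %b matches month abbreviations in the current locale.
ts_lt <- strptime(ts_chr, "%d%b%y:%H:%M:%S", tz = "UTC")
class(ts_lt)             # "POSIXlt" "POSIXt"
names(unclass(ts_lt))    # component names such as "sec", "min", "hour", ...

# as.POSIXct() gives a single numeric vector (seconds since 1970-01-01).
ts_ct <- as.POSIXct(ts_lt)
class(ts_ct)             # "POSIXct" "POSIXt"
length(ts_ct)            # 3, one value per timestamp
```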

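strptime returns a POSIXlt object, and POSIXlt is not supported by data.table and never will be: "The no-support for POSIXlt is set in stone." Convert the column to POSIXct (for example with as.POSIXct) instead.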
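
A minimal sketch of the POSIXct route, assuming the data.table package is installed. The small table dt is a hypothetical stand-in for dataTs1, and tz = "UTC" is an assumption that should be replaced with the time zone the data was recorded in:

```r
library(data.table)

# Hypothetical stand-in for dataTs1 with the same TimeStamp format.
dt <- data.table(
  TimeStamp = c("01MAR13:07:15:00", "01MAR13:07:16:00", "01MAR13:07:18:00"),
  col4      = c(621L, 810L, 4L)
)

# Parse straight to POSIXct and assign by reference with `:=`.
# tz = "UTC" is an assumption; use the time zone the data was recorded in.
dt[, TimeStamp := as.POSIXct(TimeStamp, format = "%d%b%y:%H:%M:%S", tz = "UTC")]

class(dt$TimeStamp)   # "POSIXct" "POSIXt"
```

Because %b is matched against month names in the current locale, a column that comes back as all NA may point to a non-English LC_TIME setting; Sys.getlocale("LC_TIME") shows the one in use.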
