I have some WRF output data that was subset and masked using Python's xarray module.
I'm now performing calculations on raster bricks using R's raster package and seeing very different speeds for very similar files.
Knowns:
All calculations are on the same variable with the same values (except for the second which is masked)
system.time(sum(d97[[1:365]]))
user system elapsed
5.428 2.771 8.840
The second file is a masked portion of that same file, with all the masked values converted to NaN.
system.time(sum(masked_d97[[1:365]]))
user system elapsed
10.784 2.157 13.052
The last file is a slightly modified version of the first file (daily values rather than cumulative values). It was modified using xarray in Python.
system.time(sum(mod_d97[[1:365]]))
user system elapsed
22.015 1.773 24.474
What on earth is happening here? I'm happy to provide more details (code, ncdumps, etc) as requested.
EDIT: added str() of files
d97 <- brick(files[8], varname = "TMIN")
masked_97 <- brick(files[3], varname = "TMIN")
d03 <- brick(files[11], varname = "TMIN")
str(d97)
Formal class 'RasterBrick' [package "raster"] with 12 slots
..@ file :Formal class '.RasterFile' [package "raster"] with 13 slots
.. .. ..@ name : chr "/Users/charlesbecker/Desktop/Data/Project Data/Shiny/WY1997_yearly_stats.nc"
.. .. ..@ datanotation: chr "FLT4S"
.. .. ..@ byteorder : chr "little"
.. .. ..@ nodatavalue : num NaN
.. .. ..@ NAchanged : logi FALSE
.. .. ..@ nbands : int 365
.. .. ..@ bandorder : chr "BIL"
.. .. ..@ offset : int 0
.. .. ..@ toptobottom : logi TRUE
.. .. ..@ blockrows : int 0
.. .. ..@ blockcols : int 0
.. .. ..@ driver : chr "netcdf"
.. .. ..@ open : logi FALSE
..@ data :Formal class '.MultipleRasterData' [package "raster"] with 14 slots
.. .. ..@ values : logi[0 , 0 ]
.. .. ..@ offset : num 0
.. .. ..@ gain : num 1
.. .. ..@ inmemory : logi FALSE
.. .. ..@ fromdisk : logi TRUE
.. .. ..@ nlayers : int 365
.. .. ..@ dropped : NULL
.. .. ..@ isfactor : logi FALSE
.. .. ..@ attributes: list()
.. .. ..@ haveminmax: logi FALSE
.. .. ..@ min : num [1:365] Inf Inf Inf Inf Inf ...
.. .. ..@ max : num [1:365] -Inf -Inf -Inf -Inf -Inf ...
.. .. ..@ unit : chr "K"
.. .. ..@ names : chr [1:365] "X1" "X2" "X3" "X4" ...
..@ legend :Formal class '.RasterLegend' [package "raster"] with 5 slots
.. .. ..@ type : chr(0)
.. .. ..@ values : logi(0)
.. .. ..@ color : logi(0)
.. .. ..@ names : logi(0)
.. .. ..@ colortable: logi(0)
..@ title : chr "TMIN"
..@ extent :Formal class 'Extent' [package "raster"] with 4 slots
.. .. ..@ xmin: num 0.5
.. .. ..@ xmax: num 348
.. .. ..@ ymin: num 0.5
.. .. ..@ ymax: num 328
..@ rotated : logi FALSE
..@ rotation:Formal class '.Rotation' [package "raster"] with 2 slots
.. .. ..@ geotrans: num(0)
.. .. ..@ transfun:function ()
..@ ncols : int 348
..@ nrows : int 327
..@ crs :Formal class 'CRS' [package "sp"] with 1 slot
.. .. ..@ projargs: chr NA
..@ history : list()
..@ z :List of 1
.. ..$ : int [1:365] 1 2 3 4 5 6 7 8 9 10 ...
str(masked_97)
Formal class 'RasterBrick' [package "raster"] with 12 slots
..@ file :Formal class '.RasterFile' [package "raster"] with 13 slots
.. .. ..@ name : chr "/Users/charlesbecker/Desktop/Data/Project Data/Shiny/AVA_WY1997_yearly_stats.nc"
.. .. ..@ datanotation: chr "FLT4S"
.. .. ..@ byteorder : chr "little"
.. .. ..@ nodatavalue : num NaN
.. .. ..@ NAchanged : logi FALSE
.. .. ..@ nbands : int 365
.. .. ..@ bandorder : chr "BIL"
.. .. ..@ offset : int 0
.. .. ..@ toptobottom : logi TRUE
.. .. ..@ blockrows : int 0
.. .. ..@ blockcols : int 0
.. .. ..@ driver : chr "netcdf"
.. .. ..@ open : logi FALSE
..@ data :Formal class '.MultipleRasterData' [package "raster"] with 14 slots
.. .. ..@ values : logi[0 , 0 ]
.. .. ..@ offset : num 0
.. .. ..@ gain : num 1
.. .. ..@ inmemory : logi FALSE
.. .. ..@ fromdisk : logi TRUE
.. .. ..@ nlayers : int 365
.. .. ..@ dropped : NULL
.. .. ..@ isfactor : logi FALSE
.. .. ..@ attributes: list()
.. .. ..@ haveminmax: logi FALSE
.. .. ..@ min : num [1:365] Inf Inf Inf Inf Inf ...
.. .. ..@ max : num [1:365] -Inf -Inf -Inf -Inf -Inf ...
.. .. ..@ unit : chr ""
.. .. ..@ names : chr [1:365] "X1" "X2" "X3" "X4" ...
..@ legend :Formal class '.RasterLegend' [package "raster"] with 5 slots
.. .. ..@ type : chr(0)
.. .. ..@ values : logi(0)
.. .. ..@ color : logi(0)
.. .. ..@ names : logi(0)
.. .. ..@ colortable: logi(0)
..@ title : chr "TMIN"
..@ extent :Formal class 'Extent' [package "raster"] with 4 slots
.. .. ..@ xmin: num 0.5
.. .. ..@ xmax: num 348
.. .. ..@ ymin: num 0.5
.. .. ..@ ymax: num 328
..@ rotated : logi FALSE
..@ rotation:Formal class '.Rotation' [package "raster"] with 2 slots
.. .. ..@ geotrans: num(0)
.. .. ..@ transfun:function ()
..@ ncols : int 348
..@ nrows : int 327
..@ crs :Formal class 'CRS' [package "sp"] with 1 slot
.. .. ..@ projargs: chr NA
..@ history : list()
..@ z :List of 1
.. ..$ : int [1:365] 1 2 3 4 5 6 7 8 9 10 ...
str(d03)
Formal class 'RasterBrick' [package "raster"] with 12 slots
..@ file :Formal class '.RasterFile' [package "raster"] with 13 slots
.. .. ..@ name : chr "/Users/charlesbecker/Desktop/Data/Project Data/Shiny/WY2003_yearly_stats.nc"
.. .. ..@ datanotation: chr "FLT4S"
.. .. ..@ byteorder : chr "little"
.. .. ..@ nodatavalue : num NaN
.. .. ..@ NAchanged : logi FALSE
.. .. ..@ nbands : int 365
.. .. ..@ bandorder : chr "BIL"
.. .. ..@ offset : int 0
.. .. ..@ toptobottom : logi TRUE
.. .. ..@ blockrows : int 0
.. .. ..@ blockcols : int 0
.. .. ..@ driver : chr "netcdf"
.. .. ..@ open : logi FALSE
..@ data :Formal class '.MultipleRasterData' [package "raster"] with 14 slots
.. .. ..@ values : logi[0 , 0 ]
.. .. ..@ offset : num 0
.. .. ..@ gain : num 1
.. .. ..@ inmemory : logi FALSE
.. .. ..@ fromdisk : logi TRUE
.. .. ..@ nlayers : int 365
.. .. ..@ dropped : NULL
.. .. ..@ isfactor : logi FALSE
.. .. ..@ attributes: list()
.. .. ..@ haveminmax: logi FALSE
.. .. ..@ min : num [1:365] Inf Inf Inf Inf Inf ...
.. .. ..@ max : num [1:365] -Inf -Inf -Inf -Inf -Inf ...
.. .. ..@ unit : chr "K"
.. .. ..@ names : chr [1:365] "X1" "X2" "X3" "X4" ...
..@ legend :Formal class '.RasterLegend' [package "raster"] with 5 slots
.. .. ..@ type : chr(0)
.. .. ..@ values : logi(0)
.. .. ..@ color : logi(0)
.. .. ..@ names : logi(0)
.. .. ..@ colortable: logi(0)
..@ title : chr "TMIN"
..@ extent :Formal class 'Extent' [package "raster"] with 4 slots
.. .. ..@ xmin: num 0.5
.. .. ..@ xmax: num 348
.. .. ..@ ymin: num 0.5
.. .. ..@ ymax: num 328
..@ rotated : logi FALSE
..@ rotation:Formal class '.Rotation' [package "raster"] with 2 slots
.. .. ..@ geotrans: num(0)
.. .. ..@ transfun:function ()
..@ ncols : int 348
..@ nrows : int 327
..@ crs :Formal class 'CRS' [package "sp"] with 1 slot
.. .. ..@ projargs: chr NA
..@ history : list()
..@ z :List of 1
.. ..$ : int [1:365] 1 2 3 4 5 6 7 8 9 10 ...
system.time(sum(d97[[1:365]]))
user system elapsed
5.569 2.219 8.048
system.time(sum(masked_97[[1:365]]))
user system elapsed
11.887 2.342 14.569
system.time(sum(d03[[1:365]]))
user system elapsed
22.253 1.772 24.879
The most likely difference is that the data in your new netCDF file is now compressed differently. Two forms of compression are common with netCDF files: chunk-based compression with zlib (netCDF4 only), and scale/offset packing down to a smaller dtype such as int16 via a formula like scale_factor * values + add_offset. If you don't slice or manipulate your variables, xarray will preserve the compression settings via the encoding attribute, but this is generally dropped by xarray operations. See the xarray docs on reading/writing encoded data for more details.
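To make the two forms of compression concrete, here is a minimal sketch using only the Python standard library. It is not what xarray or the netCDF library does internally; the temperature values, the scale_factor of 0.01, and the add_offset of 273.15 are made-up assumptions for illustration. It shows why a packed or deflated variable takes less space on disk but needs extra work on every read.

```python
import array
import zlib


def pack_int16(values, scale_factor, add_offset):
    """Pack floats into 16-bit integers so that
    value is approximately scale_factor * packed + add_offset."""
    return array.array(
        "h", (round((v - add_offset) / scale_factor) for v in values)
    )


def unpack_int16(packed, scale_factor, add_offset):
    """Reverse the packing: this is the work a reader must do on every access."""
    return [scale_factor * p + add_offset for p in packed]


# Hypothetical minimum temperatures in kelvin (illustrative data only).
tmin = [270.0 + 0.25 * (i % 40) for i in range(10_000)]

# Plain 32-bit floats, comparable to an uncompressed float variable.
raw = array.array("f", tmin)

# Form 1: chunk-style deflate compression of the raw bytes.
deflated = zlib.compress(raw.tobytes(), 4)

# Form 2: scale/offset packing into int16 (assumed parameters).
scale_factor, add_offset = 0.01, 273.15
packed = pack_int16(tmin, scale_factor, add_offset)

print("raw bytes:     ", len(raw.tobytes()))
print("deflated bytes:", len(deflated))
print("packed bytes:  ", len(packed.tobytes()))

# Reading back requires decompressing or unpacking first.
restored = unpack_int16(packed, scale_factor, add_offset)
max_error = max(abs(a - b) for a, b in zip(tmin, restored))
print("max packing error:", max_error)
```

With xarray, the corresponding per-variable settings can be inspected through the variable's `encoding` dictionary (for example `ds["TMIN"].encoding`) and set explicitly when writing through the `encoding` argument of `to_netcdf`, using keys such as `zlib`, `complevel`, `dtype`, `scale_factor`, and `add_offset`. Comparing the encoding of the original and modified files is a quick way to check whether this explains the timing difference.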