
R: Size of a file. What is the difference between file.info()$size and object.size()?

I want to know the size of a file in R. Should I use
file.info(pathtodata)$size or object.size(pathtodata)?
(Or is there another solution?)
What is the difference between them?

Thank you!

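The two functions measure different things: file.info(path)$size reports the number of bytes the file occupies on disk, while object.size() estimates the memory used by an R object. Note that object.size(pathtodata) measures only the character string holding the path, not the file's contents. Once the data has been read into R, object.size() is often larger than the file's size on disk, because R objects carry metadata that occupies additional memory (see Hadley's article here). On top of this, different object classes have different memory footprints: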

write.csv(
  matrix(1:1000),
  file="~/tmp/foo.csv",
  row.names=FALSE)
##
df <- read.csv(
  "~/tmp/foo.csv",
  stringsAsFactors=FALSE)
mat <- as.matrix(df)
##
R> file.info("~/tmp/foo.csv")$size
#[1] 3898
R> object.size(df)
#4672 bytes
R> object.size(mat)
#4464 bytes
R> system("stat ~/tmp/foo.csv")
#  File: ‘/home/nr07/tmp/foo.csv’
#  Size: 3898       Blocks: 8          IO Block: 4096   regular file

In the above example, the data.frame occupies more memory than the matrix, even though they were constructed from the same underlying data, and both occupy more space than the file itself does on disk.
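
To make the distinction concrete, here is a minimal sketch that compares the on-disk size with the in-memory sizes. It assumes a throwaway file created with tempfile() (not the ~/tmp/foo.csv path used above), and the variable names are hypothetical. It also uses file.size(), a shortcut for file.info()$size available in R 3.2.0 and later.

```r
# Hypothetical temporary file used only for this illustration
path <- tempfile(fileext = ".csv")
write.csv(data.frame(V1 = 1:1000), file = path, row.names = FALSE)

# Bytes the file occupies on disk (two equivalent calls)
disk_bytes  <- file.info(path)$size
disk_bytes2 <- file.size(path)   # shortcut for file.info(path)$size

# Memory used by the character string holding the path, NOT by the file
path_bytes <- object.size(path)

# Memory used by the data once it has been read into R
df <- read.csv(path, stringsAsFactors = FALSE)
data_bytes <- object.size(df)

print(disk_bytes)
print(path_bytes)
print(data_bytes)
```

So to answer "how big is this file?", file.info(path)$size (or file.size(path)) is the call to use. object.size() is only meaningful for an object that is already loaded in the R session.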
