Suppress large output to R console
How can I make R check whether an object is too large to print in the console? "Too large" here means larger than a user-defined value.
Example: You have a list f_data with two elements, f_data$data (a 100MB data.frame) and f_data$info (for instance, a vector). Assume you want to inspect the first few entries of the f_data$data data.frame, but you make a mistake and type head(f_data) instead of head(f_data$data). R will try to print the whole content of f_data to the console (which would take forever). Is there an option I can set to suppress the output of objects that are larger than, say, 1MB?
Edit: Thank you guys for your help. After setting the max.print option I realized that it does give the desired output. BUT the problem that the output takes very long to show up still persists. Here is a proper example:
df_nrow=100000
df_ncol=100
#create list with first element being a large data.frame
#second element is a short vector
test_list=list(df=data.frame(matrix(rnorm(df_nrow*df_ncol),nrow=df_nrow,ncol=df_ncol)),
vec=1:110)
#only print the first 100 elements of an object
options(max.print=100)
#head correctly displays the first row of the data.frame
#BUT for some reason the output takes really long to show up in the console (~30sec)
head(test_list)
#let's try to see how long exactly
system.time(head(test_list))
# user system elapsed
# 0 0 0
#well, obviously system.time is not the proper tool to measure this
#the same problem if I just print the object to the console without using head
test_list$df
I assume that R performs some sort of analysis on the object being printed and that this is what takes so long.
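A note on the timing above: system.time(head(test_list)) most likely reports zero because a value computed inside system.time() is not auto-printed, so only head() itself is timed. A minimal sketch for timing the printing step, assuming a smaller data.frame than in the example so that it runs quickly (n_row, n_col, t_head and t_print are names made up for this illustration):

```r
# Smaller than the example above so the sketch finishes quickly.
n_row <- 2000
n_col <- 20
df <- data.frame(matrix(rnorm(n_row * n_col), nrow = n_row, ncol = n_col))

# Remember the old option value so it can be restored afterwards.
old_opts <- options(max.print = 100)

# This only times head(); nothing is printed inside system.time().
t_head <- system.time(head(df))

# Calling print() explicitly includes the formatting and printing work.
# capture.output() collects the printed lines instead of flooding the console.
t_print <- system.time(out <- capture.output(print(df)))

options(old_opts)

t_head
t_print
```

Comparing t_print for the data.frame with the same measurement on as.matrix(df) should show whether the formatting step is what differs between the two classes on your R version.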
Edit 2: As per my comment below, I checked whether the problem persists if I use a matrix instead of a data.frame.
#create list with first element being a large MATRIX
test_list=list(mat=matrix(rnorm(df_nrow*df_ncol),nrow=df_nrow,ncol=df_ncol),vec=1:110)
#no problem
head(test_list)
#no problem
test_list$mat
Could it be that console output is simply not implemented very efficiently for data.frame objects?
I think there is no such option, but you can check the size of an object with object.size and print it only if it is below a threshold (measured in bytes), for example:
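print.small.objects <- function(x, threshold = 1e06, ...)
{
  # object.size() returns an estimate of the memory used by x, in bytes
  if (object.size(x) < threshold) {
    print(x, ...)
  } else {
    # too large: report the size instead of printing the content
    cat("object too big to print\n")
    print(object.size(x))
  }
}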
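As a possible variation on the function above, here is a sketch that falls back to a compact str() overview when the object is too large, so you still see what the object contains. The name safe_print and the str() settings are assumptions for illustration, not part of base R or of the answer above.

```r
# safe_print() is a hypothetical helper name, not a base R function.
safe_print <- function(x, threshold = 1e6, ...) {
  size <- object.size(x)
  if (size < threshold) {
    print(x, ...)
  } else {
    cat("Object not printed:", format(size, units = "auto"),
        "exceeds the threshold\n")
    # str() shows only the top-level structure instead of the full content
    str(x, max.level = 1, list.len = 10, vec.len = 3)
  }
  invisible(x)
}

# A list with a large data.frame (about 1.6 MB) and a short vector
big <- list(data = data.frame(matrix(rnorm(2e5), ncol = 20)), info = 1:10)

safe_print(big)       # prints the size message and a short str() overview
safe_print(big$info)  # small enough, so it is printed normally
```

Note that this has to be called explicitly; typing the object name at the prompt still goes through the regular print method.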
Here's an example whose threshold you could raise to 100MB. It prints only the first 6 rows and 5 columns if the object's size is above 8e5 bytes. You could also turn this into a function and place it in your .Rprofile.
> lst <- list(data.frame(replicate(100, rnorm(1000))), 1:10)
> sapply(lst, object.size)
# [1] 810968 88
> lapply(lst, function(x){
if(object.size(x) > 8e5) head(x)[1:5] else x
})
#[[1]]
# X1 X2 X3 X4 X5
#1 0.3398235 -1.7290077 -0.35367971 0.09874918 -0.8562069
#2 0.2318548 -0.3415523 -0.38346083 -0.08333569 -1.1091982
#3 0.0714407 -1.4561768 0.50131914 -0.54899188 0.1652095
#4 -0.5170228 1.7343073 -0.05602883 0.87855313 0.4025590
#5 0.6962212 -0.3179930 0.28016057 1.05414456 -0.5172885
#6 0.9471200 1.4424843 -1.46323827 -0.78004192 -1.3611820
#
#[[2]]
# [1] 1 2 3 4 5 6 7 8 9 10
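Turning the lapply call above into a reusable function might look like the following sketch. The name preview_large and its arguments are made up for illustration; the defaults mirror the 8e5 byte threshold and the 6 rows by 5 columns used above.

```r
# preview_large() is a hypothetical helper; it could be placed in .Rprofile.
preview_large <- function(x, threshold = 8e5, n_rows = 6, n_cols = 5) {
  # Small objects are returned unchanged
  if (object.size(x) <= threshold) return(x)

  # Large rectangular objects: keep only the top-left corner
  if (is.data.frame(x) || is.matrix(x)) {
    rows <- seq_len(min(n_rows, nrow(x)))
    cols <- seq_len(min(n_cols, ncol(x)))
    return(x[rows, cols, drop = FALSE])
  }

  # Anything else: first n_rows elements (these may still be large themselves)
  head(x, n_rows)
}

lst <- list(data.frame(replicate(100, rnorm(1000))), 1:10)
small <- lapply(lst, preview_large)
dim(small[[1]])
small[[2]]
```

Applying the function per element with lapply, as in the answer, keeps the short vector intact while trimming only the large data.frame.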