Adding two columns in a tibble and saving the sum to a third column turns the third column into a data frame

I am generating a report. When I tried to write a tibble with the xlsx package's write.xlsx, it gave an error (even after I wrapped the tibble in as.data.frame() inside write.xlsx). On inspecting the tibble, I realized that when I added several columns and stored the result in another column of the tibble, that total column had become a data frame.

Example:

> marks <- tibble(math = c(90,90,85,90),
+                 physics = c(90,85,95,80),
+                 Total = c(rep(NA,4)))
> marks
# A tibble: 4 x 3
   math physics Total
  <dbl>   <dbl> <lgl>
1    90      90 NA   
2    90      85 NA   
3    85      95 NA   
4    90      80 NA   
> class(marks)
[1] "tbl_df"     "tbl"        "data.frame"
> str(marks)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   4 obs. of  3 variables:
 $ math   : num  90 90 85 90
 $ physics: num  90 85 95 80
 $ Total  : logi  NA NA NA NA
> marks$Total <- marks[,1] + marks[,2]
> str(marks)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   4 obs. of  3 variables:
 $ math   : num  90 90 85 90
 $ physics: num  90 85 95 80
 $ Total  :'data.frame':    4 obs. of  1 variable:
  ..$ math: num  180 175 180 170
> 

As shown above, I expected R's vectorized operations to work here, but the "Total" column became a data frame after I summed the two columns and stored the result in it.

Could someone explain why this happens, and how to perform this operation correctly?

Edit: OK, it seems that because a tibble doesn't drop dimensions, this was not the same as adding two vectors.

I think this happens because, by default, tibbles don't drop the second dimension when you subset them with [], whereas data frames do. Compare:

> marks[, 1]
# A tibble: 4 x 1
   math
  <dbl>
1    90
2    90
3    85
4    90
> marks_df = as.data.frame(marks)
> marks_df[ , 1]
[1] 90 90 85 90

So marks[, 1] + marks[, 2] adds a one-column tibble to a one-column tibble, and the result is itself a one-column data frame rather than a numeric vector, as the str() output in the question shows.
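
A quick way to confirm this is to inspect the sum directly before assigning it. The following is a minimal sketch that rebuilds the two numeric columns from the question; the variable name `total` is introduced here only for illustration.

```r
library(tibble)

# Same data as in the question, without the placeholder Total column
marks <- tibble(math = c(90, 90, 85, 90),
                physics = c(90, 85, 95, 80))

# Single-bracket subsetting keeps the tabular shape, so the sum is tabular too
total <- marks[, 1] + marks[, 2]
is.data.frame(total)   # TRUE
names(total)           # "math" (the name comes from the left operand)

# Double-bracket subsetting returns plain vectors, so the sum is a vector
is.data.frame(marks[[1]] + marks[[2]])   # FALSE
```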

To avoid this, you can either drop the second dimension explicitly or just use the column names:

marks$Total <- marks[,1, drop = TRUE] + marks[, 2, drop = TRUE]
marks$Total <- marks$math + marks$physics
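
As a further option, `[[` extracts a single column as a plain vector for tibbles and data frames alike. The sketch below uses it and then checks that Total is an ordinary numeric column. The write.xlsx call is left commented out because it needs the xlsx package and a Java runtime, and the file name "marks.xlsx" is only a placeholder.

```r
library(tibble)

marks <- tibble(math = c(90, 90, 85, 90),
                physics = c(90, 85, 95, 80))

# `[[` returns the column itself, not a one-column tibble
marks$Total <- marks[["math"]] + marks[["physics"]]

is.numeric(marks$Total)   # TRUE
marks$Total               # 180 175 180 170

# With no nested data frame column left, conversion gives a flat data frame
marks_df <- as.data.frame(marks)

# Assumed usage; uncomment if the xlsx package is installed:
# xlsx::write.xlsx(marks_df, "marks.xlsx", row.names = FALSE)
```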
