R: losing column names when adding rows to an empty data frame

I am just starting out with R and have run into some strange behaviour: when I insert the first row into an empty data frame, the original column names are lost.

Example:

a<-data.frame(one = numeric(0), two = numeric(0))
a
#[1] one two
#<0 rows> (or 0-length row.names)
names(a)
#[1] "one" "two"
a<-rbind(a, c(5,6))
a
#  X5 X6
#1  5  6
names(a)
#[1] "X5" "X6"

The rbind help page specifies that:

For 'cbind' ('rbind'), vectors of zero length (including 'NULL') are ignored unless the result would have zero rows (columns), for S compatibility. (Zero-extent matrices do not occur in S3 and are not ignored in R.)

So, in fact, a is ignored in your rbind call. It is not totally ignored, it seems: because a is a data frame, rbind dispatches to rbind.data.frame:

rbind.data.frame(c(5,6))
#  X5 X6
#1  5  6

One way to insert the row could be:

a[nrow(a)+1,] <- c(5,6)
a
#  one two
#1   5   6

But there may be a better way to do it, depending on your code.

I was close to giving up on this issue.

1) Create the data frame with stringsAsFactors set to FALSE, or you run straight into the next issue: character columns become factors (this was the default before R 4.0.0).

2) Don't use rbind; I have no idea why it messes up the column names. Simply do it this way:

df[nrow(df)+1,] <- c("d","gsgsgd",4)

df <- data.frame(a = character(0), b=character(0), c=numeric(0))

df[nrow(df)+1,] <- c("d","gsgsgd",4)

#Warning messages:
#1: In `[<-.factor`(`*tmp*`, iseq, value = "d") :
#  invalid factor level, NAs generated
#2: In `[<-.factor`(`*tmp*`, iseq, value = "gsgsgd") :
#  invalid factor level, NAs generated

df <- data.frame(a = character(0), b=character(0), c=numeric(0), stringsAsFactors=FALSE)

df[nrow(df)+1,] <- c("d","gsgsgd",4)

df
#  a      b c
#1 d gsgsgd 4
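One caveat with the assignment above: c("d","gsgsgd",4) is a character vector, so the 4 is stored as the string "4" and column c is converted to character. A minimal sketch, assuming the goal is to keep c numeric, is to assign a list instead (the names df_vec and df_list are made up for this illustration):

```r
# Two identical empty data frames; the names are hypothetical.
df_vec  <- data.frame(a = character(0), b = character(0), c = numeric(0),
                      stringsAsFactors = FALSE)
df_list <- df_vec

# c() coerces everything to character, so column c ends up as character.
df_vec[nrow(df_vec) + 1, ] <- c("d", "gsgsgd", 4)

# A list keeps each element's own type, so column c stays numeric.
df_list[nrow(df_list) + 1, ] <- list("d", "gsgsgd", 4)

class(df_vec$c)
class(df_list$c)
```

Whether this matters depends on what the column is used for later; it only shows up once you try arithmetic on it.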

A workaround would be:

a <- rbind(a, data.frame(one = 5, two = 6))

?rbind states that merging data frames relies on matching names:

It then takes the classes of the columns from the first data frame, and matches columns by name (rather than by position)
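As a small illustration of the name matching quoted above, the following sketch binds one-row data frames onto the empty one from the question. The second row deliberately lists its columns in the opposite order; the values 7 and 8 are made up for the example.

```r
# Empty data frame with named columns, as in the question.
a <- data.frame(one = numeric(0), two = numeric(0))

# Binding a one-row data frame keeps the names.
a <- rbind(a, data.frame(one = 5, two = 6))

# Columns are matched by name, so a different column order still lines up.
a <- rbind(a, data.frame(two = 8, one = 7))

names(a)
a$one
a$two
```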

FWIW, an alternative design might have your functions build vectors for the two columns, instead of rbinding to a data frame:

ones <- c()
twos <- c()

Modify the vectors in your functions:

ones <- append(ones, 5)
twos <- append(twos, 6)

Repeat as needed, then create your data.frame in one go:

a <- data.frame(one=ones, two=twos)
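Put together, the vector-building approach might look like the sketch below. The loop and the values it appends are hypothetical stand-ins for whatever your functions actually compute.

```r
# Start with empty vectors, one per column.
ones <- c()
twos <- c()

# Hypothetical loop standing in for the functions that produce values.
for (k in 1:3) {
  ones <- append(ones, k)
  twos <- append(twos, k * 10)
}

# Build the data frame once, after all values are collected.
a <- data.frame(one = ones, two = twos)
a
```

Because the data frame is created only once, the column names are set explicitly and rbind never gets a chance to rename them.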

One way to make this work generically, with the least re-typing of column names, is the following. This method doesn't require seeding the data frame with a dummy NA or 0 row.

rs <- data.frame(i=numeric(), square=numeric(), cube=numeric())
for (i in 1:4) {
    calc <- c(i, i^2, i^3)
    # append calc to rs
    names(calc) <- names(rs)
    rs <- rbind(rs, as.list(calc))
}

rs will have the correct names:

> rs
    i square cube
1   1      1    1
2   2      4    8
3   3      9   27
4   4     16   64
> 

Another, cleaner way to do this is to use data.table:

> library(data.table)
> df <- data.frame(a=numeric(0), b=numeric(0))
> rbind(df, list(1,2)) # column names are messed up
  X1 X2
1  1  2

> df <- data.table(a=numeric(0), b=numeric(0))
> rbind(df, list(1,2)) # column names are preserved
   a b
1: 1 2

Notice that a data.table is also a data.frame.

> class(df)
"data.table" "data.frame"

You can do this:

Give the initial data frame one row:

 df = data.frame(matrix(nrow=1, ncol=length(newrow)))

Add your new row and take out the NAs:

newdf = na.omit(rbind(newrow, df))

But watch out that your newrow does not contain any NAs, or it will be removed too.

Cheers Agus

I use the following solution to add a row to an empty data frame:

d_dataset <- 
  data.frame(
    variable = character(),
    before = numeric(),
    after = numeric(),
    stringsAsFactors = FALSE)

d_dataset <- 
  rbind(
    d_dataset,
      data.frame(
        variable = "test",
        before = 9,
        after = 12,
        stringsAsFactors = FALSE))  

print(d_dataset)

  variable before after
1     test      9    12

HTH.

Kind regards

Georg

Researching this venerable R annoyance brought me to this page. I wanted to add a bit more explanation to Georg's excellent answer (https://stackoverflow.com/a/41609844/2757825), which not only solves the problem raised by the OP (losing field names) but also prevents the unwanted conversion of all fields to factors. For me, those two problems go together. I wanted a solution in base R that doesn't involve writing extra code but preserves the two distinct operations: define the data frame, then append the row(s), which is what Georg's answer provides.

The first two examples below illustrate the problems and the third and fourth show Georg's solution.

Example 1: Append the new row as vector with rbind

Instead of constructing the data.frame with numeric(0), I use as.numeric(0).

a<-data.frame(one=as.numeric(0), two=as.numeric(0))

This creates an extra initial row:

a
#    one two
#1   0   0

Bind the additional rows:

a<-rbind(a,c(5,6))
a
#    one two
#1   0   0
#2   5   6

Then use negative indexing to remove the first (bogus) row:

a<-a[-1,]
a

#    one two
#2   5   6

Note: it messes up the index (far left). I haven't figured out how to prevent that (anyone else?), but most of the time it probably doesn't matter.
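On the index question in the note above: one common base R way to renumber the rows is to assign NULL to the row names, which resets them to 1, 2, and so on. A short sketch that repeats the steps from this answer and then resets the index:

```r
# Same steps as above: dummy first row, bind, then drop the dummy.
a <- data.frame(one = as.numeric(0), two = as.numeric(0))
a <- rbind(a, c(5, 6))
a <- a[-1, ]

# Reset the row names so they run from 1 again.
rownames(a) <- NULL
a
```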
