Bind columns with different numbers of rows

I want to write a loop that takes a list (a column of another data frame) and adds it to the current data frame as a new column. The columns do not have equal lengths, so I want the unmatched rows to be filled with NA.

seq_actions=as.data.frame(x = NA)
for(i in 1:20){
  temp_seq=another_df$c1[some conditions]  
  seq_actions=cbind(temp_seq,seq_actions)
}

To simplify, let's say I have:

df
1  3
3  4
2  2

After adding the list 5, 6 as a new column to df, I want:

 df
    1  3  5
    3  4  6
    2  2  NA

If the next list to add is 7 7 7 8, df should become:

df
   1  3  5  7
   3  4  6  7
   2  2  NA 7
   NA NA NA 8

How can I do it?

Here's one way. When called with all = TRUE, the merge function adds NA values whenever you combine data frames and no match is found (e.g., when one data frame has fewer rows than the other).

If you assume that rows are matched across the data frames by row number, write the row number into a column of each data frame and merge on that column. Merge will then add the NA values you want and handle the fact that the data frames have different numbers of rows.

#test data frame 1
a <- c(1, 3, 2)
b <- c(3, 4, 2)
dat <- as.data.frame(cbind(a, b))

#test data frame 2 (this one has fewer rows than the first data frame)
c <- c(5, 6)
dat.new <- as.data.frame(c)

#add column to each data frame with row number
#(integer index, so the merged rows sort numerically rather than as text)
dat$number <- seq_len(nrow(dat))
dat.new$number <- seq_len(nrow(dat.new))

#merge data frames
#"all = TRUE" will mean that NA values will be added whenever there is no match 
finaldata <- merge(dat, dat.new, by = "number", all = TRUE)
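To see what this merge produces and how to tidy up afterwards, here is a minimal sketch of the same idea. It assumes an integer row index as the key, and the names `left`, `right` and `row_id` are illustrative only; they do not come from the answer above.

```r
# Two data frames with different numbers of rows
left  <- data.frame(a = c(1, 3, 2), b = c(3, 4, 2))
right <- data.frame(c = c(5, 6))

# Helper key: integer row position (hypothetical column name `row_id`)
left$row_id  <- seq_len(nrow(left))
right$row_id <- seq_len(nrow(right))

# Full outer join on the row position; unmatched rows are filled with NA
merged <- merge(left, right, by = "row_id", all = TRUE)

# Drop the helper key once the rows are aligned
merged$row_id <- NULL
merged
```

The helper column is only needed during the merge, so removing it afterwards leaves just the original columns plus the padded one.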

If you know the maximum possible size of df and the total number of columns you want to append, you can create df in advance filled with NA and then fill in each column up to its own length. Every cell past the end of a shorter column stays NA.

For example:

max_col_num <- 20 
max_col_size <- 10 #This could be the number of rows in the largest dataframe you have

df <- as.data.frame(matrix(ncol = max_col_num, nrow = max_col_size))

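for(i in seq_len(max_col_num)){
      #"some conditions" is a placeholder for your own filter
      temp_seq=another_df$c1[some conditions] 
      #fill only the first length(temp_seq) rows of column i
      df[seq_along(temp_seq), i] <- temp_seq
}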

This only works if you know the maximum possible number of rows and columns in advance.
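If all the vectors are available before the data frame is built, the dimensions do not have to be guessed. The sketch below is one possible variant under that assumption: it collects the vectors in a list (the name `cols` and its contents are made up for illustration), pads each one with NA up to the longest length, and builds the data frame in one step.

```r
# Hypothetical input: every column collected in a named list
cols <- list(a = c(1, 3, 2), b = c(3, 4, 2), c = c(5, 6), d = c(7, 7, 7, 8))

# The longest vector determines the number of rows
n_rows <- max(lengths(cols))

# Pad each vector with NA; assigning a larger length extends it with NA
padded <- lapply(cols, function(v) {
  length(v) <- n_rows
  v
})

# All elements now have equal length, so they form a data frame directly
df_padded <- as.data.frame(padded)
df_padded
```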

I think the best approach is to write a custom function based on the nrow of the data frame and the length of the vector/list.

One such function can be written as:

#Function to add vector as column
addToDF <- function(df, v){
 nRow <- nrow(df)
 lngth <- length(v)
 if(nRow > lngth){
   length(v) <- nRow
 }else if(nRow < lngth){
   df[(nRow+1):lngth, ] <- NA
 }
 cbind(df,v)
}

Let's test the above function with the data.frame provided by the OP.

df <- data.frame(A= c(1,3,2), B = c(3, 4, 2))

v <- c(5,6)

w <-c(7,7,8,9)

addToDF(df, v)
#   A B  v
# 1 1 3  5
# 2 3 4  6
# 3 2 2 NA

addToDF(df, w)
#    A  B v
# 1  1  3 7
# 2  3  4 7
# 3  2  2 8
# 4 NA NA 9

Following MKR's answer, if you want to give the newly added column a specific name, you can try:


addToDF <- function(df, v, col_name){
  nRow <- nrow(df)
  lngth <- length(v)
  if(nRow > lngth){
    length(v) <- nRow
  }else if(nRow < lngth){
    df[(nRow+1):lngth, ] <- NA
  }
  df_new<-cbind(df,v)
  colnames(df_new)[ncol(df_new)]=col_name
  return(df_new)
}

where col_name is the name of the added column.
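Since the question asks about adding columns in a loop, here is a rough sketch of how a named helper like this could be applied repeatedly. The function body is a compact restatement written for this example, not the exact code above, and the list `new_cols` with its column names is a made-up input.

```r
# Compact variant of the helper so this block runs on its own
addToDF <- function(df, v, col_name) {
  n <- max(nrow(df), length(v))
  if (nrow(df) < n) {
    df[(nrow(df) + 1):n, ] <- NA  # pad the data frame with NA rows
  }
  length(v) <- n                  # pad the vector with NA if it is shorter
  df[[col_name]] <- v
  df
}

df <- data.frame(A = c(1, 3, 2), B = c(3, 4, 2))

# Hypothetical vectors to append, keyed by the desired column name
new_cols <- list(C = c(5, 6), D = c(7, 7, 7, 8))

# Append one column per iteration, growing df as needed
for (nm in names(new_cols)) {
  df <- addToDF(df, new_cols[[nm]], nm)
}
df
```

Reassigning `df` on every pass keeps the loop simple; for a large number of columns, padding everything first and building the data frame once is likely to be cheaper.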
