
Insert rows based on difference between value from col A row N and col B row N+1

I have sample data like this (I am using R):

A   B   C
1   2   Background
3   19  Background
26  41  person
43  69  person
83  97  Background
107 129 Background
132 179 Background
189 235 Background
243 258 Background
261 279 person

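I want to add a row wherever the difference between col A in row N+1 and col B in row N is greater than 1. The new row should cover the missing range, and its col C should get a label (e.g. "other"). So the data would look like this: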

A   B   C
1   2   Background
3   19  Background
20  25  other
26  41  person
43  69  person
70  82  other
83  97  Background
98  106 other
107 129 Background
130 131 other
132 179 Background
180 188 other
189 235 Background
236 242 other
243 258 Background
259 260 other
261 279 person

Thanks!

Here is one way using base R, assuming the A value in row 4 is 42 (rather than 43).

# Find the row indices N where A in row N + 1 minus B in row N
# is not equal to 1, i.e. where row N is followed by a gap.
inds <- which(tail(df$A, -1) - head(df$B, -1) != 1)
# Build the rows to insert: each one starts right after B in row N
# and ends right before A in row N + 1.
include_df <- data.frame(A = df$B[inds] + 1, B = df$A[inds + 1] - 1, C = 'other',
                         stringsAsFactors = FALSE)
# Duplicate the rows at inds to make room for the new rows
df <- df[sort(c(seq_len(nrow(df)), inds)), ]
# Overwrite each duplicate with its new row
df[inds + seq_along(inds), ] <- include_df
# Reset the row names
row.names(df) <- NULL

df
#     A   B          C
#1    1   2 Background
#2    3  19 Background
#3   20  25      other
#4   26  41     person
#5   42  69     person
#6   70  82      other
#7   83  97 Background
#8   98 106      other
#9  107 129 Background
#10 130 131      other
#11 132 179 Background
#12 180 188      other
#13 189 235 Background
#14 236 242      other
#15 243 258 Background
#16 259 260      other
#17 261 279     person

Data

df <- structure(list(A = c(1, 3, 26, 42, 83, 107, 132, 189, 243, 261
), B = c(2L, 19L, 41L, 69L, 97L, 129L, 179L, 235L, 258L, 279L
), C = c("Background", "Background", "person", "person", "Background", 
"Background", "Background", "Background", "Background", "person"
)), row.names = c(NA, -10L), class = "data.frame")
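
The index arithmetic in the base R answer above (duplicate the rows at `inds`, then overwrite positions `inds + seq_along(inds)`) can be easier to follow on a tiny example. The following is only an illustrative sketch: the `toy` data frame and the names `expanded`, `slots` and `filler` are made up for this walk-through and are not part of the answer.

```r
# Hypothetical toy data: gaps after row 1 (values 4..5) and after row 3 (13..14)
toy <- data.frame(A = c(1, 6, 10, 15),
                  B = c(3, 9, 12, 20),
                  C = c("x", "y", "x", "y"),
                  stringsAsFactors = FALSE)

# Rows whose successor does not start immediately after they end
inds <- which(tail(toy$A, -1) - head(toy$B, -1) != 1)

# Duplicating those rows gives the row order 1 1 2 3 3 4
expanded <- sort(c(seq_len(nrow(toy)), inds))

# The k-th duplicate sits one slot after its original row, shifted by the
# k - 1 duplicates inserted before it
slots <- inds + seq_along(inds)

# Rows that fill the gaps
filler <- data.frame(A = toy$B[inds] + 1, B = toy$A[inds + 1] - 1, C = "other",
                     stringsAsFactors = FALSE)

out <- toy[expanded, ]
out[slots, ] <- filler   # overwrite the duplicates with the filler rows
row.names(out) <- NULL
out
```

Here `inds` is `1 3` and `slots` is `2 5`, so the second copy of row 1 and the second copy of row 3 are the ones replaced.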

A data.table option, using the same edited data as Ronak:

ix <- DT[shift(A, -1L) - B > 1L, which=TRUE]
rbindlist(list(DT,
    data.table(A=DT$B[ix]+1L, B=DT$A[ix+1L]-1L, C="other")))[order(A)]

Output:

      A   B          C
 1:   1   2 Background
 2:   3  19 Background
 3:  20  25      other
 4:  26  41     person
 5:  42  69     person
 6:  70  82      other
 7:  83  97 Background
 8:  98 106      other
 9: 107 129 Background
10: 130 131      other
11: 132 179 Background
12: 180 188      other
13: 189 235 Background
14: 236 242      other
15: 243 258 Background
16: 259 260      other
17: 261 279     person

Data:

library(data.table)
DT <- fread("A   B   C
1   2   Background
3   19  Background
26  41  person
42  69  person
83  97  Background
107 129 Background
132 179 Background
189 235 Background
243 258 Background
261 279 person")
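
Both answers build the filler rows the same way (`B + 1` of row N up to `A - 1` of row N + 1). As a rough sketch, that logic could be wrapped in a small base R function. The name `fill_gaps`, its `label` argument and the early return are assumptions added here, not part of either answer. The sketch uses `> 1` as stated in the question and appends then sorts (as the data.table answer does) instead of inserting in place. The early return matters because `data.frame()` refuses to combine zero-length columns with a length-one label, so the construction would fail on input without gaps.

```r
# Hypothetical helper: fill gaps between B[n] and A[n + 1] with labelled rows.
# Assumes columns named A, B, C and rows already sorted by A.
fill_gaps <- function(df, label = "other") {
  inds <- which(tail(df$A, -1) - head(df$B, -1) > 1)
  if (length(inds) == 0) return(df)          # nothing to fill
  filler <- data.frame(A = df$B[inds] + 1,
                       B = df$A[inds + 1] - 1,
                       C = label,
                       stringsAsFactors = FALSE)
  out <- rbind(df, filler)                   # append the filler rows ...
  out <- out[order(out$A), ]                 # ... and restore the order by A
  row.names(out) <- NULL
  out
}

# The question's original data, with A = 43 in row 4
q <- data.frame(
  A = c(1, 3, 26, 43, 83, 107, 132, 189, 243, 261),
  B = c(2, 19, 41, 69, 97, 129, 179, 235, 258, 279),
  C = c("Background", "Background", "person", "person", "Background",
        "Background", "Background", "Background", "Background", "person"),
  stringsAsFactors = FALSE)

res <- fill_gaps(q)
nrow(res)                                        # 18
all(tail(res$A, -1) - head(res$B, -1) == 1)      # TRUE: no gaps remain
```

With the unedited value 43, the result also contains a single-value row `42 42 other` between the two person rows, which the question's expected output does not show. That is presumably why the first answer assumes 42.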
