
Transforming data in specific rows and columns inside a data frame in R

I have a data frame

Name   M0  M1 M2 M3 M4 M5  
ABC    4   4  3  4  33 22
XYZ    3   5  6  22  1 33
RTF    3   7  33 2   4  0
hdj    32  3  9  3   1  3
 .
 .
Tim    4   4   0  3  3  1

I would like to replace values with NA (or NULL) following this pattern, counting the header as the 1st row: the 2nd row keeps all of its column values, the 3rd row has its last column value set to NA, the 4th row has its last two column values set to NA, and so on.

Name   M0  M1 M2 M3 M4 M5  
ABC    4   4  3  4  33 22
XYZ    3   5  6  22  1 NA
RTF    3   7  33 2  NA NA
hdj    32  3  9  NA NA  NA
.
.
Tim    4   NA NA NA NA NA

This is my attempt

    # getting the maximum rows and cols
    rows<-nrow(df)
    cols<-ncol(df)

    for (i in 3:rows) {
        df[i,cols:cols-i-1]<-NULL
    }

Apologies for how basic this is, but it's just one of those days! It would be useful to know multiple ways of achieving this. Personally, I'm a fan of the dplyr package.
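A note on the loop in the attempt: in R, `:` binds more tightly than binary `-`, so `cols:cols-i-1` is evaluated as `(cols:cols) - i - 1`, which is a single column index rather than a range. Missing values in a data frame are also represented by `NA`, not `NULL`. The following is a minimal sketch of a repaired loop, assuming a small stand-in data frame with the column layout from the question (the four rows shown are only for illustration, and `first_na` is a name introduced here).

```r
# Stand-in data with the same layout as in the question (first four rows only).
df <- data.frame(
  Name = c("ABC", "XYZ", "RTF", "hdj"),
  M0 = c(4L, 3L, 3L, 32L), M1 = c(4L, 5L, 7L, 3L), M2 = c(3L, 6L, 33L, 9L),
  M3 = c(4L, 22L, 2L, 3L), M4 = c(33L, 1L, 4L, 1L), M5 = c(22L, 33L, 0L, 3L)
)

rows <- nrow(df)
cols <- ncol(df)

# Data row i should lose its last (i - 1) values, so start at row 2.
for (i in seq_len(rows)[-1]) {
  # Parenthesised arithmetic gives the first column to blank;
  # max(..., 2) keeps the loop from ever touching the Name column.
  first_na <- max(cols - i + 2, 2)
  df[i, first_na:cols] <- NA
}
df
```

Using `seq_len(rows)[-1]` instead of `2:rows` avoids a backwards sequence when the data frame has only one row.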

One option is to create a matrix of 1s with the same dimensions as the numeric columns of 'df1', set its lower triangular elements to NA, reverse each row ( rev ), multiply the result with the numeric columns, and assign the output back. Because any number multiplied by NA returns NA, the masked cells become NA.

 m1 <- matrix(1, nrow=nrow(df1), ncol=ncol(df1)-1)
 m1[lower.tri(m1)] <- NA
 df1[-1] <- df1[-1]*apply(m1, 1, rev)
 df1
 #  Name M0 M1 M2 M3 M4 M5
 #1  ABC  4  4  3  4 33 22
 #2  XYZ  3  5  6 22  1 NA
 #3  RTF  3  7 33  2 NA NA
 #4  hdj 32  3  9 NA NA NA
 #5  zdf 42  1 NA NA NA NA
 #6  Tim  4 NA NA NA NA NA
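Note that `apply(m1, 1, rev)` returns each reversed row as a column, so the multiplication above lines up because the example has as many rows as numeric columns (and the resulting pattern is symmetric). As a hedged alternative for data where those counts differ, the same staircase can be written as a logical mask: cell (row r, numeric column c) is blanked when r + c exceeds the number of numeric columns plus one. The `toy` data frame and the names `nums` and `mask` below are made up for illustration.

```r
# Toy data: 3 rows but 4 numeric columns (deliberately not square).
toy <- data.frame(Name = c("a", "b", "c"),
                  M0 = 1:3, M1 = 4:6, M2 = 7:9, M3 = 10:12)

nums <- toy[-1]                       # numeric columns only

# outer(..., "+") gives r + c for every cell; TRUE marks cells to blank.
mask <- outer(seq_len(nrow(nums)), seq_len(ncol(nums)), "+") > ncol(nums) + 1

nums[mask] <- NA                      # logical-matrix assignment on a data frame
toy[-1] <- nums
toy
# Row 1 is unchanged, row 2 loses M3, row 3 loses M2 and M3.
```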

Or we can use the shift function from data.table . With the type='lead' option, shifting a vector of 1s by 0, 1, 2, ... positions pads its end with NA elements; we rbind the resulting list elements into a matrix and multiply as in the earlier solution.

 library(data.table)
 df1[-1] <- df1[-1]*do.call(rbind,shift(rep(1, ncol(df1)-1), 
                            seq(ncol(df1)-1)-1, type='lead'))
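To make the intermediate matrix easier to picture, here is a small base R sketch that mimics a lead shift on a vector of 1s without loading data.table. `lead_by` is a hypothetical helper written for this illustration, not a data.table function, and `n_num` is an assumed number of numeric columns.

```r
# Hypothetical helper: move x forward by k positions, padding the end with NA.
lead_by <- function(x, k) {
  n <- length(x)
  c(x[seq_len(n - k) + k], rep(NA, k))
}

n_num <- 4                      # assumed number of numeric columns (and rows)
ones  <- rep(1, n_num)

# One shifted copy per row: row i is shifted by i - 1 positions.
mask <- do.call(rbind, lapply(seq_len(n_num) - 1, function(k) lead_by(ones, k)))
mask
#      [,1] [,2] [,3] [,4]
# [1,]    1    1    1    1
# [2,]    1    1    1   NA
# [3,]    1    1   NA   NA
# [4,]    1   NA   NA   NA
```

Multiplying the numeric columns by a matrix of this shape is what turns the trailing cells of each row into NA.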

data

 df1 <- structure(list(
   Name = c("ABC", "XYZ", "RTF", "hdj", "zdf", "Tim"),
   M0 = c(4L, 3L, 3L, 32L, 42L, 4L),
   M1 = c(4L, 5L, 7L, 3L, 1L, 4L),
   M2 = c(3L, 6L, 33L, 9L, 7L, 0L),
   M3 = c(4L, 22L, 2L, 3L, 8L, 3L),
   M4 = c(33L, 1L, 4L, 1L, 9L, 3L),
   M5 = c(22L, 33L, 0L, 3L, 5L, 1L)),
   .Names = c("Name", "M0", "M1", "M2", "M3", "M4", "M5"),
   class = "data.frame", row.names = c(NA, -6L))

I'm not sure I understood correctly, but I think you expect something like this (replace df with your own data frame):

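# stand-in data; replace with your own data frame
df <- matrix(1, ncol=6, nrow=7)

ile_kolumn <- ncol(df)      # "ile_kolumn" = number of columns
ktore <- ile_kolumn:3       # "ktore" = first column to blank in rows 2, 3, ...

# if there are more rows than entries in ktore, reuse the last entry
if(nrow(df)-1-length(ktore)>0){
    ktore <- c(ktore, rep(ktore[length(ktore)], nrow(df)-1-length(ktore)))
}

# row i gets NA from column ktore[i-1] through the last column
for(i in 2:nrow(df)){
    df[i, ile_kolumn:ktore[i-1]] <- NA
}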
