
R - deleting rows from data.frame

I am very new to R and programming and have a basic question (my first one on Stack Overflow :) ). I want to delete some rows from a data.frame and am using an if-statement for that. My code runs, but unfortunately it does not delete the correct rows; instead, I think it deletes every second row of my data frame.

"myDataVergleich" is the name of my data.frame. "myData$QUESTNNR" is the column that decides whether a row is supposed to stay in the data frame or not.

for(i in 1:nrow(myDataVergleich))
  {if(myData$QUESTNNR[i] != "t0_mathe" | myData$QUESTNNR[i] != "t0_bio" | myData$QUESTNNR[i] != "t0_allg2" |
     myData$QUESTNNR[i] != "t7_mathe_Version1" | myData$QUESTNNR[i] != "t7_bio_Version1") 
    {myDataVergleich <- myDataVergleich[-c(i),] }}

What am I doing wrong?

Welcome to Stack Overflow and to R. I think your intuition is correct, but there are some issues. First, you say your data is called 'myDataVergleich', but inside your loop you are accessing 'myData'. So you might need to change 'myData$QUESTNNR[i]' to 'myDataVergleich$QUESTNNR[i]' in the loop.
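Two more things in the loop probably contribute to the "every second row" symptom. A chain of `!=` tests joined with `|` is TRUE for every value, and deleting row `i` while looping shifts the remaining rows up by one. Here is a minimal base R sketch of both effects, assuming a made-up data frame `df` (the `id` column and the "other" values are invented for illustration):

```r
# Toy data; the values are invented for illustration only.
df <- data.frame(
  id = 1:6,
  QUESTNNR = c("t0_mathe", "other", "t0_bio", "other", "t0_allg2", "other"),
  stringsAsFactors = FALSE
)

# 1) No value can equal both strings at once, so at least one
#    of the != tests is always TRUE and the | chain is TRUE everywhere.
always_true <- df$QUESTNNR != "t0_mathe" | df$QUESTNNR != "t0_bio"
print(always_true)

# 2) Removing row i inside the loop moves the following rows up,
#    so the next iteration skips over one row.
looped <- df
for (i in 1:nrow(df)) {
  looped <- looped[-i, ]
}
print(looped$id)

# Vectorised alternative: build the logical index once, without a loop.
keep <- c("t0_mathe", "t0_bio", "t0_allg2")
kept <- df[df$QUESTNNR %in% keep, ]
print(kept$id)
```

With this toy data the loop leaves only every second row, whatever `QUESTNNR` contains, while the `%in%` version keeps exactly the rows whose value is in `keep`.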

A great thing about R is that people have already figured out solutions to many common problems, and subsetting a data frame by a condition is one of them. You should look into the tidyverse family of packages, especially dplyr in this case.

install.packages('dplyr')
install.packages('magrittr')

If you want to keep the rows with these strings, this code will work:

library(dplyr)
library(magrittr)
strings <- c(
  "t0_mathe", "t0_bio", "t0_allg2", "t7_mathe_Version1", "t7_bio_Version1"
)
filtered_data <- myDataVergleich %>%
  dplyr::filter(QUESTNNR %in% strings)

If you want to keep the rows that don't contain these strings, this code will work:

library(dplyr)
library(magrittr)
strings <- c(
  "t0_mathe", "t0_bio", "t0_allg2", "t7_mathe_Version1", "t7_bio_Version1"
)
filtered_data <- myDataVergleich %>%
  dplyr::filter(!QUESTNNR %in% strings)

Hope that helps.

I would have to know the error. QUESTNNR %in% strings returns TRUE or FALSE for each row, and adding the ! just returns the opposite, so that should work fine. You can detect part of a string with str_detect from the 'stringr' package.

library(dplyr)
library(stringr)
library(tibble)
library(magrittr)
df <- tibble(x = c('h', 'e', 'l', 'l', '0')) 
df %>% dplyr::filter(str_detect(x, 'l'))
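To check what the filter conditions evaluate to before using them inside `filter()`, you can run them on a plain vector. This is a small base R sketch with a made-up vector `x`. It uses `grepl`, the base R function for partial matches, in place of `str_detect`, so no extra packages are needed:

```r
# Made-up values for illustration.
x <- c("t0_mathe", "t7_bio_Version1", "something_else")
strings <- c("t0_mathe", "t0_bio")

# Exact matching: one TRUE/FALSE per element of x.
in_set <- x %in% strings

# %in% binds more tightly than !, so this negates the whole membership test.
not_in_set <- !x %in% strings

# Partial matching: TRUE where the literal substring "bio" occurs.
partial <- grepl("bio", x, fixed = TRUE)

print(in_set)
print(not_in_set)
print(partial)
```

Note that "t7_bio_Version1" is not matched by `%in%` here, because `%in%` compares whole strings; only the partial match picks it up.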
