R filtering a data frame creates empty rows

I have a CSV file imported into R, and I want to filter out the rows that don't include a certain letter in one of the columns. I've tried both subset and dplyr; both return the column names but zero rows. I know that the column contains the letter I'm looking for, so I don't understand why the result is empty. This is what I get when I call the head function on my data set:

head(dbbt)
   X.Focal_DB. X.Effect_size. X.Variance.            X.Study. X.BT.
1         165        -0.1931   0.0132000      'Agrawal_1998'   'B'
2          21        -1.4414   0.1938000      'Agrawal_1999'   'B'
3          19        -3.1642   0.2402559      'Agrawal_1999'   'B'
4          19        -1.0272   0.0731000 'Agrawal_1999-2000'   'B'

(The column names were imported with an "X." prefix and a trailing "." around them, and I can't figure out why; they don't contain any forbidden characters.)

When I try:

 dbbtjustb <- subset(dbbt, X.BT. == "B")

I get:

head(dbbtjustb)
[1] X.Focal_DB.    X.Effect_size. X.Variance.    X.Study.      
[5] X.BT.         
<0 rows> (or 0-length row.names)

When I tried:

dbbt %>%
    select(X.F_DietBreadth., X.Effect_size., X.variance., X.Bottom_up_top_down.) %>%
    filter(X.Bottom_up_top_down. == "B")

I got the same thing. Please help!

Edit: structure (this is a sample, not my original data set, because that was huge)

structure(list(X.Focal_DB. = c(31L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 8L, 1L, 1L, 1L, 1L, 2L, 6L, 126L, 22L, 126L, 27L), X.Effect_size. = c(-0.0951, 
0.4797, -0.1705, 0.713, -0.2661, -0.6614, -1.5941, -2.1892, -0.2133, 
-0.2183, -0.0275, -0.268, -5.0499, -3.2934, -0.9469, 0.6316, 
2.236, 0.1724, 1.6541, -1.5496), X.Variance. = c(0.1006223807, 
0.0468390134, 0.0124, 0.014674063, 0.1385, 0.15, 0.3866, 0.4706, 
0.1025, 0.3688, 0.1354, 0.1444, 0.1641758772, 0.0849100448, 0.0783, 
0.040866755, 0.1814043974, 0.0535, 0.1503, 0.0999), X.Study. = structure(c(1L, 
2L, 3L, 4L, 6L, 6L, 5L, 5L, 7L, 8L, 9L, 9L, 10L, 10L, 11L, 12L, 
13L, 14L, 15L, 16L), .Label = c("'Bergeson & Messina_1997'", 
"'Bergeson & Messina_1997- 1998'", "'Cronin & Abrahamson_1999'", 
"'Dechert & Ulber_2004'", "'Denno_et_al. 2000'", "'Denno & Roderick_1992'", 
"'Dorn_et_al. 2003'", "'Evans & England_1996'", "'Ferrenberg & Denno_2003'", 
"'Finch & Jones_1989'", "'Floate & Whitham_1994'", "'Formusoh_et_al. 1992'", 
"'Forrest_1971'", "'Fritz_1983'", "'Gange & Brown_1989'", "'Gianoli_2000'"
), class = "factor"), X.BT. = structure(c(2L, 3L, 2L, 2L, 2L, 
2L, 4L, 2L, 3L, 4L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("'ants'", 
"'B'", "'NA'", "'T'"), class = "factor")), .Names = c("X.Focal_DB.", 
"X.Effect_size.", "X.Variance.", "X.Study.", "X.BT."), class = "data.frame", row.names = c(NA, 
-20L))

The actual entry is 'B', with the single quotes stored as part of the value, which means you need to subset by "'B'".

> unique(df$X.BT.)
[1] 'B'    'NA'   'T'    'ants'
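
The same fix should apply to the base R subset() call from the question. Here is a minimal sketch on a small toy data frame; the values are made up for illustration, and only the column names follow the question:

```r
# Toy data frame (hypothetical values) mimicking the quoted factor column
df <- data.frame(
  X.Effect_size. = c(-0.0951, 0.4797, -0.1705),
  X.BT. = factor(c("'B'", "'NA'", "'B'"))
)

# The stored strings include the single-quote characters themselves
as.character(df$X.BT.)

# Comparing against "B" matches nothing; including the quotes matches
no_match    <- subset(df, X.BT. == "B")
with_quotes <- subset(df, X.BT. == "'B'")

nrow(no_match)     # 0 rows
nrow(with_quotes)  # 2 rows
```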

Using dplyr:

> filter(df, X.BT. == "'B'")
   X.Focal_DB. X.Effect_size. X.Variance.                   X.Study. X.BT.
1           31        -0.0951  0.10062238  'Bergeson & Messina_1997'   'B'
2            1        -0.1705  0.01240000 'Cronin & Abrahamson_1999'   'B'
3            1         0.7130  0.01467406     'Dechert & Ulber_2004'   'B'
4            1        -0.2661  0.13850000    'Denno & Roderick_1992'   'B'
5            1        -0.6614  0.15000000    'Denno & Roderick_1992'   'B'
6            1        -2.1892  0.47060000        'Denno_et_al. 2000'   'B'
7            1        -0.0275  0.13540000  'Ferrenberg & Denno_2003'   'B'
8            1        -0.2680  0.14440000  'Ferrenberg & Denno_2003'   'B'
9            1        -5.0499  0.16417588       'Finch & Jones_1989'   'B'
10           1        -3.2934  0.08491004       'Finch & Jones_1989'   'B'
11           6         0.6316  0.04086675     'Formusoh_et_al. 1992'   'B'
12         126         2.2360  0.18140440             'Forrest_1971'   'B'
13         126         1.6541  0.15030000       'Gange & Brown_1989'   'B'
14          27        -1.5496  0.09990000             'Gianoli_2000'   'B'
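
If you would rather compare against a plain "B", you can remove the quotes instead of matching them. The sketch below assumes the CSV wraps headers and text values in single quotes, which would also explain the X. and trailing . in the column names. The CSV content here is hypothetical. By default read.csv() treats only double quotes as quoting characters, so single quotes stay in the data unless you strip them afterwards or pass quote = "'" on import.

```r
# Hypothetical CSV content with single-quoted headers and values
csv_text <- "'Focal_DB','Effect_size','BT'
165,-0.1931,'B'
21,-1.4414,'T'
19,-3.1642,'B'"

# Default import: the single quotes are kept as data, and the header
# names are rewritten into syntactically valid names
raw <- read.csv(text = csv_text)
names(raw)

# Option 1: strip the quotes from the column after import
raw$X.BT. <- gsub("'", "", raw$X.BT., fixed = TRUE)
fixed_b <- subset(raw, X.BT. == "B")

# Option 2: declare the single quote as the quoting character on import
clean <- read.csv(text = csv_text, quote = "'")
names(clean)
clean_b <- subset(clean, BT == "B")
```

Option 2 fixes the column names and the values in one step, but it only helps if you can re-read the file.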
