How can I use num_range to select rows whose values in one particular column all share the same first 4 digits? (Preferably with dplyr/tidyverse)

Question

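My question is best split into two parts:

I am working with a dataset on forest product use across many countries. Each row represents one household from one of these countries (about 30 countries in total). Each country has a 4-digit code, but the dataset has no country-code column. You can infer which country a household comes from by using its household ID ("ghousecode"). ghousecode is a 7-digit code whose first 4 digits are the country code. For example, if Bolivia's country code is 3024, then a household in Bolivia might be 3024105 or 3024999...

I want code that selects all entries for a particular country. I am using the tidyverse, so I tried select() with num_range(), but it did not work. I get no error message, but I can tell from the output that it did not work. Here is my current code: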

    #forest_use_tibble is a tibble with observations on forest usage from many countries
    #I selected a subset of the original file's variables. 

    forest_use_simpler <- select(forest_use_tibble, ghousecode, year, product, income, amount, unit)

    #take Bolivia, whose country ID is 3024. This means that each ghousecode that begins with 3024 is from Bolivia.
    #but each ghousecode is 3024xxx with three other numbers after it.

    x = 3024
    Bolivia <- select(forest_use_simpler, num_range("x", 001:999), everything())

    #my goal: a new tibble/dataframe that has only the entries from Bolivia
    #there is no separate column for country ID, unfortunately.

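Any ideas?

Second part of the question: is there a way to apply num_range to just one column (i.e., one variable, in this case ghousecode)? The way I wrote it above strikes me as something that would search every variable in forest_use_simpler, so if the number 3024 appeared anywhere other than ghousecode, it could pull in households from another country.

Thanks!

(Note: I also tried typing 3024 directly in place of x, which did not work either. Thanks again for any help.)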

Answer 1

If ghousecode is always a 7-digit number, how about something like this?

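    library(tidyverse)

    # Two example households: one from another country, one from Bolivia (3024)
    df <-
      tibble(
        ghousecode = c(2039434, 3024105),
        year = c(2019, 2019)
      )

    df %>%
      # Dropping the last 3 digits leaves the 4-digit country code
      mutate(country_code = floor(ghousecode / 1000)) %>%
      # Keep only the rows for Bolivia
      filter(country_code == 3024)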
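
The same idea can probably be applied without creating an extra column, which also speaks to the second part of the question: filter() only looks at the column named in its condition. The sketch below assumes ghousecode is always a 7-digit number; the sample values and the helper name households_in_country are made up for illustration and are not from the original dataset.

```r
library(dplyr)

# Hypothetical sample data; the codes and incomes are invented for illustration.
forest_use_simpler <- tibble(
  ghousecode = c(2039434, 3024105, 3024999, 1113024),
  income     = c(3024, 150, 200, 90)
)

# Hypothetical helper: keep rows whose 7-digit ghousecode starts with `code`.
households_in_country <- function(data, code) {
  # %/% is integer division, so 3024105 %/% 1000 gives 3024.
  filter(data, ghousecode %/% 1000 == code)
}

bolivia <- households_in_country(forest_use_simpler, 3024)
bolivia
```

Only the two households whose ghousecode begins with 3024 are kept. The row with income 3024 and the row whose ghousecode merely ends in 3024 are dropped, because the condition never inspects any column other than ghousecode and compares only the leading four digits.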

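select() picks columns, whereas filter() picks rows. num_range() is a column-selection helper that matches column names such as x1, x2, x3, so it cannot be used to pick rows by the values stored in ghousecode.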
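
To make the difference concrete, here is a small sketch of what num_range() actually matches. The tables wide and households are hypothetical and exist only for this illustration.

```r
library(dplyr)

# Hypothetical wide table with numbered column names.
wide <- tibble(id = 1:2, x1 = c(10, 20), x2 = c(30, 40), x3 = c(50, 60))

# num_range("x", 1:2) matches the column NAMES x1 and x2;
# the values inside the cells are never inspected.
picked <- select(wide, num_range("x", 1:2))
names(picked)

# A table with no columns named x1, x2, ...: num_range() matches nothing,
# everything() then selects all columns, and no rows are removed.
households <- tibble(ghousecode = c(2039434, 3024105), year = c(2019, 2019))
unchanged <- select(households, num_range("x", 1:999), everything())
names(unchanged)
nrow(unchanged)
```

This is consistent with the behaviour described in the question: the call runs without an error, but every row is still present, because select() was only ever asked about column names.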
