How can I use num_range to select the rows that all share the same first 4 digits in one particular column? (Preferably with dplyr/tidyverse)

Question

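My question is best split into two parts:

I am working with a dataset that looks at forest product use in a number of countries. Each row represents one household from one of these countries (about 30 in total). Each country has a 4-digit code, but the dataset has no country code column. You can infer which country a household comes from by using the household ID ("ghousecode"). ghousecode is a 7-digit code whose first 4 digits are the country code. For example, if Bolivia's country code is 3024, a household in Bolivia might be 3024105 or 3024999...

I would like code that selects all entries for a specific country. I am using the tidyverse, so I tried select() together with num_range(), but it did not work. I don't get an error message, but when I look at the output I can tell it didn't do what I wanted. Here is my current code: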

    #forest_use_tibble is a tibble with observations on forest usage from many countries
    #I selected a subset of the original file's variables. 

    forest_use_simpler <- select(forest_use_tibble, ghousecode, year, product, income, amount, unit)

    #take Bolivia, whose country ID is 3024. This means that each ghousecode that begins with
    #3024 is from Bolivia.
    #but each ghousecode is 3024xxx with three other numbers after it.

    x = 3024
    Bolivia <- select(forest_use_simpler, num_range("x", 001:999), everything())

    #my goal: a new tibble/dataframe that has only the entries from Bolivia
    #there is no separate column for country ID, unfortunately.

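Any ideas?

Second part of the question: is there a way to apply num_range to just one column (i.e. one variable, ghousecode in this case)? The way I wrote it above strikes me as though it would search every variable in forest_use_simpler, so if the number 3024 appeared anywhere other than ghousecode, it could pull in households from another country.

Thanks!

(Note: I also tried typing 3024 directly in place of x, which did not work either. Thanks again for any help.)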

Answer 1

If ghousecode always has the 7-digit format, how about something like this?

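    library(tidyverse)

    # Toy data: one non-Bolivian and one Bolivian household
    df <-
      tibble(
        ghousecode = c(2039434, 3024105),
        year = c(2019, 2019)
      )

    df %>%
      # Drop the last three digits to recover the 4-digit country code
      mutate(country_code = floor(ghousecode / 1000)) %>%
      # Keep only the rows whose country code is Bolivia's
      filter(country_code == 3024)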

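select() picks columns, whereas filter() picks rows. num_range() is a helper for select() that matches column names, so it cannot pick rows by the values they contain.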
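To make the column/row distinction concrete, here is a minimal sketch. The data frame `forest_use` and its columns `x1` and `x2` are made up for illustration and are not from the original dataset. It assumes ghousecode is stored as a 7-digit number, as described in the question.

```r
library(dplyr)

# Hypothetical toy data: the column names x1/x2 and all values are invented.
forest_use <- tibble(
  ghousecode = c(2039434, 3024105, 3024999),
  x1 = c(1, 2, 3),
  x2 = c(4, 5, 6)
)

# num_range("x", 1:2) matches the COLUMN NAMES x1 and x2;
# it never looks at the values stored in the cells.
cols_only <- select(forest_use, num_range("x", 1:2))

# filter() keeps ROWS. Integer division by 1000 drops the last three digits,
# and only ghousecode is inspected, so a 3024 in another column cannot match.
bolivia <- filter(forest_use, ghousecode %/% 1000 == 3024)
```

Because the condition inside filter() names ghousecode explicitly, this also covers the second part of the question: no other variable is searched. If ghousecode were stored as character instead, `startsWith(ghousecode, "3024")` would be one possible alternative condition.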
