
Loop through an R data frame and pass rows as parameters to a function

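I want to loop through a data frame and pass its rows as parameters to a function, in order to summarise totals from a data frame named df3.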

I have tried writing this with a traditional for loop, but it produces no result.

I have looked at pmap in https://adv-r.hadley.nz/functionals.html#pmap

but I can't see how to apply that example to my code.

Here is a sample of the original data:

dput(head(df3,n=3))
structure(list(id = c("81", "83", "85"), look_work = c("yes", 
"yes", "yes"), current_work = c("no", "yes", "no"), hf_l5k = c("", 
"", ""), ac_l5k = c("", "", ""), hf_5_10k = c("", "1", "1"), 
    ac_5_10k = c("", "1", "1"), hf_11_20k = c("", "", ""), ac_11_20k = c("", 
    "", ""), hf_21_50k = c("", "", ""), ac_21_50k = c("", "", 
    ""), hf_51_100k = c("", "", ""), ac_51_100k = c("", "", ""
    ), hf_m100k = c("", "", ""), ac_m100k = c("", "", ""), s_l1000 = c("", 
    "", ""), se_l1000 = c("", "", "1"), s_1001_1500 = c("", "1", 
    "1"), se_1001_1500 = c("", "", ""), s_2001_3000 = c("", "", 
    ""), se_2001_3000 = c("", "1", ""), s_3001_4000 = c("", "", 
    ""), se_3001_4000 = c("", "", ""), s_4001_5000 = c("", "", 
    ""), se_4001_5000 = c("", "", ""), s_5001_6000 = c("", "", 
    ""), se_5001_6000 = c("", "", ""), s_m6000 = c("", "", ""
    ), se_m6000 = c("", "", ""), s_n_ans = c("", "", ""), se_n_ans = c("", 
    "", ""), before_work = c("no", "NULL", "yes"), keen_move = c("yes", 
    "yes", "no"), city_size = c("village", "more than 500k inhabitants", 
    "more than 500k inhabitants"), gender = c("male", "female", 
    "female"), age = c("18 - 24 years", "18 - 24 years", "more than 50 years"
    ), education = c("secondary", "vocational", "secondary")), row.names = c(NA, 
3L), class = "data.frame")

This is the data frame of parameters, hf_names:

structure(list(hf_names = c("hf_l5k", "hf_5_10k", "hf_11_20k", 
"hf_21_50k", "hf_51_100k", "hf_m100k"), job = c("hf_l5k_job", 
"hf_5_10k_job", "hf_11_20k_job", "hf_21_50k_job", "hf_51_100k_job", 
"hf_m100k_job"), tot = c("hf_l5k_tot", "hf_5_10k_tot", "hf_11_20k_tot", 
"hf_21_50k_tot", "hf_51_100k_tot", "hf_m100k_tot")), class = "data.frame", row.names = c(NA, 
-6L))

This is my attempt using a traditional for loop:

library(dplyr)

tot_function <- function(df, filter_tot, col_name1, col_name2) {
  # filter desired columns for all jobs
  filter_tot <- df %>% filter(col_name1=="1") %>% 
  summarise(col_name2 = n()) 
}

for (i in seq_along(hf_names3)) {
  tot_function(df3, hf_names3$tot[i], hf_names3$hf_names[i], hf_names3$job[i])

}

The expected result would be a data frame or a vector:

hf_l5k_jobs hf_l5_10k_jobs
10               193

However, this code does not work, even with simple functions such as trim and runif, so it generates nothing.

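I don't think you need to overcomplicate this. You can take the column names from hf_names, subset each of those columns from df3, and count the number of 1s in that column.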

sapply(hf_names$hf_names, function(x) sum(df3[[x]] == 1))

#    hf_l5k   hf_5_10k  hf_11_20k  hf_21_50k hf_51_100k   hf_m100k 
#         0          2          0          0          0          0 
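The question asks for a result labelled with the job names rather than the raw column names. A minimal sketch of that relabelling follows, using cut-down stand-ins for df3 and hf_names (only two columns, taken from the sample data above) so that it runs on its own; the `na.rm = TRUE` guard is my assumption that the full data may contain missing values.

```r
# Toy stand-ins for the question's df3 and hf_names (reduced to two columns).
df3 <- data.frame(
  hf_l5k   = c("", "", ""),
  hf_5_10k = c("", "1", "1"),
  stringsAsFactors = FALSE
)
hf_names <- data.frame(
  hf_names = c("hf_l5k", "hf_5_10k"),
  job      = c("hf_l5k_job", "hf_5_10k_job"),
  stringsAsFactors = FALSE
)

# Count the "1" entries in each listed column.
# na.rm = TRUE keeps the sum from turning into NA if a column has missing values.
counts <- sapply(hf_names$hf_names, function(x) sum(df3[[x]] == "1", na.rm = TRUE))

# Relabel with the job names, then reshape into a one-row data frame.
names(counts) <- hf_names$job
result <- as.data.frame(as.list(counts))

# result has one row with hf_l5k_job = 0 and hf_5_10k_job = 2
print(result)
```

If a named vector is enough, stop after the `names(counts) <- ...` line and skip the conversion.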

If you prefer the tidyverse, you can replace sapply with one of the map_* variants:

purrr::map_int(hf_names$hf_names, ~sum(df3[[.]] == 1))
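For completeness, here is a sketch of the row-wise approach the question was originally after. As far as I can tell, the loop in the question produces nothing for three reasons: `seq_along()` on a data frame iterates over its columns rather than its rows, `filter(col_name1 == "1")` compares the string held in `col_name1` with "1" instead of looking up the column of that name, and the function's result is never stored or printed. The helper `count_ones` and the cut-down df3 and hf_names below are hypothetical stand-ins, not the asker's real objects.

```r
library(purrr)

# Toy stand-ins for the question's df3 and hf_names (reduced to two columns).
df3 <- data.frame(
  hf_l5k   = c("", "", ""),
  hf_5_10k = c("", "1", "1"),
  stringsAsFactors = FALSE
)
hf_names <- data.frame(
  hf_names = c("hf_l5k", "hf_5_10k"),
  job      = c("hf_l5k_job", "hf_5_10k_job"),
  tot      = c("hf_l5k_tot", "hf_5_10k_tot"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count the rows of `df` whose column `col_name` equals "1".
# df[[col_name]] looks the column up by the string stored in col_name.
count_ones <- function(df, col_name) {
  sum(df[[col_name]] == "1", na.rm = TRUE)
}

# Base R loop: iterate over rows with seq_len(nrow()) and store each result.
totals <- integer(nrow(hf_names))
names(totals) <- hf_names$job
for (i in seq_len(nrow(hf_names))) {
  totals[i] <- count_ones(df3, hf_names$hf_names[i])
}

# purrr equivalent: pmap_int() calls the function once per row of hf_names,
# matching arguments to the column names; unused columns fall into `...`.
totals_pmap <- pmap_int(hf_names, function(hf_names, ...) count_ones(df3, hf_names))
names(totals_pmap) <- hf_names$job

print(totals)
print(totals_pmap)
```

Both versions give the same counts as the sapply call; pmap only pays off when the function really needs more than one column from each row of hf_names.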
