
Why is lapply not forwarding additional arguments?

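I have a large dataset of tweets in which each row is a unique tweet, and a list of keywords that I want to extract from those tweets whenever one or more of them occur in the variable text. The keyword list has been compiled into a regular expression (stored in the variable search_key) that includes some lookarounds and other conditions.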

Extracting the strings works perfectly well with the following code:

data$keyword <- stri_extract_all(str = data$text, regex = search_key)

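To optimize/parallelize the code, however, I would like to use a function from the apply family. But whenever I run one of the following lines, I get an error because the regex argument is not passed to the stri_extract_all function: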

data$keyword <- lapply(data$text, FUN = stri_extract_all(), regex = search_key)
data$keyword <- lapply(data$text, FUN = stri_extract_all(), regex = get(search_key))
data$keyword <- lapply(data$text, FUN = stri_extract_all(), ... = "regex=search_key")

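This behaviour occurs regardless of the contents of search_key or the text variable, so any text column and any working regular expression can be used for testing. The following data is a simplified version of my data and can be used as well: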

data <- structure(list(status_id = c(1112765520644894720, 1112938379296104448, 
1112587129622876160, 1113006196259196928, 1112840488208531456
), text = c("@LaraFukuro more frilly stuff but i actually found a matching carrot bag which also screamed \"LARA\" inside me xD", 
"@EuroMasochismo @VaeVictis @AlbertoBagnai @Comunardo La selezione fatta a dodici anni favorisce chi è seguito. È come selezionare a 4 anni chi deve giocare a pallone proibendolo a tutti gli altri ...", 
"@SignorErnesto @Cr1st14nM3s14n0 @ggargiulo3 @micheleboldrin Sbagliato io.", 
"@BrownResearchGT On Aconcagua, the permit requires climbers above basecamp to collect their waste and carry it back down where it's taken away by helicopter. They actually weigh the bag! And still, most small rocks had human feces underneath. It's a problem!\r\nHopefully @DenaliNPS will follow suit. ", 
"@Jenn198523 Once you silence a person &amp; cover them with a huge trash bag, beating &amp; killing are not far behind."
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))

search_key <- "(?<=(^|\\s|\\D))([:alnum:]*|@[:alnum:]*|#[:alnum:]*)bag([:alnum:]*)(?=(\\D|\\s|$))"



What mistake am I making, and how can I fix it?
Of course, any suggestions for optimizing this kind of task are also welcome.

stri_extract_all is already vectorized over str. You do not need to wrap it in lapply, and doing so would slow your code down considerably.

data$keyword <- stri_extract_all(data$text, regex = search_key)
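
As for why the lapply attempts in the question fail: a likely cause is the parentheses in FUN = stri_extract_all(), which call the function with no arguments before lapply ever runs, so there is nothing to forward regex to. The sketch below passes the function itself and checks that the result matches the vectorized call. The example strings and the simplified pattern are assumptions standing in for data$text and search_key.

```r
library(stringi)

# Hypothetical sample input and a simplified pattern (not the original search_key)
text <- c("a matching carrot bag", "no match here", "a huge trash bag and a handbag")
pattern <- "\\w*bag\\w*"

# Vectorized call: returns one list element per input string
vectorized <- stri_extract_all(text, regex = pattern)

# lapply forwards extra named arguments when FUN is the function object,
# i.e. written without parentheses
looped <- lapply(text, FUN = stri_extract_all, regex = pattern)

# Each lapply result is itself a length-1 list, so take its first element
looped_flat <- lapply(looped, `[[`, 1)

# TRUE if both approaches give the same matches
identical(vectorized, looped_flat)
```

Both approaches should return the same matches, but the lapply version makes one function call per tweet and needs the extra unwrapping step, which is why the vectorized call is preferable here.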
