Using parallel mcmapply or mclapply with an existing function

I am trying to apply the tokenizers::tokenize_sentences function to a very long list of character strings. A short example follows:

list <- c("testing one two three. Testing one two three.", "testing two three four. Testing two three four. Testing two three four.", 
"testing three four five. Testing three four five. Testing three four five. Testing three four five."
)

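Because of the length of the list, I want to parallelize this. The problem is that the examples for the parallel package all seem to use custom functions. I tried wrapping tokenize_sentences in a custom function so I could use it with mclapply, but I think the function's output is being passed back on every call inside mclapply: I get a list of 3 lists, each containing all three results, rather than a single list of three character vectors.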

fx <- function(tknz){tknz <- tokenize_sentences(list)}
d$sentences <- mclapply(list, fx, mc.cores = 23)

The output is as follows:

list(list(c("testing one two three.", "Testing one two three."
), c("testing two three four.", "Testing two three four.", "Testing two three four."
), c("testing three four five.", "Testing three four five.", 
"Testing three four five.", "Testing three four five.")), list(
    c("testing one two three.", "Testing one two three."), c("testing two three four.", 
    "Testing two three four.", "Testing two three four."), c("testing three four five.", 
    "Testing three four five.", "Testing three four five.", "Testing three four five."
    )), list(c("testing one two three.", "Testing one two three."
), c("testing two three four.", "Testing two three four.", "Testing two three four."
), c("testing three four five.", "Testing three four five.", 
"Testing three four five.", "Testing three four five.")))

Desired output:

list(c("testing one two three.", "Testing one two three."), c("testing two three four.", 
"Testing two three four.", "Testing two three four."), c("testing three four five.", 
"Testing three four five.", "Testing three four five.", "Testing three four five."
))
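
A minimal sketch of one possible fix, assuming the tokenizers package is installed and a Unix-like system (mclapply relies on forking, so more than one core is not available on Windows). In the attempt above, fx ignores its argument tknz and tokenizes the whole global list on every call, which would explain why each of the three results contains all three vectors. The names texts, n_cores and tokenize_one are illustrative and do not come from the original post.

```r
library(parallel)
library(tokenizers)

# Same sample data as in the question, under a name that does not mask base::list
texts <- c(
  "testing one two three. Testing one two three.",
  "testing two three four. Testing two three four. Testing two three four.",
  "testing three four five. Testing three four five. Testing three four five. Testing three four five."
)

# Forking is unavailable on Windows, where mc.cores must be 1 (assumed core count otherwise)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Tokenize the single string passed in, not the whole vector.
# tokenize_sentences() returns a list with one element per input,
# so [[1]] extracts the character vector for this one string.
tokenize_one <- function(x) tokenize_sentences(x)[[1]]

sentences <- mclapply(texts, tokenize_one, mc.cores = n_cores)
str(sentences)
```

Since tokenize_sentences accepts a whole character vector, calling tokenize_sentences(texts) directly should give the same structure without any parallel code, which makes a useful baseline for timing comparisons.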

I have not made much progress with mcmapply either. The code is below, but it does not seem to apply the function.

d$sentences <- mcmapply(list, FUN = function(x){tokenize_sentences})
dput(d$sentences)
list(`testing one two three. Testing one two three.` = function (x, 
    lowercase = FALSE, strip_punct = FALSE, simplify = FALSE) 
{
    UseMethod("tokenize_sentences")
}, `testing two three four. Testing two three four. Testing two three four.` = function (x, 
    lowercase = FALSE, strip_punct = FALSE, simplify = FALSE) 
{
    UseMethod("tokenize_sentences")
}, `testing three four five. Testing three four five. Testing three four five. Testing three four five.` = function (x, 
    lowercase = FALSE, strip_punct = FALSE, simplify = FALSE) 
{
    UseMethod("tokenize_sentences")
})
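
A hedged sketch of how the mcmapply call might be written instead, under the same assumptions as before (tokenizers installed, forking available). In the attempt above, the anonymous function returns the tokenize_sentences function object itself rather than calling it on x, which matches the dput output of three copies of the function. The names texts and n_cores are illustrative.

```r
library(parallel)
library(tokenizers)

texts <- c(
  "testing one two three. Testing one two three.",
  "testing two three four. Testing two three four. Testing two three four.",
  "testing three four five. Testing three four five. Testing three four five. Testing three four five."
)

# Forking is unavailable on Windows, where mc.cores must be 1 (assumed core count otherwise)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

sentences <- mcmapply(
  function(x) tokenize_sentences(x)[[1]],  # actually call the tokenizer on x
  texts,
  SIMPLIFY  = FALSE,  # keep a list; results have different lengths
  USE.NAMES = FALSE,  # do not use the input strings as element names
  mc.cores  = n_cores
)
str(sentences)
```

With a single vector to iterate over, mcmapply offers little over mclapply; it becomes useful mainly when several arguments vary in parallel.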

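Any help is appreciated. If there is a better approach, it does not have to be one of these; I just need to parallelize the tokenize_sentences function over this list. Thanks!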
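
One alternative that may suit a very long input, sketched here under the same assumptions (tokenizers installed, forking available): split the vector into one chunk per core and let each worker call the vectorised tokenize_sentences on its whole chunk, rather than once per string. Whether this is actually faster depends on the data and has not been measured here. The names texts, n_cores, chunks and pieces are illustrative.

```r
library(parallel)
library(tokenizers)

texts <- c(
  "testing one two three. Testing one two three.",
  "testing two three four. Testing two three four. Testing two three four.",
  "testing three four five. Testing three four five. Testing three four five. Testing three four five."
)

# Forking is unavailable on Windows, where mc.cores must be 1 (assumed core count otherwise)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# splitIndices() from the parallel package divides 1..n into contiguous index blocks
chunks <- lapply(splitIndices(length(texts), n_cores), function(i) texts[i])

# Each worker tokenizes a whole chunk and returns a list of character vectors
pieces <- mclapply(chunks, tokenize_sentences, mc.cores = n_cores)

# Flatten one level so the result has one character vector per input string
sentences <- unlist(pieces, recursive = FALSE)
str(sentences)
```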
