I am unable to use dplyrXdf in HDInsight due to old libraries

I wrote a script using RevoScaleR and dplyrXdf. To my surprise, when using HDInsight (Microsoft Azure's managed Spark cluster service) I get an installation of R 3.3.3 and I can't install dplyrXdf: the package is not in the repository, and I can't install it from git using devtools either. I managed to get it installed once by updating every single dependency from its respective GitHub repository, but this is madness; it took me hours. The biggest issue seems to be dplyr 0.5, which is the latest available package for this service (the current CRAN package is 0.7.4). Am I doing something wrong? Maybe something in provisioning (like selecting the wrong type of cluster)? I cannot believe MS would put so much work into R and not update its cluster service, so I must be missing something here.

You can install all dependencies rather quickly; it took me about 20 minutes. Just look at the error messages and install the packages they name. I needed only these:

[image: enter image description here]
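The "read the error, install what it names" loop can be partly automated. Below is a minimal sketch, not the answerer's actual procedure. The helper `packages_to_update` and the `required` vector are hypothetical names introduced here, and the minimum version `0.7.0` for dplyr is an assumption, since the question only says that dplyr 0.5 was too old. Take the real package names and versions from your own error messages.

```r
# Report which packages are missing or older than a required minimum version.
# `required` is a named character vector: names are package names,
# values are minimum versions. (Hypothetical helper, for illustration only.)
packages_to_update <- function(required) {
  installed <- rownames(installed.packages())
  needs_update <- vapply(names(required), function(pkg) {
    !(pkg %in% installed) || packageVersion(pkg) < required[[pkg]]
  }, logical(1))
  names(required)[needs_update]
}

# Assumed requirement; replace with what your error messages state.
required <- c(dplyr = "0.7.0")

# Uncomment to run the installs. The CRAN mirror URL is an assumption:
# the cluster's default repository may only offer older versions.
# install.packages(packages_to_update(required),
#                  repos = "https://cloud.r-project.org")
# devtools::install_github("RevolutionAnalytics/dplyrXdf")
```

Calling `packages_to_update(required)` first, without installing anything, shows what is out of date before you change the cluster's library.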
