
SparkR distinct (on Databricks)

I am new to SparkR, so please forgive me if my question is very basic.

I work on Databricks and am trying to get all unique dates in a column of a SparkDataFrame.

When I run:

uniquedays <- SparkR::distinct(df$datadate)

I get the error message:

unable to find an inherited method for function ‘distinct’ for signature ‘"Column"’

On Stack Overflow, I found that this error usually means the following (for what it's worth, isS4(df) returns TRUE):

That is the type of message you will get when attempting to apply an S4 generic function to an object of a class for which no defined S4 method exists.

I also tried running:

uniquedays <- SparkR::unique(df$datadate)

which gives the error message:

unique() applies only to vectors

It feels like I am missing something basic here. Thank you for your help!

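distinct() is defined for a SparkDataFrame, not for a Column, and df$datadate is a Column. Select the column into its own SparkDataFrame first, then call distinct() on that. Try this: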

library(magrittr)
uniquedays <- SparkR::select(df, df$datadate) %>% SparkR::distinct()
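
If you want the unique dates as an ordinary R vector rather than a SparkDataFrame, something along these lines may help. It is only a sketch: the small df built here is a made-up stand-in for your data (only the column name datadate comes from the question), and it assumes a Spark session is available, as it normally is in a Databricks notebook.

```r
library(SparkR)

# Assumption: a Spark session can be started or is already attached.
# sparkR.session() returns the existing session if there is one.
sparkR.session()

# Hypothetical stand-in for the asker's df; only `datadate` is from the question.
df <- createDataFrame(data.frame(
  datadate = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02")),
  value = c(1, 2, 3)
))

# distinct() works on a SparkDataFrame, so select the column first.
uniquedays <- distinct(select(df, "datadate"))

# collect() pulls the (small) result back into a local R data.frame;
# `$datadate` then gives a plain vector of dates.
local_days <- collect(uniquedays)$datadate
```

Note that collect() moves the result to the driver, so it is only sensible when the number of distinct values is small. The order of the collected rows is not guaranteed; sort local_days if you need them ordered.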
