简体   繁体   English

使用scala火花数据框操作行和列级别

[英]spark data frame operation row and column level useing scala

Original Data frame原始数据框
0.2 0.3 0.2 0.3

+------+------------- -+
|  name| country |
+------+---------------+
|Raju  |UAS         |
|Ram  |Pak.         |
|null    |China      |
|null    |null          |
+------+--------------+

  I Need  this 
+------+--------------+
|Nwet|wet Con |
+------+--------------+
|0.2   | 0.3           |
|0.2   | 0.3           |
|0.0   | 0.3.          |
|0.0   | 0.0           |
+------+--------------+

i want to create one Udf .我想创建一个 Udf 。 for Both the column对于两个列
which will apply to Name Column it check the if it not null then it return 0.2 return 0.0 .这将应用于 Name Column 它检查它是否不为 null 然后它返回 0.2 return 0.0 。 and same Udf apply to country column check if it null return 0.0 .并且相同的 Udf 适用于 country 列检查它是否为 null 返回 0.0 。 not null then it return 0.3不为空则返回 0.3

Using StringUtils of apache:使用 apache 的 StringUtils:

val transcodificationName: UserDefinedFunction =
    udf { (name: String) => {
        if (StringUtils.isBlank(name)) 0.0
        else 0.2
        }
    }
val transcodificationCountry: UserDefinedFunction =
    udf { (country: String) => {
        if (StringUtils.isBlank(country)) 0.0
        else 0.3
        }
    }

dataframe
    .withColumn("Nwet", transcodificationName(col("name"))).cast(DoubleType)
    .withColumn("wetCon", transcodificationCountry(col("country"))).cast(DoubleType)
    .select("Nwet", "wetcon")

edit:编辑:

val transcodificationColumns: UserDefinedFunction =
        udf { (input: String, columnName:String) => {
                if (StringUtils.isBlank(country)) 0.0
                else if(columnName.equals("name")) 0.2
                else if(columnName.equals("country") 0.3
                else 0.0
            }
        }


    dataframe
        .withColumn("Nwet", transcodificationColumns(col("name"), "name")).cast(DoubleType)
        .withColumn("wetCon", transcodificationColumns(col("country")), "country").cast(DoubleType)
        .select("Nwet", "wetcon")

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM