
Turn many features in a data frame with Spark ML

I have been following this tutorial, https://mapr.com/blog/churn-prediction-sparkml/, and I realized that the CSV schema has to be written out by hand, like this:

import org.apache.spark.sql.types._

// One StructField per CSV column: name, type, nullable
val schema = StructType(Array(
    StructField("state", StringType, true),
    StructField("len", IntegerType, true),
    StructField("acode", StringType, true),
    StructField("intlplan", StringType, true),
    StructField("vplan", StringType, true),
    StructField("numvmail", DoubleType, true),
    StructField("tdmins", DoubleType, true),
    StructField("tdcalls", DoubleType, true),
    StructField("tdcharge", DoubleType, true),
    StructField("temins", DoubleType, true),
    StructField("tecalls", DoubleType, true),
    StructField("techarge", DoubleType, true),
    StructField("tnmins", DoubleType, true),
    StructField("tncalls", DoubleType, true),
    StructField("tncharge", DoubleType, true),
    StructField("timins", DoubleType, true),
    StructField("ticalls", DoubleType, true),
    StructField("ticharge", DoubleType, true),
    StructField("numcs", DoubleType, true),
    StructField("churn", StringType, true)
))

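But I have a dataset with 335 features, so I don't want to write them all out by hand... Is there a simple way to retrieve them and define the schema accordingly?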

I found a solution here: https://dzone.com/articles/using-apache-spark-dataframes-for-processing-of-ta. It was easier than I thought.
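For anyone landing here, a minimal sketch of two common ways to avoid typing out 335 fields. This is not necessarily what the linked article does. It assumes the CSV has a header row (for the first option) and that every column is numeric except a known set of string columns (for the second). The names `SchemaSketch`, `readWithInferredSchema`, `buildSchema`, and the `f1`..`f334` column names are hypothetical, made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{DataType, DoubleType, StringType, StructField, StructType}

object SchemaSketch {
  // Option 1: let Spark read column names from the header row and infer the types.
  // Assumption: the file has a header; inferSchema needs an extra pass over the data.
  def readWithInferredSchema(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)

  // Option 2: build the StructType programmatically from a list of column names.
  // Assumption: columns listed in stringColumns are strings, all others are doubles.
  def buildSchema(columns: Seq[String], stringColumns: Set[String]): StructType =
    StructType(columns.map { name =>
      val dataType: DataType =
        if (stringColumns.contains(name)) StringType else DoubleType
      StructField(name, dataType, nullable = true)
    })
}

// Hypothetical column names: 334 numeric features plus the "churn" label.
val columns = (1 to 334).map(i => s"f$i") :+ "churn"
val schema = SchemaSketch.buildSchema(columns, Set("churn"))
```

The resulting `schema` can be passed to `spark.read.schema(schema).csv(path)` in the same way as the hand-written one. Inference is the least typing, while the programmatic schema keeps the types explicit and avoids the extra scan.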
