Turn many features in a data frame with Spark ML
I have been following this tutorial: https://mapr.com/blog/churn-prediction-sparkml/ and I realized that the schema for the CSV file has to be written out by hand, like this:
import org.apache.spark.sql.types._

// One StructField per CSV column: name, type, nullable
val schema = StructType(Array(
  StructField("state", StringType, true),
  StructField("len", IntegerType, true),
  StructField("acode", StringType, true),
  StructField("intlplan", StringType, true),
  StructField("vplan", StringType, true),
  StructField("numvmail", DoubleType, true),
  StructField("tdmins", DoubleType, true),
  StructField("tdcalls", DoubleType, true),
  StructField("tdcharge", DoubleType, true),
  StructField("temins", DoubleType, true),
  StructField("tecalls", DoubleType, true),
  StructField("techarge", DoubleType, true),
  StructField("tnmins", DoubleType, true),
  StructField("tncalls", DoubleType, true),
  StructField("tncharge", DoubleType, true),
  StructField("timins", DoubleType, true),
  StructField("ticalls", DoubleType, true),
  StructField("ticharge", DoubleType, true),
  StructField("numcs", DoubleType, true),
  StructField("churn", StringType, true)
))
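However, I have a dataset with 335 features, so I don't want to write them all out by hand. Is there a simple way to retrieve the columns and define the schema accordingly?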
I found a solution here: https://dzone.com/articles/using-apache-spark-dataframes-for-processing-of-ta and it was easier than I expected.
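For anyone landing here with the same problem, below is a minimal sketch of two common ways to avoid typing out hundreds of fields. It is not necessarily what the linked article does. It assumes Spark 2.x or later with the built-in CSV reader. The object name `SchemaSketch`, the helper names, the generated column names `f0`, `f1`, ... and the assumption that every non-string column is a `Double` are all hypothetical choices for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

object SchemaSketch {

  // Build a schema from a list of column names. Columns listed in
  // stringCols stay strings; every other column is assumed to be a
  // Double feature (an assumption, adjust for your data).
  def schemaFromNames(names: Seq[String], stringCols: Set[String]): StructType =
    StructType(names.map { n =>
      val dataType = if (stringCols.contains(n)) StringType else DoubleType
      StructField(n, dataType, nullable = true)
    })

  // Option 1: the file has a header row. Let Spark take the column names
  // from it and infer the types (this costs an extra pass over the data).
  def readInferred(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)

  // Option 2: the file has no header. Generate the names f0..f(n-1),
  // append the label column, and pass the generated schema to the reader.
  def readGenerated(spark: SparkSession, path: String, numFeatures: Int): DataFrame = {
    val names = (0 until numFeatures).map(i => s"f$i") :+ "churn"
    spark.read.schema(schemaFromNames(names, Set("churn"))).csv(path)
  }
}
```

With a header row, something like `SchemaSketch.readInferred(spark, "churn.csv").printSchema()` should show what Spark inferred (the file name is a placeholder). If the inferred types are not what you want, the generated-schema route gives you explicit control without writing 335 lines.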