Scala Spark DataFrame phone number validation
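import org.apache.spark.sql.functions._  // regexp_replace, trim, when
import spark.implicits._                 // toDF and $"..." (assumes a SparkSession named spark, as in spark-shell)

val df = Seq[String]("", " ", null, "123456789a", "1111111111", "1.3-4567 80", " 1.23-4567 890 ", "1234567890").toDF("PhoneNumber")
// Strip leading and trailing spaces, then drop spaces, dots and hyphens.
val trimmed = regexp_replace(trim($"PhoneNumber"), "[ .-]", "")
// Correct: contains a run of 10 or more digits and is not a single digit repeated.
val correct = trimmed.rlike(raw"\d{10,}") &&
  !(trimmed.rlike(raw"^(\d)\1*$$"))
val df2 = df.withColumn("Correct", when(correct, "Y").otherwise("N"))
df2.show()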
// +---------------+-------+
// |    PhoneNumber|Correct|
// +---------------+-------+
// |               |      N|
// |               |      N|
// |           null|      N|
// |     123456789a|      N|
// |     1111111111|      N|
// |    1.3-4567 80|      N|
// | 1.23-4567 890 |      Y|
// |     1234567890|      Y|
// +---------------+-------+
trim($"PhoneNumber") removes leading and trailing spaces.
regexp_replace(..., "[ .-]", "") removes spaces, dots and hyphens.
.rlike(raw"\d{10,}") checks for a run of 10 or more digits. rlike matches anywhere in the value, so the pattern is not anchored.
!(....rlike(raw"^(\d)\1*$$")) rejects values that consist of one digit repeated. Inside a raw"..." string, $$ is an escaped $, the end-of-input anchor.
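
The four steps above can be tried without a Spark session. The following is a minimal sketch in plain Scala, not part of the original answer: `PhoneCheck` and `isCorrect` are hypothetical names, and Java's `String.trim` stands in for Spark's `trim` (it also strips tabs and newlines, which Spark's `trim` does not).

```scala
object PhoneCheck {
  // Hypothetical helper: the same checks as the Spark expression,
  // applied to a plain String instead of a Column.
  def isCorrect(phone: String): Boolean =
    Option(phone).exists { p =>              // null is treated as not correct, like the "N" row
      // Steps 1 and 2: trim, then drop spaces, dots and hyphens.
      val cleaned = p.trim.replaceAll("[ .-]", "")
      // Step 3: rlike searches for a match anywhere, so use find semantics.
      val hasTenDigits = raw"\d{10,}".r.findFirstIn(cleaned).isDefined
      // Step 4: the whole value is a single digit repeated.
      val allSameDigit = cleaned.matches(raw"(\d)\1*")
      hasTenDigits && !allSameDigit
    }

  def main(args: Array[String]): Unit = {
    val samples = Seq("", " ", null, "123456789a", "1111111111",
                      "1.3-4567 80", " 1.23-4567 890 ", "1234567890")
    // Prints each sample with Y or N, mirroring the Correct column.
    samples.foreach(s => println(s"[$s] -> ${if (isCorrect(s)) "Y" else "N"}"))
  }
}
```

Because the digit check is unanchored, a value such as "1234567890a" is also accepted. If only digits should remain after cleaning, anchoring the pattern as `^\d{10,}$` (written `^\d{10,}$$` inside `raw"..."`) would reject it; whether that is wanted depends on the data.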