I am working in a Jupyter Notebook with PySpark v2.3.4, which runs on Java 8, Python 3.6 (with py4j==0.10.7), and Scala 2.11. I have a Scala case class that takes a scala.util.matching.Regex (scala doc) as an argument, like so:
case class myClass(myString: String, myRegex: Regex)
I would like to construct an object from myClass, but I can't figure out how to construct a scala.util.matching.Regex object in a Python / PySpark environment. Below are a few of my attempts, based on docs I've followed, to create a Scala regex, where sc is my SparkContext.
sc._jvm.scala.util.matching.Regex("""(S|s)cala""")
This fails with: Constructor scala.util.matching.Regex([class java.lang.String]) does not exist
java.lang.String
sc._jvm.scala.util.matching.Regex("(S|s)cala")
sc._jvm.scala.util.matching.Regex(r"(S|s)cala")
sc._jvm.scala.util.matching.Regex("(S|s)cala".r)
(the way they do it in Scala)
sc._jvm.java.util.regex.Pattern.compile("(S|s)cala")
successfully creates a Java regex pattern, and the Scala doc clearly states that the Scala regex delegates to the Java regex package. Any help/advice would be much appreciated! Thanks in advance!
I figured it out lol
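The Scala Regex constructor takes a second argument called groupNames. It is a variable-length String argument (String*), so in Scala you can pass anywhere from zero strings to as many as you want. However, Py4J looks up the constructor through Java reflection, where a Scala vararg shows up as a single required Seq parameter. From a Python interpreter you therefore have to pass something for it, and passing None (null on the JVM side) indicates that there is nothing to populate this second argument.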
sc._jvm.scala.util.matching.Regex("(S|s)cala", None)
NOTE: I haven't figured out how to pass in an actual vararg yet though... Passing comma-separated strings, arrays, and tuples didn't work... Any help on that would be great, thanks :)
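For the vararg question in the note, one thing that may work is to hand the constructor a real Scala Seq instead of None. This is a sketch that has not been verified on this setup. On the JVM a Scala String* parameter is compiled to a scala.collection.Seq parameter, and PySpark itself turns Python lists into a Seq through org.apache.spark.api.python.PythonUtils.toSeq. The helper name make_scala_regex is made up for illustration. It assumes sc is a live SparkContext whose Py4J gateway auto-converts Python lists to Java lists, as PySpark's does.

```python
def make_scala_regex(sc, pattern, group_names=()):
    """Build a scala.util.matching.Regex through the Py4J gateway.

    make_scala_regex is a hypothetical helper, not part of PySpark.
    Assumes sc is a live SparkContext.
    """
    jvm = sc._jvm
    # PythonUtils.toSeq is the helper PySpark uses internally to turn a
    # java.util.List into a Scala Seq. Py4J is assumed to convert the
    # Python list into a java.util.List before the call.
    names_seq = jvm.org.apache.spark.api.python.PythonUtils.toSeq(list(group_names))
    # On the JVM the constructor is seen as Regex(String, Seq), so the Seq
    # fills the same slot that None filled in the workaround above.
    return jvm.scala.util.matching.Regex(pattern, names_seq)
```

Something like make_scala_regex(sc, "(S|s)cala", ["first"]) should then attach the name first to the capture group. With no group names it passes an empty Seq rather than a null, which avoids a null pointer if the Scala side ever reads groupNames.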