
Upload Spark RDD to REST webservice POST method

Frankly, I'm not sure whether this feature exists, so apologies in advance.

My requirement is to send Spark-analysed data to a file server on a daily basis. The file server supports file transfer through SFTP and through a REST web service POST call.

My initial thought was to save the Spark RDD to HDFS and transfer it to the file server through SFTP. I would like to know whether it is possible to upload the RDD directly by calling the REST service from the Spark driver class, without saving to HDFS. The data is less than 2 MB in size.

Sorry for my bad English!

Spark has no built-in way to do this. With data that small, it is not worth going through HDFS or any other intermediate storage. You can collect the data into the driver's memory and send it directly. For a POST call you can just use plain old java.net.URL, which would look something like this:

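import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets
import org.apache.spark.rdd.RDD

// The RDD you want to send (assumed here to hold strings)
val rdd: RDD[String] = ???

// Gather the data on the driver and join it into a newline-separated string
val body = rdd.collect().mkString("\n")

// Open a connection
val url = new URL("http://www.example.com/resource")
val conn = url.openConnection().asInstanceOf[HttpURLConnection]

// Configure for a POST request
conn.setDoOutput(true)
conn.setRequestMethod("POST")

// Write the request body, then close the stream
val os = conn.getOutputStream
try os.write(body.getBytes(StandardCharsets.UTF_8)) finally os.close()

// Reading the response code is what completes the request
val status = conn.getResponseCode
conn.disconnect()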

A much more complete discussion of using java.net.URL can be found at this question. You could also use a Scala library to handle the ugly Java stuff for you, like akka-http or Dispatch.
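If the job runs on Java 11 or later, the JDK's own java.net.http.HttpClient is another way to avoid the manual stream handling. The following is only a minimal sketch under that assumption: the object name RestUpload, the helper postLines and the text/plain content type are made up for illustration, and the endpoint is the same placeholder URL as above.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.charset.StandardCharsets

object RestUpload {
  // Hypothetical helper: POST the given lines as one newline-separated
  // text body and return the HTTP status code of the response.
  def postLines(lines: Seq[String], endpoint: String): Int = {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(URI.create(endpoint))
      // Assumed content type; use whatever the file server expects
      .header("Content-Type", "text/plain; charset=utf-8")
      .POST(HttpRequest.BodyPublishers.ofString(lines.mkString("\n"), StandardCharsets.UTF_8))
      .build()
    // send() blocks until the server has answered
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    response.statusCode()
  }
}
```

From the driver this could be called as RestUpload.postLines(rdd.collect().map(_.toString).toSeq, "http://www.example.com/resource"), checking the returned status code before treating the upload as done.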

Spark itself does not provide this functionality (it is not a general-purpose HTTP client). You might consider using an existing REST client library such as akka-http or spray, or some other Java/Scala client library.

That said, you are by no means obliged to save your data to disk before operating on it. You could, for example, call collect() on your RDD and send the result from the driver, or use foreach to send records from the executors, in combination with your REST client library.
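The two approaches can be sketched roughly as follows. This is an illustration, not a definitive implementation: the object RddUpload, both method names and the post parameter are hypothetical, with post standing in for whichever REST client call you choose (it takes the request body and returns the HTTP status code). The second variant uses foreachPartition rather than foreach so that each partition results in one request instead of one per record.

```scala
import org.apache.spark.rdd.RDD

object RddUpload {
  // Option 1: bring everything to the driver and send a single request.
  // Reasonable for small results such as the 2 MB mentioned in the question.
  def uploadFromDriver(rdd: RDD[String], post: String => Int): Int =
    post(rdd.collect().mkString("\n"))

  // Option 2: send one request per non-empty partition from the executors.
  // `post` is shipped to the executors, so it must be serializable, and the
  // server receives several requests instead of one.
  def uploadPerPartition(rdd: RDD[String], post: String => Int): Unit =
    rdd.foreachPartition { part =>
      // The status code is ignored in this sketch; real code should check it
      if (part.hasNext) post(part.mkString("\n"))
    }
}
```

For a daily upload of a small result, the driver-side variant is the simpler choice, since the file server then receives exactly one request and the status code is available on the driver.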


 