This is actually a design question / problem. And I am not sure if writing and reading the file is an ideal solution here. Nonetheless, I will outline what I am trying to do below: I have the following static method that once the reqStreamingData
method of obj
is called, it starts retrieving data from client server constantly at a rate of 150 milliseconds.
public static void streamingDataOperations(ClientSocket cs) throws InterruptedException, IOException{
// call - retrieve streaming data constantly from client server,
// and write a line in the csv file at a rate of 150 milliseconds
// using bufferedWriter and printWriter (print method).
// Note that the flush method of bufferedWriter is never called,
// I would assume the data is in fact being written in buffered memory
// not the actual file.
cs.reqStreamingData(output_file); // <- this method comes from client's API.
// I would like to another thread (aka data processing thread) which repeats itself every 15 minutes.
// I am aware I can do that by creating a class that extends TimeTask and fix a schedule
// Now when this thread runs, there are things I want to do.
// 1. flush last 15 minutes of data to the output_file (Note no synchronized statement method or statements are used here, hence no object is being locked.)
// 2. process the data in R
// 3. wait for the output in R to come back
// 4. clear file contents, so that it always store data that only occurs in the last 15 minutes
}
Now, I am not well versed in multithreading. My concern is that
I understand that this is a lengthy problem. Please let me know if you need more information.
The lines (CSV, or any other text) can be written to a temporary file. When processing is ready to pick up, the only synchronization needed occurs when the temporary file is getting replaced by the new one. This guarantees that the producer never writes to the file that is being processed by the consumer at the same time.
Once that is done, producer continues to add lines to the newer file. The consumer flushes and closes the old file, and then moves it to the file as expected by your R-application.
To further clarify the approach, here is a sample implementation:
public static void main(String[] args) throws IOException {
// in this sample these dirs are supposed to exist
final String workingDirectory = "./data/tmp";
final String outputDirectory = "./data/csv";
final String outputFilename = "r.out";
final int addIntervalSeconds = 1;
final int drainIntervalSeconds = 5;
final FileBasedTextBatch batch = new FileBasedTextBatch(Paths.get(workingDirectory));
final ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
final ScheduledFuture<?> producer = executor.scheduleAtFixedRate(
() -> batch.add(
// adding formatted date/time to imitate another CSV line
LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME)
),
0, addIntervalSeconds, TimeUnit.SECONDS);
final ScheduledFuture<?> consumer = executor.scheduleAtFixedRate(
() -> batch.drainTo(Paths.get(outputDirectory, outputFilename)),
0, drainIntervalSeconds, TimeUnit.SECONDS);
try {
// awaiting some limited time for demonstration
producer.get(30, TimeUnit.SECONDS);
}
catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
catch (ExecutionException e) {
System.err.println("Producer failed: " + e);
}
catch (TimeoutException e) {
System.out.println("Finishing producer/consumer...");
producer.cancel(true);
consumer.cancel(true);
}
executor.shutdown();
}
static class FileBasedTextBatch {
private final Object lock = new Object();
private final Path workingDir;
private Output output;
public FileBasedTextBatch(Path workingDir) throws IOException {
this.workingDir = workingDir;
output = new Output(this.workingDir);
}
/**
* Adds another line of text to the batch.
*/
public void add(String textLine) {
synchronized (lock) {
output.writer.println(textLine);
}
}
/**
* Moves currently collected batch to the file at the specified path.
* The file will be overwritten if exists.
*/
public void drainTo(Path targetPath) {
try {
final long startNanos = System.nanoTime();
final Output output = getAndSwapOutput();
final long elapsedMillis =
TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
System.out.printf("Replaced the output in %d millis%n", elapsedMillis);
output.close();
Files.move(
output.file,
targetPath,
StandardCopyOption.ATOMIC_MOVE,
StandardCopyOption.REPLACE_EXISTING
);
}
catch (IOException e) {
System.err.println("Failed to drain: " + e);
throw new IllegalStateException(e);
}
}
/**
* Replaces the current output with the new one, returning the old one.
* The method is supposed to execute very quickly to avoid delaying the producer thread.
*/
private Output getAndSwapOutput() throws IOException {
synchronized (lock) {
final Output prev = this.output;
this.output = new Output(this.workingDir);
return prev;
}
}
}
static class Output {
final Path file;
final PrintWriter writer;
Output(Path workingDir) throws IOException {
// performs very well on local filesystems when working directory is empty;
// if too slow, maybe replaced with UUID based name generation
this.file = Files.createTempFile(workingDir, "csv", ".tmp");
this.writer = new PrintWriter(Files.newBufferedWriter(this.file));
}
void close() {
if (this.writer != null)
this.writer.flush();
this.writer.close();
}
}
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.