Read multiple CSV files in Apache Beam using Java
This code works well with just one file as input, but when I pass a path containing * it fails with:
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.nio.file.InvalidPathException: Illegal char <*> at index 17: D:\\beam\\csv\\20*.csv
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at beam.wordcount.TestCsv.main(TestCsv.java:60)
Caused by: java.nio.file.InvalidPathException: Illegal char <*> at index 17: D:\\beam\\csv\\20*.csv
at sun.nio.fs.WindowsPathParser.normalize(Unknown Source)
at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
at sun.nio.fs.WindowsPath.parse(Unknown Source)
at sun.nio.fs.WindowsFileSystem.getPath(Unknown Source)
at java.nio.file.Paths.get(Unknown Source)
at org.apache.beam.sdk.io.LocalFileSystem.matchOne(LocalFileSystem.java:217)
at org.apache.beam.sdk.io.LocalFileSystem.match(LocalFileSystem.java:90)
at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:119)
at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:140)
at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:152)
at org.apache.beam.sdk.io.FileIO$MatchAll$MatchFn.process(FileIO.java:636)
I don't know why it is throwing this error; * is supposed to match multiple files with similar names.
Code:
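import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public interface BatchOptions extends PipelineOptions {
    @Description("Path to the data file(s) containing game data.")
    @Default.String("D:\\beam\\csv\\2020.csv")
    String getInput();
    void setInput(String value);
}

public static void main(String[] args) {
    BatchOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().as(BatchOptions.class);
    Pipeline pipeline = Pipeline.create(options);
    // Match the input path or pattern, then open each matched file.
    PCollection<FileIO.ReadableFile> lines = pipeline
        .apply(FileIO.match().filepattern(options.getInput()))
        .apply(FileIO.readMatches());
    pipeline.run().waitUntilFinish();
}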
WindowsFileSystem does not expand * and treats it as an illegal character. As the stack trace shows, LocalFileSystem.matchOne hands the whole pattern to java.nio.file.Paths.get, which rejects it on Windows.
I would recommend passing the complete directory instead, like D://beam//csv//
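If only some files in that directory should be read, one possible workaround is to expand the pattern yourself with the JDK before handing paths to the pipeline. The sketch below is a hypothetical helper, not part of Apache Beam: the class name CsvGlob and the method expand are made up for illustration. It relies on Files.newDirectoryStream(dir, glob), which matches the glob against file names inside the directory, so the * never ends up inside a Path.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not a Beam API): expands a file-name glob inside one directory.
final class CsvGlob {

    // Returns the regular files directly inside dir whose names match glob,
    // as path strings sorted lexicographically. Subdirectories are skipped.
    static List<String> expand(Path dir, String glob) throws IOException {
        List<String> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    matches.add(entry.toString());
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }

    // Example invocation (arguments are assumptions): java CsvGlob D:\beam\csv 20*.csv
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CsvGlob <directory> <glob>");
            return;
        }
        for (String file : expand(Paths.get(args[0]), args[1])) {
            System.out.println(file);
        }
    }
}
```

The resulting list contains only concrete file paths, so it could then be fed to the pipeline, for example through Create.of(paths) followed by FileIO.matchAll() and FileIO.readMatches(). I have not verified that wiring on Windows, so treat it as a starting point rather than a confirmed fix.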