Can I append input files or input data to a map-reduce job while it is running without creating a race condition?
I think in theory you can add more files into the input as long as it:
Regarding the race condition after splits are computed, note that appending to existing files has only been available since version 0.21.0.
And even if you can modify your files, your split points are already precomputed, so your new data will most likely not be picked up by the mappers. I doubt, though, that it would crash your flow.
What you can experiment with is disabling splits within a file (that is, assigning one mapper per file) and then trying to append. I think some data that had a chance to get flushed may end up in a mapper (that's just my wild guess).
Effectively the answer is "no". The splits are computed very early in the game, and after that your new files will not be included.
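To make the "splits are computed early" point concrete, here is a toy model in plain Java. It is not Hadoop code: `SplitSnapshotDemo`, `Split`, `planSplits`, and `coveredBytes` are hypothetical names made up for illustration, and the file names and sizes are invented. It only shows the general idea that a plan built from a snapshot of file sizes does not change when the files change afterwards.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitSnapshotDemo {

    /** A byte range of one file, fixed at planning time (hypothetical stand-in for an input split). */
    public static final class Split {
        public final String file;
        public final long start;
        public final long length;

        public Split(String file, long start, long length) {
            this.file = file;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return file + "[" + start + "," + (start + length) + ")";
        }
    }

    /**
     * Plans splits from the file sizes known at the moment of the call.
     * If splittable is false, each file becomes exactly one split (one mapper per file).
     */
    public static List<Split> planSplits(Map<String, Long> fileSizes, long splitSize, boolean splittable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (Map.Entry<String, Long> e : fileSizes.entrySet()) {
            long size = e.getValue();
            if (!splittable) {
                splits.add(new Split(e.getKey(), 0, size));
                continue;
            }
            // Cut the file into fixed-size ranges; the last range may be shorter.
            for (long off = 0; off < size; off += splitSize) {
                splits.add(new Split(e.getKey(), off, Math.min(splitSize, size - off)));
            }
        }
        return splits;
    }

    /** Total number of bytes the planned splits cover. */
    public static long coveredBytes(List<Split> splits) {
        long total = 0;
        for (Split s : splits) {
            total += s.length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Invented input directory: file name -> size in bytes.
        Map<String, Long> input = new LinkedHashMap<>();
        input.put("part-a", 250L);
        input.put("part-b", 100L);

        // The plan is a snapshot taken once, before any mapper starts.
        List<Split> planned = planSplits(input, 100L, true);

        // Changes made after planning: an append and a brand-new file.
        input.put("part-a", 400L);
        input.put("part-c", 50L);

        long onDisk = 0;
        for (long size : input.values()) {
            onDisk += size;
        }
        System.out.println("planned splits: " + planned);
        System.out.println("bytes covered:  " + coveredBytes(planned));
        System.out.println("bytes on disk:  " + onDisk);
    }
}
```

In this model the plan still covers 350 bytes after the input has grown to 550, so neither the appended bytes nor the new file belong to any split. Whether a real record reader could read past the end of its planned range is a separate question that this sketch does not answer.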