简体   繁体   中英

How can I prevent Hadoop's HDFS API from creating parent directories?

I want HDFS commands to fail if a parent directory doesn't exist when making subdirectories. When I use any of FileSystem#mkdirs , I find that an exception isn't risen, instead creating non-existent parent directories:

import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()
conf.set("fs.defaultFS", s"hdfs://$host:$port")

val fileSystem = FileSystem.get(conf)
val cwd = fileSystem.getWorkingDirectory

// Guarantee non-existence by appending two UUIDs.
val dirToCreate = new Path(cwd, new Path(UUID.randomUUID.toString, UUID.randomUUID.toString))

fileSystem.mkdirs(dirToCreate)

Without the cumbersome burden of checking for the existence, how can I force HDFS to throw an exception if a parent directory doesn't exist?

The FileSystem API does not support this type of behavior. Instead, FileContext#mkdir should be used; for example:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileContext, FileSystem, Path}
import org.apache.hadoop.fs.permission.FsPermission

val files = FileContext.getFileContext()
val cwd = files.getWorkingDirectory
val permissions = new FsPermission("644")
val createParent = false

// Guarantee non-existence by appending two UUIDs.
val dirToCreate = new Path(cwd, new Path(UUID.randomUUID.toString, UUID.randomUUID.toString))

files.mkdir(dirToCreate, permissions, createParent)

The above example will throw:

java.io.FileNotFoundException: Parent directory doesn't exist: /user/erip/f425a2c9-1007-487b-8488-d73d447c6f79

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM