Databricks - FileNotFoundException

Question

I'm sorry if this is basic and I missed something simple. I'm trying to run the code below to iterate through the files in a folder and merge all files whose names start with a specific string into a single dataframe. All the files sit in a data lake.

file_list=[]
path = "/dbfs/rawdata/2019/01/01/parent/"
files  = dbutils.fs.ls(path)
for file in files:
    if file.name.startswith("CW"):
        file_list.append(file.name)
df = spark.read.load(path=file_list)

# check point
print("Shape: ", df.count(), ",", len(df.columns))
df.printSchema()

This looks fine to me, but apparently something is wrong here. I'm getting an error on this line:
files = dbutils.fs.ls(path)

Error message reads:

java.io.FileNotFoundException: File /6199764716474501/dbfs/rawdata/2019/01/01/parent does not exist.

The path, the files, and everything else definitely exist. I tried with and without the 'dbfs' part. Could it be a permission issue? Something else? I Googled for a solution. Still can't get traction with this.

Answer 1

Make sure you actually have a folder named "dbfs". If your parent folder starts from "rawdata", the path should be "/rawdata/2019/01/01/parent" or "rawdata/2019/01/01/parent".

This error is thrown when the path is incorrect.
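
One way to read this error, as a rough sketch: dbutils.fs resolves paths against the DBFS root, while the "/dbfs/..." form is the local mount used by ordinary file APIs, so "/dbfs/rawdata/..." is looked up as a folder literally named "dbfs". The helper below, to_dbfs_uri, is a hypothetical name and not part of Databricks; it only rewrites a path string and makes no calls to the cluster.

```python
def to_dbfs_uri(path: str) -> str:
    """Rewrite a path into the explicit dbfs:/ form (hypothetical helper)."""
    if path.startswith("dbfs:/"):
        return path  # already explicit, leave untouched
    if path == "/dbfs" or path.startswith("/dbfs/"):
        path = path[len("/dbfs"):]  # drop the local mount prefix
    return "dbfs:/" + path.lstrip("/")


print(to_dbfs_uri("/dbfs/rawdata/2019/01/01/parent/"))
```

Note that this assumes the leading "/dbfs" is the mount prefix. If the lake really does contain a top-level folder named "dbfs", the helper would strip it incorrectly.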

Answer 2

This is an old thread, but in case someone is still looking for a solution: the path has to be written as "dbfs:/rawdata/2019/01/01/parent/".
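
With the path in that form, the loop from the question could look roughly like the sketch below. It assumes each entry returned by dbutils.fs.ls exposes name and path attributes, and it collects path rather than name, since a bare file name would presumably not resolve when passed to spark.read.load. select_paths is a hypothetical helper, and the FileInfo tuple is only a stand-in so the snippet runs outside a notebook; the dbutils and spark calls are left as comments because they exist only inside Databricks.

```python
from collections import namedtuple

# Stand-in for the entries returned by dbutils.fs.ls (assumption for this demo).
FileInfo = namedtuple("FileInfo", ["path", "name", "size"])


def select_paths(entries, prefix):
    """Return the full paths of entries whose file name starts with prefix."""
    return [e.path for e in entries if e.name.startswith(prefix)]


# Made-up listing, for illustration only.
base = "dbfs:/rawdata/2019/01/01/parent/"
demo = [
    FileInfo(base + "CW_a.parquet", "CW_a.parquet", 10),
    FileInfo(base + "other.parquet", "other.parquet", 20),
]
print(select_paths(demo, "CW"))

# In a Databricks notebook, the same helper would be used roughly like this:
# files = dbutils.fs.ls(base)
# df = spark.read.load(path=select_paths(files, "CW"))
```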

2 answers

Answer 1
1 ACCEPTED 2019-10-03 14:44:27

Answer 2
0 2023-01-20 23:28:12