I am using polars with Rust and would like to read multiple CSV files as input.
I found this section in the documentation that shows how to use glob patterns to read multiple files in Python, but I could not find a way to do the same in Rust.
Passing a glob pattern directly in Rust does not work.
The code I tried was:
use polars::prelude::*;

fn main() {
    let df = CsvReader::from_path("./example/*.csv").unwrap().finish().unwrap();
    println!("{:?}", df);
}
This failed with the error:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Os { code: 2, kind: NotFound, message: "No such file or directory" })', src/main.rs:26:54
stack backtrace:
0: rust_begin_unwind
I also tried creating the path separately and confirming that it points to a directory:
use std::path::PathBuf;
use polars::prelude::*;

fn main() {
    let path = PathBuf::from("./example");
    println!("{}", path.is_dir());
    let df = CsvReader::from_path(path).unwrap().finish().unwrap();
    println!("{:?}", df);
}
It also fails with the same error.
So the question is: how do I read multiple CSV, Parquet, JSON, etc. files from a directory using Rust?
The section of the documentation referenced in your question uses both the glob library and a for loop in Python. We can therefore write a Rust version along the same lines, using the glob crate:
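use std::path::PathBuf;
use glob::glob;
use polars::prelude::*;

fn main() {
    // glob() only returns Err when the pattern itself is invalid;
    // a pattern that matches no files yields an empty iterator.
    let csv_files = glob("my-file-path/*.csv")
        .expect("Invalid glob pattern");

    // One result per file, so a single unreadable file does not abort the rest.
    let mut dfs: Vec<PolarsResult<DataFrame>> = Vec::new();
    for entry in csv_files {
        // Each entry is a Result<PathBuf, GlobError>; unwrap panics on unreadable paths.
        dfs.push(read_csv(entry.unwrap()));
    }
    println!("dfs: {:?}", dfs);
}

// Eagerly read a single CSV file into a DataFrame.
fn read_csv(filepath: PathBuf) -> PolarsResult<DataFrame> {
    CsvReader::from_path(filepath)?
        .has_header(true)
        .finish()
}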
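The loop above stores one PolarsResult per file, so every element has to be unwrapped later. If a single failure should abort the whole load instead, an iterator of Result values can be collected into one Result. The following is a minimal sketch of that pattern using only the standard library: read_one and read_all are hypothetical helpers, std::fs::read_to_string stands in for read_csv so the example runs without polars, and the file names a.csv and b.csv are made up. The same shape should carry over to PolarsResult<DataFrame>.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Stand-in for `read_csv`: any function that returns a `Result` per file fits here.
fn read_one(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

// Read every path in order. Collecting into `Result<Vec<_>, _>` stops at the
// first error and returns it, instead of keeping one `Result` per file.
fn read_all(paths: &[PathBuf]) -> io::Result<Vec<String>> {
    paths.iter().map(|p| read_one(p)).collect()
}

fn main() {
    // Hypothetical input files; replace with the paths produced by glob.
    let paths = vec![
        PathBuf::from("./example/a.csv"),
        PathBuf::from("./example/b.csv"),
    ];
    match read_all(&paths) {
        Ok(contents) => println!("read {} files", contents.len()),
        Err(e) => eprintln!("failed to read input: {e}"),
    }
}
```

Whether to fail fast like this or keep per-file results depends on whether a partially loaded directory is still useful to you.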
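For the lazy API, the same approach works with LazyCsvReader (this main replaces the one above and reuses the same imports):

// Lazily scan a single CSV file; nothing is read until collect() is called.
fn read_csv_lazy(filepath: PathBuf) -> PolarsResult<LazyFrame> {
    LazyCsvReader::new(filepath).has_header(true).finish()
}

fn main() {
    let csv_files = glob("my-file-path/*.csv")
        .expect("Invalid glob pattern");

    let mut ldfs: Vec<PolarsResult<LazyFrame>> = Vec::new();
    for entry in csv_files {
        ldfs.push(read_csv_lazy(entry.unwrap()));
    }

    // do stuff
    for f in ldfs.into_iter() {
        println!("{:?}", f.unwrap().collect());
    }
}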
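If you would rather not add the glob crate, the file list can also be built with the standard library alone. The sketch below is one way to do it, not something taken from the polars documentation: files_with_extension is a hypothetical helper, and it only looks at the top level of the directory (no recursion, no wildcard matching). The resulting paths can be passed one by one to read_csv or read_csv_lazy above.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical helper: return every regular file directly inside `dir`
// whose extension equals `ext` (given without the dot), sorted by path.
fn files_with_extension(dir: &Path, ext: &str) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_file() && path.extension() == Some(OsStr::new(ext)) {
            paths.push(path);
        }
    }
    // read_dir makes no ordering guarantee, so sort for a stable result.
    paths.sort();
    Ok(paths)
}

fn main() {
    // "./example" is the directory used in the question.
    match files_with_extension(Path::new("./example"), "csv") {
        Ok(paths) => {
            for path in paths {
                println!("{}", path.display());
            }
        }
        Err(e) => eprintln!("could not list ./example: {e}"),
    }
}
```

The extension comparison is case-sensitive, so a file named DATA.CSV would be skipped; normalise the case first if that matters for your data.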