Commit
support local duckdb computation for remote filesystems
- Added support for copying remote files to a local temporary directory before processing with DuckDB.
- Improved error handling during the remote file copy process.
- Added concurrency control for efficient parallel file copying.
- Implemented a function to find the deepest common parent path for better file organization during copying.
- Enhanced the file copy process to handle directories recursively.
- Modified `GetDataflowViaDuckDB` to handle remote files by copying them locally.
- Updated `FileStreamConfig` to include `GetProp` and `SetProp` methods for managing properties.
- Added a `working_dir` property in `DuckDb` to specify the working directory for DuckDB processes.
- Adjusted DuckDB query generation to handle paths relative to the working directory.
- Added support for setting properties in `FileStreamConfig` and retrieving them in `DuckDB`.
- Implemented a mechanism to automatically clean up temporary local files after processing.
- Improved logging to provide more detailed information during file copy and processing.
- Modified `GetDataflowViaDuckDB` to use relative paths when `working_dir` is set.
- Changed the way incremental keys are handled in `MakeScanQuery`, excluding the reserved column `_sling_loaded_at`.

handle errors during remote file copy and duckdb processing
- Improved error handling for file copying, providing more specific error messages.
- Added error checking and handling throughout the remote file copy process to improve robustness.
- Modified the error handling to provide more context and details for better debugging.
- Improved error messages and logging to pinpoint issues during DuckDB processing.

♻️ refactor(dbio): optimize duckdb file processing and improve code structure
- Refactored code to enhance readability and maintainability.
- Optimized the code for better performance and efficiency in file handling.
- Improved code structure and organization for clarity.
- Updated function signatures and variable names to enhance code readability.
- Simplified the logic for processing files with DuckDB.
- Restructured the code to follow a more consistent style.
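
The commit message mentions a function that finds the deepest common parent path of the remote files, so that the local copies can keep their relative layout under the temporary directory. The diff itself is not shown here, so the following is only a minimal sketch of that idea in Go. The names `commonParent` and `dirSegments` are hypothetical and are not taken from the commit, and the approach (comparing `/`-separated segments) is an assumption about how such a helper might work.

```go
package main

import (
	"fmt"
	"strings"
)

// dirSegments splits a path or URI on "/" and drops the last segment
// (assumed to be the file name), leaving only the directory segments.
// Hypothetical helper, not taken from the commit.
func dirSegments(p string) []string {
	segs := strings.Split(strings.TrimRight(p, "/"), "/")
	return segs[:len(segs)-1]
}

// commonParent returns the deepest directory shared by all paths.
// It compares whole segments, so "data/2024" and "data/2023" share
// "data" rather than "data/202". Hypothetical helper.
func commonParent(paths []string) string {
	if len(paths) == 0 {
		return ""
	}
	common := dirSegments(paths[0])
	for _, p := range paths[1:] {
		segs := dirSegments(p)
		n := 0
		for n < len(common) && n < len(segs) && common[n] == segs[n] {
			n++
		}
		common = common[:n] // keep only the shared leading segments
	}
	return strings.Join(common, "/")
}

func main() {
	paths := []string{
		"s3://bucket/data/2024/01/a.parquet",
		"s3://bucket/data/2024/02/b.parquet",
		"s3://bucket/data/2023/c.parquet",
	}
	parent := commonParent(paths)
	fmt.Println(parent) // prints "s3://bucket/data"

	// Each file's location relative to the common parent can then be
	// reused as its location under a local working directory.
	for _, p := range paths {
		fmt.Println(strings.TrimPrefix(p, parent+"/"))
	}
}
```

Comparing whole segments instead of raw string prefixes avoids treating sibling directories with a shared name prefix as a common parent.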
Showing 3 changed files with 145 additions and 3 deletions.