Minor: Update documentation about crate organization #14304

Merged: 2 commits, Jan 27, 2025
28 changes: 25 additions & 3 deletions datafusion/core/src/lib.rs
@@ -624,19 +624,41 @@
//!
//! ## Crate Organization
//!
//! DataFusion is organized into multiple crates to enforce modularity
//! and improve compilation times. The crates are:
//! Most users interact with DataFusion via this crate (`datafusion`), which re-exports
//! all functionality needed to build and execute queries.
//!
//! Three other crates provide additional functionality and must be
//! depended on directly:
//! * [`datafusion_proto`]: Plan serialization and deserialization
//! * [`datafusion_substrait`]: Support for the Substrait plan serialization format
//! * [`datafusion_sqllogictest`]: The DataFusion SQL logic test runner
//!
//! [`datafusion_proto`]: https://crates.io/crates/datafusion-proto
//! [`datafusion_substrait`]: https://crates.io/crates/datafusion-substrait
//! [`datafusion_sqllogictest`]: https://crates.io/crates/datafusion-sqllogictest
//!
//! DataFusion is internally split into multiple sub-crates to
//! enforce modularity and improve compilation times. See the
//! [list of modules](#modules) for all available sub-crates. The major ones are:
//!
//! * [datafusion_common]: Common traits and types
//! * [datafusion_catalog]: Catalog APIs such as [`SchemaProvider`] and [`CatalogProvider`]
Contributor Author comment: There has been quite a bit of progress towards modularizing DataFusion since this section was first written ❤️

//! * [datafusion_execution]: State and structures needed for execution
//! * [datafusion_expr]: [`LogicalPlan`], [`Expr`] and related logical planning structures
//! * [datafusion_functions]: Scalar function packages
//! * [datafusion_functions_aggregate]: Aggregate functions such as `MIN`, `MAX`, `SUM`, etc.
//! * [datafusion_functions_nested]: Scalar function packages for `ARRAY`s, `MAP`s and `STRUCT`s
//! * [datafusion_functions_table]: Table functions such as `GENERATE_SERIES`
//! * [datafusion_functions_window]: Window functions such as `ROW_NUMBER`, `RANK`, etc.
//! * [datafusion_optimizer]: [`OptimizerRule`]s and [`AnalyzerRule`]s
//! * [datafusion_physical_expr]: [`PhysicalExpr`] and related expressions
//! * [datafusion_physical_plan]: [`ExecutionPlan`] and related operators
//! * [datafusion_physical_optimizer]: [`PhysicalOptimizerRule`]s
//! * [datafusion_sql]: SQL planner ([`SqlToRel`])
//!
//! [`SchemaProvider`]: datafusion_catalog::SchemaProvider
//! [`CatalogProvider`]: datafusion_catalog::CatalogProvider
//!
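
The re-export arrangement described in this section (one umbrella crate that most users depend on, backed by smaller sub-crates) can be illustrated with a minimal, self-contained sketch. This is not DataFusion's actual `lib.rs`: the modules `sub_common`, `sub_expr` and `facade`, and the functions `normalize_ident`, `col` and `lit`, are hypothetical stand-ins, with plain modules playing the role of separate crates so the example compiles without dependencies.

```rust
// Hypothetical stand-in for a low-level sub-crate (think "common types").
mod sub_common {
    /// Lowercases an identifier (illustrative helper only).
    pub fn normalize_ident(name: &str) -> String {
        name.to_lowercase()
    }
}

// Hypothetical stand-in for a sub-crate that builds on the common one.
mod sub_expr {
    #[derive(Debug, PartialEq)]
    pub enum Expr {
        Column(String),
        Literal(i64),
    }

    /// Builds a column reference using the helper from `sub_common`.
    pub fn col(name: &str) -> Expr {
        Expr::Column(super::sub_common::normalize_ident(name))
    }

    /// Builds an integer literal.
    pub fn lit(value: i64) -> Expr {
        Expr::Literal(value)
    }
}

// Hypothetical stand-in for the umbrella crate: it defines nothing of its
// own here and only re-exports the sub-modules under shorter paths.
mod facade {
    pub mod common {
        pub use super::super::sub_common::*;
    }
    pub mod logical_expr {
        pub use super::super::sub_expr::*;
    }
}

fn main() {
    // The same item is reachable through the facade and directly.
    let via_facade = facade::logical_expr::col("A");
    let direct = sub_expr::col("a");
    assert_eq!(via_facade, direct);

    assert_eq!(facade::common::normalize_ident("Price"), "price");
    println!("{:?} {:?}", via_facade, facade::logical_expr::lit(1));
}
```

Because `pub use` re-exports the item itself rather than a copy, `facade::logical_expr::Expr` and `sub_expr::Expr` are the same type, so code written against the umbrella path stays compatible with code that depends on the sub-module directly.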
//! ## Citing DataFusion in Academic Papers
//!
//! You can use the following citation to reference DataFusion in academic papers: