James Turton edited this page Dec 1, 2021 · 19 revisions

Drill 2.0 Proposal

This page documents a proposal for Drill 2.0. At the time of writing, we are gearing up to release Drill 1.20.0, which means that we have released 19 versions of Drill since it was deemed stable enough to warrant a 1.0 label. Since this second phase of the project's life began, a number of breaking changes have been discussed. This page records the proposed breaking changes that could be included in a Drill 2.0.

Please feel free to add your ideas in the knowledge that a subset will ultimately be selected by the dev team as the basis for a 2.0 release. Changes recorded here need not necessarily be user-breaking. Anything that is a significant change from how Drill 1.x works is welcome.

APIs and connectors

Config system

  • Add a shared component that applies configuration priorities (... session opt > storage/format config opt > system opt ...) and make all plugins use this component for reading options.
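A minimal sketch of what such a shared component could look like is given below. It is illustrative only: `OptionResolver`, `Scope` and the option name `example.batch_size` are hypothetical and are not existing Drill classes or options. Only the three scopes named in the bullet above are modelled; the levels elided by "..." in the bullet are deliberately not filled in.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a shared option resolver; not part of Drill. */
final class OptionResolver {

    /** Scopes in priority order, highest first (assumed from the bullet above). */
    enum Scope { SESSION, PLUGIN_CONFIG, SYSTEM }

    private final Map<Scope, Map<String, Object>> layers = new EnumMap<>(Scope.class);

    OptionResolver() {
        for (Scope scope : Scope.values()) {
            layers.put(scope, new HashMap<>());
        }
    }

    /** Records a value for an option at the given scope. */
    OptionResolver set(Scope scope, String name, Object value) {
        layers.get(scope).put(name, value);
        return this;
    }

    /** Returns the value from the highest-priority scope that defines the option. */
    Optional<Object> resolve(String name) {
        // Scope.values() follows declaration order, which is the priority order.
        for (Scope scope : Scope.values()) {
            Object value = layers.get(scope).get(name);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        OptionResolver options = new OptionResolver()
            .set(Scope.SYSTEM, "example.batch_size", 1024)
            .set(Scope.SESSION, "example.batch_size", 4096);
        // The session value shadows the system value.
        System.out.println(options.resolve("example.batch_size").orElse(null));
    }
}
```

If every plugin read its options through one component of this kind, the precedence rules would live in a single place instead of being reimplemented per plugin.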

General

  • Remove deprecated code.

Query planner

  • Rebase on current Calcite and review our customisations.

Project structure, packaging and distribution

  • Split Drill's monorepo into multiple parts. The current repo would be the core while the contrib stuff could move to its own repo(s) under the Drill project.
  • Ensure we have a good way to build plugins separate from the Drill code.
  • Explain how to create a plugin in a user's own repo, built against Drill.
  • Explain how to drop the plugin into a running Drill to add that functionality.

Additional context for the three items above can be found in the conversation in #2359.

  • Split Drill installation packages into "core" and "extra".
  • Install plugins and UDFs from an online marketplace, a la the Eclipse marketplace.

SQL Functions

  • Unify nearestdate and date_trunc

Storage and format plugins

  • Scrap the columns[] array wherever it occurs (only TextReader?) in favour of distinct, numbered fields column1, column2, column3, ...
  • Increase consistency of available options across plugins
    • Column name and type information only allowed in provided schema not format config.
    • Standard pushdown enable/disable switches for storage plugins.
  • Use a leading underscore for all implicit fields. Some plugins and connectors do this but others don't, notably the file plugin in core Drill. This could lead to strange results if a file has a column called file.
  • In INFORMATION_SCHEMA, replace the hard-coded "DRILL" catalog with storage plugin names?
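The name clash behind the implicit fields item can be sketched as follows. This is a toy model, not Drill code: `ImplicitFieldDemo`, `withImplicitFields` and the field name `filename` are chosen purely for illustration, and rows are modelled as plain maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of implicit field naming; not part of Drill. */
final class ImplicitFieldDemo {

    /**
     * Returns a copy of the data row with the implicit fields added under
     * prefix + name, failing loudly if a data column already uses that name.
     */
    static Map<String, Object> withImplicitFields(Map<String, Object> dataRow,
                                                  Map<String, Object> implicitFields,
                                                  String prefix) {
        Map<String, Object> out = new LinkedHashMap<>(dataRow);
        for (Map.Entry<String, Object> field : implicitFields.entrySet()) {
            String name = prefix + field.getKey();
            if (out.containsKey(name)) {
                throw new IllegalStateException(
                    "Implicit field collides with data column: " + name);
            }
            out.put(name, field.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // A data file that happens to contain a column named like an implicit field.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("filename", "attachment.pdf");

        Map<String, Object> implicit = new LinkedHashMap<>();
        implicit.put("filename", "/data/2021/orders.csv");

        // With an underscore prefix both values survive under distinct names.
        System.out.println(withImplicitFields(row, implicit, "_"));

        // Without a prefix the two names clash.
        try {
            withImplicitFields(row, implicit, "");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The sketch raises an error on a clash only to make the problem visible; how a real reader should behave in that case is a separate design question.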

Vector layer

  • Do something about ObjectHolder.
  • Employ new Java SIMD intrinsics?
  • Drop union type?

Web UI

  • Perhaps a refresh might be in order?