Apache Pinot Release 1.0.0

@saurabhd336 saurabhd336 released this 12 Sep 08:21
· 2124 commits to master since this release

What's changed

Major features and updates

Multistage Engine

Multistage engine new features:

Support for Window Functions
Initial (phase 1) Query runtime for window functions with ORDER BY within the OVER() clause (#10449)
Support for the ranking ROW_NUMBER() window function (#10527, #10587)
Set Operations Support
Support SetOperations (UNION, INTERSECT, MINUS) compilation in query planner (#10535)
Timestamp and Date Operations
Support TIMESTAMP type and date ops functions (#11350)
Aggregate Functions
Support more aggregation functions that are currently implementable (#11208)
Support multi-value aggregation functions (#11216)
Support Sketch based functions (#)
Make Intermediate Stage Worker Assignment Tenant Aware (#10617)
Evaluate literal expressions during query parsing, enabling more efficient query execution. (#11438)
Added support for partition parallelism in partitioned table scans, allowing for more efficient data retrieval (#11266).
[multistage] Adding more tuple sketch scalar functions and integration tests (#11517)

Multistage engine enhancements

Turn on v2 engine by default (#10543)
Introduced the ability to stream leaf stage blocks for more efficient data processing (#11472).
Early terminate SortOperator if there is a limit (#11334)
Implement ordering for SortExchange (#10408)
Table level Access Validation, QPS Quota, Phase Metrics for multistage queries (#10534)
Support partition based leaf stage processing (#11234)
Populate queryOption down to leaf (#10626)
Pushdown explain plan queries from the controller to the broker (#10505)
Enhanced the multi-stage group-by executor to support limiting the number of groups, improving query performance and resource utilization (#11424).
Improved resilience and reliability of the multi-stage join operator, now with added support for hash join right table protection (#11401).

Multistage engine bug fixes

Fix Predicate Pushdown by Using Rule Collection (#10409)
Try fixing mailbox cancel race condition (#10432)
Catch Throwable to Propagate Proper Error Message (#10438)
Fix tenant detection issues (#10546)
Handle Integer.MIN_VALUE in hashCode based FieldSelectionKeySelector (#10596)
Improve error message in case of non-existent table queried from the controller (#10599)
Derive SUM return type to be PostgreSQL compatible (#11151)

Index SPI

Adds the ability to include new index types at runtime in Apache Pinot. This makes it possible to add third-party indexes, including proprietary ones. More details here

Null value support for pinot queries

NULL support for ORDER BY, DISTINCT, GROUP BY, value transform functions and filtering.

Upsert enhancements

Delete support in upsert enabled tables (#10703)

Support added to extend upserts and allow deleting records from a realtime table. The design details can be found here.

Preload segments with upsert snapshots to speedup table loading (#11020)

Adds a feature to preload segments from a table that uses the upsert snapshot feature. Segments with validDocIds snapshots can be preloaded more efficiently, which speeds up table loading (and thus server restarts).

TTL configs for upsert primary keys (#10915)

Adds support for specifying expiry TTL for upsert primary key metadata cleanup.
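
As a rough illustration of what a TTL on primary key metadata means, the sketch below drops keys whose last comparison value has fallen more than `ttl` behind the largest value seen so far. This is a conceptual sketch only and not Pinot's implementation: the function name, the dictionary layout, and the choice of "largest comparison value seen" as the clock are all assumptions made for the example.

```python
from typing import Dict


def expire_primary_keys(last_seen: Dict[str, int], ttl: int) -> Dict[str, int]:
    """Keep only keys whose last comparison value is within `ttl` of the largest one.

    `last_seen` maps a primary key to the comparison value (for example an
    event timestamp) of its latest record. Hypothetical structure.
    """
    if not last_seen:
        return {}
    # Anything older than this watermark is treated as expired.
    watermark = max(last_seen.values()) - ttl
    return {pk: value for pk, value in last_seen.items() if value >= watermark}


metadata = {"order-1": 100, "order-2": 950, "order-3": 1000}
kept = expire_primary_keys(metadata, ttl=100)
```

With a TTL of 100, the watermark is 900, so only `order-2` and `order-3` keep their metadata.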

Segment compaction for upsert real-time tables (#10463)

Adds a new minion task to compact segments belonging to a real-time table with upserts.

Pinot Spark Connector for Spark3 (#10394)

  • Added spark3 support for Pinot Spark Connector (#10394)
  • Also added support to pass pinot query options to spark connector (#10443)

PinotDataBufferFactory and new PinotDataBuffer implementations (#10528)

Adds new implementations of PinotDataBuffer that use the Unsafe Java APIs and the foreign memory APIs. Also adds PinotDataBufferFactory, which allows plugging in custom PinotDataBuffer implementations.

Query functions enhancements

  • Add PercentileKLL aggregation function (#10643)
  • Support for ARG_MIN and ARG_MAX Functions (#10636)
  • refactor argmin/max to exprmin/max and make it calcite compliant (#11296)
  • Integer Tuple Sketch support (#10427)
  • Adding vector scalar functions (#11222)
  • [feature] multi-value datetime transform variants (#10841)
  • FUNNEL_COUNT Aggregation Function (#10867)
  • [multistage] Add support for RANK and DENSE_RANK ranking window functions (#10700)
  • add theta sketch scalar (#11153)
  • Register dateTimeConverter,timeConvert,dateTrunc, regexpReplace to v2 functions (#11097)
  • Add extract(quarter/dow/doy) support (#11388)
  • Funnel Count - Multiple Strategies (no partitioning requisites) (#11092)
  • Add Boolean assertion transform functions. (#11547)

JSON and CLP encoded message ingestion and querying

  • Add clpDecode transform function for decoding CLP-encoded fields. (#10885)
  • Add CLPDecodeRewriter to make it easier to call clpDecode with a column-group name rather than the individual columns. (#11006)
  • Add SchemaConformingTransformer to transform records with varying keys to fit a table's schema without dropping fields. (#11210)

Tier level index config override (#10553)

Allows overriding index configs at the tier level, so that different tiers can use different index configurations.

Ingestion connectors and features

  • Kinesis stream header extraction (#9713)
  • Extract record keys, headers and metadata from Pulsar sources (#10995)
  • Realtime pre-aggregation for Distinct Count HLL & Big Decimal (#10926)
  • Added support to skip unparseable records in the csv record reader (#11487)
  • Null support for protobuf ingestion. (#11553)

UI enhancements

  • Adds persistence of authentication details in the browser session. This means that even if you refresh the app, you will still be logged in until the authentication session expires (#10389)
  • AuthProvider logic updated to decode the access token and extract user name and email. This information will now be available in the app for features to consume. (#10925)

Pinot docker image improvements and enhancements

  • Make Pinot base build and runtime images support Amazon Corretto and MS OpenJDK (#10422)
  • Support multi-arch pinot docker image (#10429)
  • Update dockerfile with recent jdk distro changes (#10963)

Operational improvements

Rebalance

  • Rebalance status API (#10359)
  • Tenant-level rebalance and status tracking APIs (#11128)

Config to use customized broker query thread pool (#10614)

Adds the configuration options below, which enable a bounded thread pool for broker queries and set its capacities.

pinot.broker.enable.bounded.http.async.executor
pinot.broker.http.async.executor.max.pool.size
pinot.broker.http.async.executor.core.pool.size
pinot.broker.http.async.executor.queue.size

This feature allows better management of broker resources.
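
A broker configuration using these options might look like the fragment below. The property names come from the list above. The values are illustrative assumptions, not recommended or default settings, and treating the first option as a boolean switch is inferred from its name.

```properties
# Illustrative values only (assumptions); tune for your deployment.
pinot.broker.enable.bounded.http.async.executor=true
pinot.broker.http.async.executor.core.pool.size=16
pinot.broker.http.async.executor.max.pool.size=32
pinot.broker.http.async.executor.queue.size=1000
```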

Drop results support (#10419)

Adds a parameter to queryOptions to drop the resultTable from the response. This mode can be used to troubleshoot a query (which may have sensitive data in the result) using metadata only.
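
A minimal sketch of how a client might build such a request body is shown below. The option key `dropResults`, the `queryOptions` string format, and the JSON body shape are assumptions made for this example, as is the helper name. Check the query options documentation for the exact key before relying on it.

```python
import json


def build_query_payload(sql: str, drop_results: bool = False) -> str:
    """Build a JSON query body, optionally asking the broker to omit the result table."""
    payload = {"sql": sql}
    if drop_results:
        # Assumed option key; the response would then carry metadata only.
        payload["queryOptions"] = "dropResults=true"
    return json.dumps(payload)


body = build_query_payload("SELECT COUNT(*) FROM myTable", drop_results=True)
```

Here `myTable` is a placeholder table name.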

Make column order deterministic in segment (#10468)

In segment metadata and the index map, columns are now stored in alphabetical order so that the result is deterministic. Segments generated before and after this PR will have different CRCs, so during the upgrade, old and new consuming servers may produce segments with different CRCs. For segments consumed during the upgrade, some downloads might be needed.
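
The effect of column ordering on a checksum can be sketched as follows. This is a toy model, not Pinot's segment format: the metadata layout, the column names, and the use of `zlib.crc32` are assumptions chosen to show why sorting makes the checksum independent of the order in which columns were encountered.

```python
import zlib
from typing import Dict


def metadata_crc(columns: Dict[str, str], sort_columns: bool) -> int:
    """Compute a CRC32 over a toy 'name=type' metadata listing."""
    # Without sorting, the listing follows dict insertion order.
    names = sorted(columns) if sort_columns else list(columns)
    payload = "\n".join(f"{name}={columns[name]}" for name in names)
    return zlib.crc32(payload.encode("utf-8"))


# The same two columns, discovered in a different order.
first = {"userId": "INT", "country": "STRING"}
second = {"country": "STRING", "userId": "INT"}

same_when_sorted = metadata_crc(first, True) == metadata_crc(second, True)
```

With sorting enabled, both inputs serialize to identical bytes and so share a CRC. Without it, the serialized bytes differ, and the CRCs will generally differ as well.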

Allow configuring helix timeouts for EV dropped in Instance manager (#10510)

Adds options to configure Helix timeouts:

external.view.dropped.max.wait.ms - The duration of time in milliseconds to wait for the external view to be dropped. Default - 20 minutes.
external.view.check.interval.ms - The period in milliseconds in which to ping ZK for latest EV state.

Enable case insensitivity by default (#10771)

This PR makes Pinot case insensitive by default and removes the deprecated property enable.case.insensitive.pql.

Newly added APIs and client methods

  • Add Server API to get tenant pools (#11273)
  • Add new broker query point for querying multi-stage engine (#11341)
  • Add a new controller endpoint for segment deletion with a time window (#10758)
  • New API to get tenant tags (#10937)
  • Instance retag validation check api (#11077)
  • Use PUT request to enable/disable table/instance (#11109)
  • Update the pinot tenants tables api to support returning broker tagged tables (#11184)
  • Add requestId for BrokerResponse in pinot-broker and java-client (#10943)
  • Provide results in CompletableFuture for java clients and expose metrics (#10326)

Cleanup and backward incompatible changes

High level consumers are no longer supported

  • Cleanup HLC code (#11326)
  • Remove support for High level consumers in Apache Pinot (#11017)

Type information preservation of query literals

  • [feature] [backward-incompat] [null support # 2] Preserve null literal information in literal context and literal transform (#10380)
    String versions of numerical values are no longer accepted. For example, "123" will no longer be treated as a numeric value.

Controller job status ZNode path update

  • Moving Zk updates for reload, force_commit to their own Znodes which … (#10451)
    The status of previously completed reload jobs will not be available after this change is deployed.

Metric names for mutable indexes to change

  • Implement mutable index using index SPI (#10687)
    Due to a change in the IndexType enum used for some logs and metrics in mutable indexes, the metric names may change slightly.

Update in controller API to enable / disable / drop instances

  • Update getTenantInstances call for controller and separate POST operations on it (#10993)

Change in substring query function definition

  • Change substring to comply with standard sql definition (#11502)
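
To make the difference concrete, the sketch below contrasts standard SQL substring semantics (1-based start position plus a length) with a 0-based begin/end style. Treating the 0-based form as the earlier behaviour is an assumption made for illustration, and both function names are hypothetical. Consult the PR for the exact before and after signatures.

```python
def substring_sql(text: str, start: int, length: int) -> str:
    """Standard SQL style: 1-based start position and a length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # Convert the 1-based [start, start + length) window to 0-based indexes,
    # clamping positions that fall before the first character.
    begin = max(start - 1, 0)
    end = max(start - 1 + length, 0)
    return text[begin:end]


def substr_zero_based(text: str, begin: int, end: int) -> str:
    """Assumed legacy style: 0-based begin index and exclusive end index."""
    return text[begin:end]


sql_style = substring_sql("pinot", 1, 3)
zero_based = substr_zero_based("pinot", 1, 3)
```

The same arguments `(1, 3)` select `"pin"` under the SQL definition but `"in"` under the 0-based form, so queries written against one convention need their arguments adjusted for the other.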

Full list of features added

  • Allow queries on multiple tables of same tenant to be executed from controller UI #10336
  • Encapsulate changes in IndexLoadingConfig and SegmentGeneratorConfig #10352
  • [Index SPI] IndexType (#10191)
  • Simplify filtered aggregate transform operator creation (#10410)
  • Introduce BaseProjectOperator and ValueBlock (#10405)
  • Add support to create realtime segment in local (#10433)
  • Refactor: Pass context instead on individual arguments to operator (#10413)
  • Add "processAll" mode for MergeRollupTask (#10387)
  • Upgrade h2 version from 1.x to 2.x (#10456)
  • Added optional force param to the table configs update API (#10441)
  • Enhance broker reduce to handle different column names from server response (#10454)
  • Adding fields to enable/disable dictionary optimization. (#10484)
  • Remove converted H2 type NUMERIC(200, 100) from BIG_DECIMAL (#10483)
  • Add JOIN support to PinotQuery (#10421)
  • Add testng on verifier (#10491)
  • Clean up temp consuming segment files during server start (#10489)
  • make pinot k8s sts and deployment start command configurable (#10509)
  • Fix Bottleneck for Server Bootstrap by Making maxConnsPerRoute Configurable (#10487)
  • Type match between resultType and function's dataType (#10472)
  • create segment zk metadata cache (#10455)
  • Allow ValueBlock length to increase in TransformFunction (#10515)
  • Allow configuring helix timeouts for EV dropped in Instance manager (#10510)
  • Enhance error reporting (#10531)
  • Combine "GET /segments" API & "GET /segments/{tableName}/select" (#10412)
  • Exposed the CSV header map as part of CSVRecordReader (#10542)
  • Moving Zk updates for reload,force_commit to their own Znodes which will spread out Zk write load across jobTypes (#10451)
  • Enabling dictionary override optimization on the segment reload path as well. (#10557)
  • Make broker's rest resource packages configurable (#10588)
  • Check EV not exist before allowing creating the table (#10593)
  • Adding a parameter (toSegments) to the endSegmentReplacement API (#10630)
  • update target tier for segments if tierConfigs is provided (#10642)
  • Add support for custom compression factor for Percentile TDigest aggregation functions (#10649)
  • Utility to convert table config into updated format (#10623)
  • Segment lifecycle event listener support (#10536)
  • Add server metrics to capture gRPC activity (#10678)
  • Separate and parallelize BloomFilter based segment pruner (#10660)
  • API to expose the contract/rules imposed by pinot on tableConfig #10655
  • Add description field to metrics in Pinot (#10744)
  • changing the dedup store to become pluggable #10639
  • Make the TimeUnit in the DATETRUNC function case insensitive. (#10750)
  • [feature] Consider tierConfigs when assigning new offline segment #10746
  • Compress idealstate according to estimated size #10766
  • 10689: Update for pinot helm release version 0.2.7 (#10723)
  • Fail the query if a filter's rhs contains NULL. (#11188)
  • Support Off Heap for Native Text Indices (#10842)
  • refine segment reload executor to avoid creating threads unbounded #10837
  • compress nullvector bitmap upon seal (#10852)
  • Enable case insensitivity by default (#10771)
  • Push out-of-order events metrics for full upsert (#10944)
  • [feature] add requestId for BrokerResponse in pinot-broker and java-client #10943
  • Provide results in CompletableFuture for java clients and expose metrics #10326
  • Add minion observability for segment upload/download failures (#10978)
  • Enhance early terminate for combine operator (#10988)
  • Add fromController method that accepts a PinotClientTransport (#11013)
  • Ensure min/max value generation in the segment metadata. (#10891)
  • Apply some allocation optimizations on GrpcSendingMailbox (#11015)
  • When case insensitivity is enabled, don't allow adding a new column whose lowercase name matches that of an existing column. (#10991)
  • Replace Long attributes with primitive values to reduce boxing (#11059)
  • retry KafkaConsumer creation in KafkaPartitionLevelConnectionHandler.java (#253) (#11040)
  • Support for new dataTime format in DateTimeGranularitySpec without explicitly setting size (#11057)
  • Returning 403 status code in case of authorization failures (#11136)
  • Simplify compatible test to avoid test against itself (#11163)
  • Updated code for setting value of segment min/max property. (#10990)
  • Add stat to track number of segments that have valid doc id snapshots (#11110)
  • Add brokerId and brokerReduceTimeMs to the broker response stats (#11142)
  • safely multiply integers to prevent overflow (#11186)
  • Move largest comparison value update logic out of map access (#11157)
  • Optimize DimensionTableDataManager to abort unnecessary loading (#11192)
  • Refine isNullsLast and isAsc functions. (#11199)
  • Update the pinot tenants tables api to support returning broker tagged tables (#11184)
  • add multi-value support for native text index (#11204)
  • Add percentiles report in QuerySummary (#11299)
  • Add meter for broker responses with unavailable segments (#11301)
  • Enhance Minion task management (#11315)
  • add additional lucene index configs (#11354)
  • Add DECIMAL data type to orc record reader (#11377)
  • add configuration to fail server startup on non-good status checker (#11347)
  • allow passing freshness checker after an idle threshold (#11345)
  • Add broker validation for hybrid tableConfig creation (#7908)
  • Support partition parallelism for partitioned table scan (#11266)

Vulnerability fixes, bugfixes, cleanups and deprecations

  • Remove support for High level consumers in Apache Pinot (#11017)
  • Fix JDBC driver check for username (#10416)
  • [Clean up] Remove getColumnName() from AggregationFunction interface (#10431)
  • fix jersey TerminalWriterInterceptor MessageBodyWriter not found issue (#10462)
  • Bug fix: Start counting operator execution time from first NoOp block (#10450)
  • Fix unavailable instances issues for StrictReplicaGroup (#10466)
  • Change shell to bash (#10469)
  • Fix the double destroy of segment data manager during server shutdown (#10475)
  • Remove "isSorted()" precondition check in the ForwardIndexHandler (#10476)
  • Fix null handling in streaming selection operator (#10453)
  • Fix jackson dependencies (#10477)
  • Startree index build enhancement (#10905)
  • optimize queries where lhs and rhs of predicate are equal (#10444)
  • Trivial fix on a warning detected by static checker (#10492)
  • wait for full segment commit protocol on force commit (#10479)
  • Fix bug and add test for noDict -> Dict conversion for sorted column (#10497)
  • Make column order deterministic in segment (#10468)
  • Type match between resultType and function's dataType (#10472)
  • Allow empty segmentsTo for segment replacement protocol (#10511)
  • Use string as default compatible type for coalesce (#10516)
  • Use threadlocal variable for genericRow to make the MemoryOptimizedTable threadsafe (#10502)
  • Fix shading in spark2 connector pom file (#10490)
  • Fix ramping delay caused by long lasting sequence of unfiltered messa… (#10418)
  • Do not serialize metrics in each Operator (#10473)
  • Make pinot-controller apply webpack production mode when bin-dist profile is used. (#10525)
  • Fix FS props handling when using /ingestFromUri (#10480)
  • Clean up v0_deprecated batch ingestion jobs (#10532)
  • Deprecate kafka 0.9 support (#10522)
  • safely multiply integers to prevent overflow (#11186)
  • Reduce timeout for codecov and not fail the job in any case (#10547)
  • Fix DataTableV3 serde bug for empty array (#10583)
  • Do not record operator stats when tracing is enabled (#10447)
  • Forward auth token for logger APIs from controller to other controllers and brokers (#10590)
  • Bug fix: Partial upsert default strategy is null (#10610)
  • Fix flaky test caused by EV check during table creation (#10616)
  • Fix withDissabledTrue typo (#10624)
  • Cleanup unnecessary mailbox id ser/de (#10629)
  • no error metric for queries where all segments are pruned (#10589)
  • bug fix: to keep QueryParser thread safe when handling many read requests on class RealtimeLuceneTextIndex (#10620)
  • Fix static DictionaryIndexConfig.DEFAULT_OFFHEAP being actually onheap (#10632)
  • 10567: [cleanup pinot-integration-test-base], clean query generations and some other refactoring. (#10648)
  • Fixes backward incompatibility with SegmentGenerationJobSpec for segment push job runners (#10645)
  • Bug fix to get the toSegments list correctly (#10659)
  • 10661: Fix for failing numeric comparison in where clause for IllegalStateException. (#10662)
  • Fixes partial upsert not reflecting multiple comparison column values (#10693)
  • Fix Bug in Reporting Timer Value for Min Consuming Freshness (#10690)
  • Fix typo of rowSize -> columnSize (#10699)
  • update segment target tier before table rebalance (#10695)
  • Fix a bug in star-tree filter operator which can incorrectly filter documents (#10707)
  • Enhance the instrumentation for a corner case where the query doesn't go through DocIdSetOp (#10729)
  • bug fix: add missing properties when edit instance config (#10741)
  • Making segmentMapper do the init and cleanup of RecordReader (#10874)
  • Fix githubEvents table for quickstart recipes (#10716)
  • Minor Realtime Segment Commit Upload Improvements (#10725)
  • Return 503 for all interrupted queries. Refactor the query killing code. (#10683)
  • Add decoder initialization error to the server's error cache (#10773)
  • bug fix: add @JsonProperty to SegmentAssignmentConfig (#10759)
  • ensure we wait the full no query timeout before shutting down (#10784)
  • Clean up KLL functions with deprecated convention (#10795)
  • Redefine the semantics of SEGMENT_STREAMED_DOWNLOAD_UNTAR_FAILURES metric to count individual segment fetch failures. (#10777)
  • Fix exception during exchange routing that causes a stuck pipeline (#10802)
  • [bugfix] fix floating point and integral type backward incompatible issue (#10650)
  • [pinot-core] Start consumption after creating segment data manager (#11227)
  • Fix IndexOutOfBoundException in filtered aggregation group-by (#11231)
  • Fix null pointer exception in segment debug endpoint #11228
  • Clean up RangeIndexBasedFilterOperator. (#11219)
  • Fix the escape/unescape issue for property value in metadata (#11223)
  • Fix a bug in the order by comparator (#10818)
  • Keeps nullness attributes of merged in comparison column values (#10704)
  • Add required JSON annotation in H3IndexResolution (#10792)
  • Fix a bug in SELECT DISTINCT ORDER BY. (#10827)
  • jsonPathString should return null instead of string literal "null" (#10855)
  • Bug Fix: Segment Purger cannot purge old segments after schema evolution (#10869)
  • Fix #10713 by giving metainfo more priority than config (#10851)
  • Close PinotFS after Data Manager Shutdowns (#10888)
  • bump awssdk version for a bugfix on http conn leakage (#10898)
  • Fix MultiNodesOfflineClusterIntegrationTest.testServerHardFailure() (#10909)
  • Fix a bug in SELECT DISTINCT ORDER BY LIMIT. (#10887)
  • Fix an integer overflow bug. (#10940)
  • Return true when _resultSet is not null (#10899)
  • Fixing table name extraction for lateral join queries (#10933)
  • Fix casting when prefetching mmap'd segment larger than 2GB (#10936)
  • Null check before closing reader (#10954)
  • Fixes SQL wildcard escaping in LIKE queries (#10897)
  • [Clean up] Do not count DISTINCT as aggregation (#10985)
  • do not readd lucene readers to queue if segment is destroyed #10989
  • Message batch ingestion lag fix (#10983)
  • Fix a typo in snapshot lock (#11007)
  • When extracting root-level field name for complex type handling, use the whole delimiter (#11005)
  • update jersey to fix Denial of Service (DoS) (#11021)
  • Update getTenantInstances call for controller and separate POST operations on it (#10993)
  • update freemarker to fix Server-side Template Injection (#11019)
  • format double 0 properly to compare with h2 results (#11049)
  • Fix double-checked locking in ConnectionFactory (#11014)
  • Remove presto-pinot-driver and pinot-java-client-jdk8 module (#11051)
  • Make RequestUtils always return a string array when getTableNames (#11069)
  • Fix BOOL_AND and BOOL_OR result type (#11033)
  • [cleanup] Consolidate some query and controller/broker methods in integration tests (#11064)
  • Fix grpc regression on multi-stage engine (#11086)
  • Delete an obsolete TODO. (#11080)
  • Minor fix on AddTableCommand.toString() (#11082)
  • Allow using Lucene text indexes on mutable MV columns. (#11093)
  • Allow offloading multiple segments from same table in parallel (#11107)
  • Added serviceAccount to minion-stateless (#11095)
  • Bug fix: TableUpsertMetadataManager is null (#11129)
  • Fix reload bug (#11131)
  • Allow extra aggregation types in RealtimeToOfflineSegmentsTask (#10982)
  • Fix a bug when use range index to solve EQ predicate (#11146)
  • Sanitise API inputs used as file path variables (#11132)
  • Fix NPE when nested query doesn't have gapfill (#11155)
  • Fix the NPE when query response error stream is null (#11154)
  • Make interface methods non private, for java 8 compatibility (#11164)
  • Increment nextDocId even if geo indexing fails (#11158)
  • Fix the issue of consuming segment entering ERROR state due to stream connection errors (#11166)
  • In TableRebalancer, remove instance partitions only when reassigning instances (#11169)
  • Remove JDK 8 unsupported code (#11176)
  • Fix compat test by adding -am flag to build pinot-integration-tests (#11181)
  • dont duplicate register scalar function in CalciteSchema (#11190)
  • Fix the storage quota check for metadata push (#11193)
  • Delete filtering NULL support dead code paths. (#11198)
  • [bugfix] Do not move real-time segments to working dir on restart (#11226)
  • Fix a bug in ExpressionScanDocIdIterator for multi-value. (#11253)
  • Exclude NULLs when PredicateEvaluator::isAlwaysTrue is true. (#11261)
  • UI: fix sql query options separator (#10770)
  • Fix a NullPointerException bug in ScalarTransformFunctionWrapper. (#11309)
  • [refactor] improve disk read for partial upsert handler (#10927)
  • Fix the wrong query time when the response is empty (#11349)
  • getMessageAtIndex should actually return the value in the streamMessage for compatibility (#11355)
  • Remove presto jdk8 related dependencies (#11285)
  • Remove special routing handling for multiple consuming segments (#11371)
  • Properly handle shutdown of TableDataManager (#11380)
  • Fixing the stale pinot ServerInstance in _tableTenantServersMap (#11386)
  • Fix the thread safety issue for mutable forward index (#11392)
  • Fix RawStringDistinctExecutor integer overflow (#11403)
  • [logging] fix consume rate logging bug to respect 1 minute threshold (#11421)