branch-3.0: [fix](planner) fix core when select and filter by slot in old planner #46541 #46637

Merged: yiguolei merged 1 commit into branch-3.0 from auto-pick-46541-branch-3.0 on Jan 8, 2025

@github-actions github-actions bot commented Jan 8, 2025

Cherry-picked from #46541

…#46541)

### What problem does this PR solve?

Problem Summary:
In 2.1.7, if a SQL statement fails to parse in the Nereids planner, the query
falls back to the old planner. The old planner may then create a `SelectNode`
whose conjuncts are not all of boolean type.
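
The idea of rejecting non-boolean conjuncts before a filter node is built can be sketched as follows. This is only an illustration of the kind of check involved, not the change made in this PR: `Expr`, `ExprType`, and `find_non_boolean_conjuncts` are hypothetical names, and the real planner code lives in the Java FE, not in C++.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified expression model; these are not Doris classes.
enum class ExprType { Boolean, Int, Other };

struct Expr {
    std::string sql;  // textual form of the expression, e.g. "c.c_id"
    ExprType type;    // result type of the expression
};

// Returns every conjunct whose result type is not boolean.
// A planner could refuse to build a filter node when this list is non-empty.
std::vector<Expr> find_non_boolean_conjuncts(const std::vector<Expr>& conjuncts) {
    std::vector<Expr> bad;
    for (const auto& e : conjuncts) {
        if (e.type != ExprType::Boolean) {
            bad.push_back(e);
        }
    }
    return bad;
}
```

With the two conjuncts from the plan in this description, `c.c_id` (an integer slot) and `__DORIS_DELETE_SIGN__ = 0` (a boolean comparison), the function would return only the first one.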

Reproduce:
For the table creation SQL, see #46498.

Query 1:
select b.c_id
from dbgr as b
left join
(select c.c_id from lo where event_date between 20220500 and 20220600 limit 100) c
on c.c_id
LIMIT 200;

Query 2:
select b.c_id
from dbgr as b
left join
(select c.c_id from lo) c
on c.c_id
LIMIT 0, 200;

Because of `select c.c_id`, which references the alias `c` inside the subquery that defines it, these statements fall back to the old planner.
Because `on c.c_id` is not of boolean type, the BE crashes with a core dump.
Part of the query plan is as follows:

|   1:VOlapScanNode                                                           |
|      TABLE: test.lo(lo), PREAGGREGATION: ON                                 |
|      PREDICATES: (`c`.`c_id` AND (`test`.`lo`.`__DORIS_DELETE_SIGN__` = 0)) |
|      partitions=1/3 (p_202206)                                              |
|      tablets=2/2, tabletList=89678,89680                                    |
|      cardinality=46, avgRowSize=165.54349, numNodes=1                       |
|      pushAggOp=NONE                                                         |
+-----------------------------------------------------------------------------+

A fatal log is as follows:

F20241219 23:13:23.457937 33282 assert_cast.h:58] Bad cast from type:doris::vectorized::ColumnVector<int> to doris::vectorized::ColumnVector<unsigned char>
*** Check failure stack trace: ***
    @     0x55bfa043b956  google::LogMessageFatal::~LogMessageFatal()
    @     0x55bf6f3bc070  assert_cast<>()
    @     0x55bf8978d767  doris::vectorized::VExprContext::execute_conjuncts()
    @     0x55bf8978c463  doris::vectorized::VExprContext::execute_conjuncts_and_filter_block()
    @     0x55bf8978bf72  doris::vectorized::VExprContext::filter_block()
    @     0x55bfa035b8e4  doris::pipeline::SelectOperatorX::pull()
    @     0x55bf9fee2b62  doris::pipeline::StreamingOperatorX<>::get_block()
    @     0x55bf9feab54b  doris::pipeline::OperatorXBase::get_block_after_projects()
    @     0x55bfa03dd07c  doris::pipeline::PipelineXTask::execute()
    @     0x55bfa0413e85  doris::pipeline::TaskScheduler::_do_work()
    @     0x55bfa0417dcb  doris::pipeline::TaskScheduler::start()::$_0::operator()()
    @     0x55bfa0417d55  std::__invoke_impl<>()
    @     0x55bfa0417d05  _ZSt10__invoke_rIvRZN5doris8pipeline13TaskScheduler5startEvE3$_0JEENSt9enable_ifIX16is_invocable_r_vIT_T0_DpT1_EES6_E4typeEOS7_DpOS8_
    @     0x55bfa0417bcd  std::_Function_handler<>::_M_invoke()
    @     0x55bf6e6c9b63  std::function<>::operator()()
    @     0x55bf7289e209  doris::FunctionRunnable::run()
    @     0x55bf728899c0  doris::ThreadPool::dispatch_thread()
    @     0x55bf728b0c24  std::__invoke_impl<>()
    @     0x55bf728b0afd  std::__invoke<>()
    @     0x55bf728b0a85  _ZNSt5_BindIFMN5doris10ThreadPoolEFvvEPS1_EE6__callIvJEJLm0EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @     0x55bf728b092e  std::_Bind<>::operator()<>()
    @     0x55bf728b0845  std::__invoke_impl<>()
    @     0x55bf728b07e5  _ZSt10__invoke_rIvRSt5_BindIFMN5doris10ThreadPoolEFvvEPS2_EEJEENSt9enable_ifIX16is_invocable_r_vIT_T0_DpT1_EESA_E4typeEOSB_DpOSC_
    @     0x55bf728b048d  std::_Function_handler<>::_M_invoke()
    @     0x55bf6e6c9b63  std::function<>::operator()()
    @     0x55bf728521fc  doris::Thread::supervise_thread()
    @     0x7f4260614ea5  start_thread
    @     0x7f42610439fd  __clone
    @              (nil)  (unknown)
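
The failure mode in this log can be reproduced in miniature. The sketch below is a simplified model, assuming a filter step that requires one byte per row: `IColumn` and `ColumnVector` here are hypothetical stand-ins, not the Doris classes, and `checked_cast` throws where the real `assert_cast` aborts the process with a fatal log.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for a column hierarchy; not the Doris classes.
struct IColumn {
    virtual ~IColumn() = default;
};

template <typename T>
struct ColumnVector : IColumn {
    std::vector<T> data;
};

// Checked downcast: throws on a type mismatch instead of aborting.
template <typename To>
const To& checked_cast(const IColumn& col) {
    const To* p = dynamic_cast<const To*>(&col);
    if (p == nullptr) {
        throw std::runtime_error("Bad cast");
    }
    return *p;
}

// A filter step that expects a column of one-byte flags, one per row.
std::size_t count_selected(const IColumn& filter) {
    const auto& flags = checked_cast<ColumnVector<std::uint8_t>>(filter);
    std::size_t selected = 0;
    for (std::uint8_t f : flags.data) {
        selected += (f != 0) ? 1 : 0;
    }
    return selected;
}
```

Passing a `ColumnVector<std::uint8_t>` works, while passing a `ColumnVector<int>` (what a conjunct such as `c.c_id` evaluates to) makes `count_selected` throw, which mirrors the `ColumnVector<int>` to `ColumnVector<unsigned char>` mismatch reported above.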

And another:

F20250108 13:07:05.275424 184257 assert_cast.h:58] Bad cast from type:doris::vectorized::ColumnVector<int> to doris::vectorized::ColumnVector<unsigned char>
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/tc/be/src/common/signal_handler.h:421
 1# 0x00007FB73FB31400 in /lib64/libc.so.6
 2# __GI_raise in /lib64/libc.so.6
 3# abort in /lib64/libc.so.6
 4# 0x000055CDAAF0090D in /usr/local/service/doris/lib/be/doris_be
 5# google::LogMessage::SendToLog() in /usr/local/service/doris/lib/be/doris_be
 6# google::LogMessage::Flush() in /usr/local/service/doris/lib/be/doris_be
 7# google::LogMessageFatal::~LogMessageFatal() in /usr/local/service/doris/lib/be/doris_be
 8# doris::vectorized::ColumnVector<unsigned char> const& assert_cast<doris::vectorized::ColumnVector<unsigned char> const&, doris::vectorized::IColumn const&>(doris::vectorized::IColumn const&) in /usr/local/service/doris/lib/be/doris_be
 9# doris::vectorized::VExprContext::execute_conjuncts(std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext> > > const&, std::vector<doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul>*, std::allocator<doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul>*> > const*, bool, doris::vectorized::Block*, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul>*, bool*) at /root/tc/be/src/vec/exprs/vexpr_context.cpp:181
10# doris::vectorized::VExprContext::execute_conjuncts_and_filter_block(std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext> > > const&, doris::vectorized::Block*, std::vector<unsigned int, std::allocator<unsigned int> >&, int, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul>&) at /root/tc/be/src/vec/exprs/vexpr_context.cpp:324
11# doris::segment_v2::SegmentIterator::_execute_common_expr(unsigned short*, unsigned short&, doris::vectorized::Block*) at /root/tc/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2274
12# doris::segment_v2::SegmentIterator::_next_batch_internal(doris::vectorized::Block*) at /root/tc/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2178
13# doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*)::$_0::operator()() const at /root/tc/be/src/olap/rowset/segment_v2/segment_iterator.cpp:1914
14# doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*) at /root/tc/be/src/olap/rowset/segment_v2/segment_iterator.cpp:1913
15# doris::segment_v2::LazyInitSegmentIterator::next_batch(doris::vectorized::Block*) in /usr/local/service/doris/lib/be/doris_be
16# doris::BetaRowsetReader::next_block(doris::vectorized::Block*) at /root/tc/be/src/olap/rowset/beta_rowset_reader.cpp:348
17# doris::vectorized::VCollectIterator::Level0Iterator::_refresh() in /usr/local/service/doris/lib/be/doris_be
18# doris::vectorized::VCollectIterator::Level0Iterator::refresh_current_row() at /root/tc/be/src/vec/olap/vcollect_iterator.cpp:511
19# doris::vectorized::VCollectIterator::Level0Iterator::ensure_first_row_ref() at /root/tc/be/src/vec/olap/vcollect_iterator.cpp:482
20# doris::vectorized::VCollectIterator::Level1Iterator::ensure_first_row_ref() at /root/tc/be/src/vec/olap/vcollect_iterator.cpp:697
21# doris::vectorized::VCollectIterator::build_heap(std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > >&) at /root/tc/be/src/vec/olap/vcollect_iterator.cpp:186
22# doris::vectorized::BlockReader::_init_collect_iter(doris::TabletReader::ReaderParams const&) at /root/tc/be/src/vec/olap/block_reader.cpp:139
23# doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) at /root/tc/be/src/vec/olap/block_reader.cpp:211
24# doris::vectorized::NewOlapScanner::open(doris::RuntimeState*) at /root/tc/be/src/vec/exec/scan/new_olap_scanner.cpp:227
25# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /root/tc/be/src/vec/exec/scan/scanner_scheduler.cpp:259
26# doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_1::operator()() const::{lambda()#1}::operator()() const::{lambda()#2}::operator()() const at /root/tc/be/src/vec/exec/scan/scanner_scheduler.cpp:180
...

---------

Co-authored-by: liutang123 <[email protected]>
@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Jan 8, 2025
@dataroaring dataroaring reopened this Jan 8, 2025
@hello-stephen

run buildall

@doris-robot

TPC-H: Total hot run time: 40759 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 003ab903fd795b1acfb8166d9869c8e51d693850, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17597	7837	7180	7180
q2	2066	181	175	175
q3	10543	1071	1140	1071
q4	10529	775	716	716
q5	7744	2820	2814	2814
q6	239	147	143	143
q7	961	610	605	605
q8	9360	1953	2005	1953
q9	6522	6398	6380	6380
q10	7014	2278	2320	2278
q11	476	272	259	259
q12	406	209	207	207
q13	17781	2985	2981	2981
q14	248	207	206	206
q15	563	533	518	518
q16	702	631	624	624
q17	972	599	591	591
q18	7259	6738	6647	6647
q19	1396	1065	1063	1063
q20	455	200	196	196
q21	3966	3160	3183	3160
q22	1103	992	997	992
Total cold run time: 107902 ms
Total hot run time: 40759 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	7288	7916	7272	7272
q2	326	238	233	233
q3	2928	2941	2951	2941
q4	2049	1825	1790	1790
q5	5698	5768	5695	5695
q6	223	147	142	142
q7	2263	1830	1834	1830
q8	3322	3539	3486	3486
q9	8845	8879	8809	8809
q10	3564	3552	3516	3516
q11	607	512	493	493
q12	799	628	612	612
q13	9040	3191	3138	3138
q14	294	288	266	266
q15	590	524	532	524
q16	728	674	670	670
q17	1840	1625	1603	1603
q18	8282	7657	7544	7544
q19	1653	1673	1442	1442
q20	2112	1871	1860	1860
q21	5528	5329	5275	5275
q22	1109	1036	1016	1016
Total cold run time: 69088 ms
Total hot run time: 60157 ms

@doris-robot

TPC-DS: Total hot run time: 198557 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 003ab903fd795b1acfb8166d9869c8e51d693850, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1350	926	923	923
query2	6248	2084	2052	2052
query3	10860	4216	4300	4216
query4	65776	29180	23549	23549
query5	4987	456	441	441
query6	403	171	174	171
query7	5542	313	312	312
query8	324	236	235	235
query9	8600	2698	2688	2688
query10	462	272	260	260
query11	17097	15157	15759	15157
query12	158	103	116	103
query13	1471	452	438	438
query14	9946	7536	7475	7475
query15	216	192	182	182
query16	7254	506	501	501
query17	1105	611	616	611
query18	1862	336	356	336
query19	222	156	167	156
query20	118	113	110	110
query21	215	104	106	104
query22	4805	4532	4510	4510
query23	34561	34388	34347	34347
query24	6214	2929	2904	2904
query25	528	442	448	442
query26	654	167	173	167
query27	1790	357	364	357
query28	4140	2505	2483	2483
query29	730	466	464	464
query30	239	167	166	166
query31	1026	845	830	830
query32	66	58	59	58
query33	420	286	290	286
query34	917	521	515	515
query35	829	741	766	741
query36	1087	973	978	973
query37	118	73	74	73
query38	4128	4025	4037	4025
query39	1573	1463	1484	1463
query40	211	108	104	104
query41	49	48	46	46
query42	117	102	103	102
query43	535	496	497	496
query44	1177	835	841	835
query45	187	169	168	168
query46	1154	719	761	719
query47	2052	1953	1946	1946
query48	472	387	397	387
query49	738	405	401	401
query50	819	432	424	424
query51	7295	7095	7193	7095
query52	103	88	89	88
query53	259	181	184	181
query54	562	453	454	453
query55	75	69	74	69
query56	264	236	236	236
query57	1251	1141	1130	1130
query58	217	203	204	203
query59	3173	3001	3031	3001
query60	280	249	245	245
query61	110	149	119	119
query62	765	654	663	654
query63	214	188	201	188
query64	1366	676	652	652
query65	3273	3202	3162	3162
query66	707	310	304	304
query67	16106	15650	15744	15650
query68	4077	592	565	565
query69	440	272	265	265
query70	1217	1150	1080	1080
query71	353	260	257	257
query72	6365	4018	4016	4016
query73	760	347	344	344
query74	10058	9025	8986	8986
query75	3390	2635	2700	2635
query76	1899	1081	1114	1081
query77	502	275	279	275
query78	10706	9706	9620	9620
query79	1296	594	588	588
query80	844	455	432	432
query81	516	243	234	234
query82	1310	115	115	115
query83	260	144	145	144
query84	290	78	81	78
query85	890	303	295	295
query86	343	299	308	299
query87	4430	4308	4404	4308
query88	3535	2409	2357	2357
query89	417	292	292	292
query90	2029	184	186	184
query91	177	148	149	148
query92	65	52	53	52
query93	1304	548	570	548
query94	801	296	285	285
query95	348	250	250	250
query96	616	280	281	280
query97	3332	3193	3204	3193
query98	209	199	195	195
query99	1581	1321	1274	1274
Total cold run time: 315932 ms
Total hot run time: 198557 ms

@doris-robot

ClickBench: Total hot run time: 32.17 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 003ab903fd795b1acfb8166d9869c8e51d693850, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.03
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.62	0.10	0.10
query5	0.54	0.52	0.53
query6	1.13	0.74	0.73
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.58	0.50	0.49
query10	0.55	0.58	0.56
query11	0.14	0.10	0.11
query12	0.14	0.12	0.11
query13	0.62	0.59	0.60
query14	3.05	2.89	3.02
query15	0.90	0.84	0.81
query16	0.39	0.38	0.36
query17	1.09	1.09	1.05
query18	0.23	0.22	0.21
query19	1.86	1.76	1.88
query20	0.01	0.00	0.01
query21	15.36	0.60	0.55
query22	2.87	2.84	1.47
query23	17.09	0.94	0.85
query24	3.27	0.69	1.20
query25	0.18	0.06	0.07
query26	0.61	0.14	0.14
query27	0.04	0.05	0.04
query28	10.82	1.10	1.07
query29	12.59	3.24	3.22
query30	0.25	0.06	0.05
query31	2.85	0.40	0.40
query32	3.22	0.45	0.45
query33	2.98	3.05	3.01
query34	17.15	4.47	4.50
query35	4.55	4.49	4.48
query36	0.69	0.48	0.48
query37	0.09	0.06	0.06
query38	0.05	0.03	0.03
query39	0.03	0.02	0.02
query40	0.17	0.14	0.13
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 108.24 s
Total hot run time: 32.17 s

@yiguolei yiguolei merged commit 9e109ee into branch-3.0 Jan 8, 2025
21 of 22 checks passed
@github-actions github-actions bot deleted the auto-pick-46541-branch-3.0 branch January 8, 2025 14:33