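Background: yesterday I upgraded the cluster from 2.0 to 2.1 and found that certain statements issued through data lake queries always cause the BE to restart.
[Previous issue] Doris 2.1.0: executing a query causes the BE to restart
Today I found that simply running data lake queries in large batches can also make the BE hit a SIGSEGV and shut the node down:
Statement run in large batches: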
SELECT
fop.plan_actual_qty as plan_actual_qty,
fop.often_box_qty as box_spec
FROM
HYY_DW_MYSQL.hyy.fs_whole_cabinet_track fwct left join
HYY_DW_MYSQL.hyy.fs_oversea_plan fop on fwct.cabinet_times = fop.actual_which_number
where
fwct.status != ? and fop.fnsku = ? and fop.actual_which_number = ?
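The batch run described above could be reproduced with a small driver along these lines. This is only a sketch: `run_batch`, the `execute` callable, the worker count, and the parameter values are all hypothetical, and the post does not say how the statements were actually submitted.

```python
from concurrent.futures import ThreadPoolExecutor

# The statement from the report, with its original "?" placeholders kept as-is.
QUERY = """
SELECT
  fop.plan_actual_qty AS plan_actual_qty,
  fop.often_box_qty AS box_spec
FROM HYY_DW_MYSQL.hyy.fs_whole_cabinet_track fwct
LEFT JOIN HYY_DW_MYSQL.hyy.fs_oversea_plan fop
  ON fwct.cabinet_times = fop.actual_which_number
WHERE fwct.status != ? AND fop.fnsku = ? AND fop.actual_which_number = ?
"""


def run_batch(execute, sql, param_sets, workers=16):
    """Run `sql` once per parameter set on a thread pool.

    `execute(sql, params)` is a caller-supplied function (hypothetical here)
    that sends one statement to the cluster and raises on failure.
    Returns a tuple (succeeded, failed).
    """
    def attempt(params):
        try:
            execute(sql, params)
            return True
        except Exception:
            # A dropped connection (for example when a BE dies) counts as a failure.
            return False

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(attempt, param_sets))
    succeeded = sum(results)
    return succeeded, len(results) - succeeded
```

Against a real cluster, `execute` would wrap whatever MySQL-protocol client is used to talk to the FE; the placeholder syntax may need adjusting to that client. A sudden rise in the failed count would be the signal to check BE.out on each node.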
BE.out from the node that was killed:
Call to AttachCurrentThread failed with error: -1
getJNIEnv: getGlobalJNIEnv failed
*** Query id: f556072bee2045ce-83354dd7e9f9b941 ***
*** tablet id: 0 ***
*** Aborted at 1711446334 (unix time) try "date -d @1711446334" if you are using GNU date ***
*** Current BE git commitID: 91efb6a43d ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 2139870 (TID 2141393 OR 0x7fd7cced3640) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_release/doris/be/src/common/signal_handler.h:417
1# 0x00007FDB88E0042F in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
3# 0x00007FDB88DF90FC in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
4# 0x00007FDB8E316520 in /lib/x86_64-linux-gnu/libc.so.6
5# doris::vectorized::JdbcConnector::close(doris::Status) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/vjdbc_connector.cpp:88
6# doris::vectorized::NewJdbcScanner::close(doris::RuntimeState*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/new_jdbc_scanner.cpp:207
7# doris::vectorized::ScannerDelegate::~ScannerDelegate() at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vscan_node.h:98
8# doris::pipeline::ScanLocalState<doris::pipeline::JDBCScanLocalState>::~ScanLocalState() at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/scan_operator.h:138
9# doris::pipeline::JDBCScanLocalState::~JDBCScanLocalState() at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/jdbc_scan_operator.h:40
10# doris::RuntimeState::~RuntimeState() at /home/zcp/repo_center/doris_release/doris/be/src/runtime/runtime_state.cpp:257
11# doris::pipeline::PipelineXFragmentContext::~PipelineXFragmentContext() at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/pipeline_x_fragment_context.cpp:113
12# doris::pipeline::ExchangeSinkBuffer<doris::pipeline::ExchangeSinkLocalState>::_send_rpc(long)::{lambda(long const&, bool const&, doris::PTransmitDataResult const&, long const&)#2}::operator()(long const&, bool const&, doris::PTransmitDataResult const&, long const&) const at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/exchange_sink_buffer.cpp:392
13# doris::pipeline::ExchangeSendCallback<doris::PTransmitDataResult>::call() at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/exchange_sink_buffer.h:181
14# doris::AutoReleaseClosure<doris::PTransmitDataParams, doris::pipeline::ExchangeSendCallback<doris::PTransmitDataResult> >::Run() at /home/zcp/repo_center/doris_release/doris/be/src/util/ref_count_closure.h:91
15# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&) in /opt/apache-doris/be/lib/doris_be
16# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*) in /opt/apache-doris/be/lib/doris_be
17# brpc::ProcessInputMessage(void*) in /opt/apache-doris/be/lib/doris_be
18# brpc::InputMessenger::InputMessageClosure::~InputMessageClosure() in /opt/apache-doris/be/lib/doris_be
19# brpc::InputMessenger::OnNewMessages(brpc::Socket*) in /opt/apache-doris/be/lib/doris_be
20# brpc::Socket::ProcessEvent(void*) in /opt/apache-doris/be/lib/doris_be
21# bthread::TaskGroup::task_runner(long) in /opt/apache-doris/be/lib/doris_be
22# bthread_make_fcontext in /opt/apache-doris/be/lib/doris_be
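When comparing several such crashes, it may help to pull out the first Doris frame below the signal handler, which in the trace above is `JdbcConnector::close` at vjdbc_connector.cpp:88. The helper below is a hypothetical triage sketch, not part of Doris, and it assumes frames follow the `N# symbol at path:line` layout seen in this log.

```python
import re

# Matches frames such as "5# doris::Foo::bar(int) at /path/file.cpp:88".
# Frames printed as "N# 0x... in /lib/..." carry no source location and are skipped.
FRAME_RE = re.compile(r"^\s*(\d+)#\s+(.*?)\s+at\s+(\S+):(\d+)\s*$")


def first_doris_frame(trace):
    """Return (index, symbol, file name, line) of the first doris:: frame
    that is not the signal handler itself, or None if there is none."""
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue
        index, symbol, path, lineno = match.groups()
        if not symbol.startswith("doris::"):
            continue
        if symbol.startswith("doris::signal::"):
            # The handler only reports the crash; it is not the faulting code.
            continue
        return int(index), symbol, path.rsplit("/", 1)[-1], int(lineno)
    return None
```

Feeding it the contents of BE.out from each crash gives a quick way to check whether all restarts share the same faulting frame.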
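After switching to syncing the data into Doris, running the same queries locally in large batches causes no problems.
This issue never appeared in version 2.0.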