BE exits after upgrading from 1.1.5 to 1.2.6

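Cluster: 3 BEs and 3 FEs, running stably for more than a year.
Problem: after upgrading to 1.2.6, a BE has exited abnormally twice.
The be.out output is as follows: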

*** Query id: 405e89dcf9b40c4-aa38736608494dfe ***
*** Aborted at 1711590265 (unix time) try "date -d @1711590265" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV invalid permissions for mapped object (@0x7fb41db68000) received by PID 288359 (TID 0x7f9b27083700) from PID 498499584; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
 1# os::Linux::chained_handler(int, siginfo*, void*) in /usr/local/jdk_current/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/local/jdk_current/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /usr/local/jdk_current/jre/lib/amd64/server/libjvm.so
 4# 0x00007FB6FBFD5400 in /lib64/libc.so.6
 5# _ZNSt8__detail9__variant17__gen_vtable_implINS0_12_Multi_arrayIPFNS0_21__deduce_visit_resultIN5doris6StatusEEEONS4_10vectorized8OverloadIJZNS7_12HashJoinNode20_process_build_blockEPNS4_12RuntimeStateERNS7_5BlockEhEUlRSt9monostateT_T0_E_ZNS9_20_process_build_blockESB_SD_hEUlOSG_SH_T1_E0_EEERSt7variantIJSE_NS7_26SerializedHashTableContextINS7_10RowRefListEEENS7_27PrimaryTypeHashTableContextIhSQ_EENSS_ItSQ_EENSS_IjSQ_EENSS_ImSQ_EENSS_INS7_7UInt128ESQ_EENSS_INS7_7UInt256ESQ_EENS7_24FixedKeyHashTableContextImLb1ESQ_EENS11_ImLb0ESQ_EENS11_ISX_Lb1ESQ_EENS11_ISX_Lb0ESQ_EENS11_ISZ_Lb1ESQ_EENS11_ISZ_Lb0ESQ_EENSP_INS7_18RowRefListWithFlagEEENSS_IhS18_EENSS_ItS18_EENSS_IjS18_EENSS_ImS18_EENSS_ISX_S18_EENSS_ISZ_S18_EENS11_ImLb1ES18_EENS11_ImLb0ES18_EENS11_ISX_Lb1ES18_EENS11_ISX_Lb0ES18_EENS11_ISZ_Lb1ES18_EENS11_ISZ_Lb0ES18_EENSP_INS7_19RowRefListWithFlagsEEENSS_IhS1M_EENSS_ItS1M_EENSS_IjS1M_EENSS_ImS1M_EENSS_ISX_S1M_EENSS_ISZ_S1M_EENS11_ImLb1ES1M_EENS11_ImLb0ES1M_EENS11_ISX_Lb1ES1M_EENS11_ISX_Lb0ES1M_EENS11_ISZ_Lb1ES1M_EENS11_ISZ_Lb0ES1M_EEEEOSO_IJSt17integral_constantIbLb0EES22_IbLb1EEEES26_EJEEESt16integer_sequenceImJLm9ELm0ELm0EEEE14__visit_invokeESN_S21_S26_S26_ at /var/local/ldb-toolchain/include/c++/11/variant:1015
 6# doris::vectorized::HashJoinNode::_process_build_block(doris::RuntimeState*, doris::vectorized::Block&, unsigned char) at /root/doris/be/src/vec/exec/join/vhash_join_node.cpp:933
 7# doris::vectorized::HashJoinNode::_materialize_build_side(doris::RuntimeState*) at /root/doris/be/src/vec/exec/join/vhash_join_node.cpp:710
 8# doris::vectorized::VJoinNodeBase::open(doris::RuntimeState*) at /root/doris/be/src/vec/exec/join/vjoin_node_base.cpp:203
 9# doris::vectorized::HashJoinNode::open(doris::RuntimeState*) at /root/doris/be/src/vec/exec/join/vhash_join_node.cpp:650
10# doris::vectorized::AggregationNode::open(doris::RuntimeState*) at /root/doris/be/src/vec/exec/vaggregation_node.cpp:458
11# doris::PlanFragmentExecutor::open_vectorized_internal() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:289
12# doris::PlanFragmentExecutor::open() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:261
13# doris::FragmentExecState::execute() at /root/doris/be/src/runtime/fragment_mgr.cpp:261
14# doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>) at /root/doris/be/src/runtime/fragment_mgr.cpp:508
15# std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::PlanFragmentExecutor*)>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:291
16# doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:543
17# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:455
18# start_thread in /lib64/libpthread.so.0
19# __clone in /lib64/libc.so.6

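Using the TID and the audit log, I located the last SQL statement that was running when the crash happened. Extracting that SQL and running it on its own returns results normally.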
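To match the crash against audit-log entries, the "Aborted at" value in be.out can be converted to wall-clock time. The sketch below assumes the logs are written in UTC+8, which is consistent with the timestamps in this thread but should be adjusted for other deployments; `crash_time` and `LOCAL_TZ` are illustrative names, not part of Doris.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the cluster's logs use UTC+8; change the offset for your deployment.
LOCAL_TZ = timezone(timedelta(hours=8))


def crash_time(unix_seconds: int, tz: timezone = LOCAL_TZ) -> str:
    """Format the 'Aborted at' unix time from be.out as local wall-clock time."""
    return datetime.fromtimestamp(unix_seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


# Value taken from the be.out header above.
print(crash_time(1711590265))  # 2024-03-28 09:44:25
```

Audit-log entries are written when a query finishes, so the entry for the crashing query is stamped a few seconds after this time (roughly the crash time plus the `Time` field in milliseconds).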

2 Answers

Find the statement for this query ID in fe.audit.log and try running it: 405e89dcf9b40c4-aa38736608494dfe
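A minimal sketch of that lookup, assuming each audit entry sits on one line, fields are `|`-separated `key=value` pairs, and `Stmt` is followed by `CpuTimeMS` as in the entry quoted in this thread. `find_statement` and the sample line are hypothetical; statements that contain newlines or the text `|CpuTimeMS=` would need more careful parsing.

```python
from typing import Iterable, Optional


def find_statement(log_lines: Iterable[str], query_id: str) -> Optional[str]:
    """Return the Stmt text of the first audit entry whose QueryId matches."""
    needle = f"|QueryId={query_id}|"
    for line in log_lines:
        if needle not in line:
            continue
        # Everything after "|Stmt=" up to the next known field is the SQL text.
        _, found, rest = line.partition("|Stmt=")
        if not found:
            return None
        stmt, _, _ = rest.partition("|CpuTimeMS=")
        return stmt.strip()
    return None


# Hypothetical, shortened entry in the same field layout as fe.audit.log.
sample = "2024-03-28 09:44:33,637 [slow_query] |State=ERR|QueryId=abc-123|IsQuery=true|Stmt=SELECT 1|CpuTimeMS=0|"
print(find_statement([sample], "abc-123"))  # SELECT 1
```

Against a real log, the same function can be fed an open file handle, for example `find_statement(open("fe.audit.log", encoding="utf-8"), "405e89dcf9b40c4-aa38736608494dfe")`.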

I ran it, and it returned a normal result. This is the audit-log entry for that query:

2024-03-28 09:44:33,637 [slow_query] |Client=10.100.0.34:44316|User=youshu_readonly|Db=crm_report|State=ERR|ErrorCode=1105|ErrorMessage=errCode = 2, detailMessage = Cancelled|Time=7867|ScanBytes=0|ScanRows=0|ReturnRows=0|StmtId=1038193|QueryId=405e89dcf9b40c4-aa38736608494dfe|IsQuery=true|feIp=10.100.0.56|Stmt=/* By YouData (apiName:tableQuery)(userId:262)(uniqueId:lidaibin@haixue.com)(resourceId:c-1-2299-2489-l3dxymkz)(timestamp:1711590265740)(isEdit:false)(trigger:User)(mvName:undefined)(dataModelId:1291)(relatedResourceId:2299)(transId:fuv7irJHH3NHHvfsRy8Hd7)(queryId:0cf4c0b6166d7a16:fb0461cc00000000) */SELECT TIMESTAMP(DATE_FORMAT(t1.change_time, '%Y-%m-%d %H:%i:%s')) AS d0, t1.crm_chance_id AS d1, t1.user_name AS d2, t4.user_name AS d3, customTable_3_156.group_name AS d4, customTable_3_156.dist AS d5, customTable_3_156.center AS d6, t1.chance_source AS d7, TIMESTAMP(DATE_FORMAT(t2.quote_time, '%Y-%m-%d %H:%i:%s')) AS d8, SUM(t2.quote_amount) AS m0 FROM ( SELECT t1.user_id AS user_id, t1.group_id AS group_id, t1.chance_source AS chance_source, t1.user_name AS user_name, t1.crm_chance_id AS crm_chance_id, t1.change_time AS change_time FROM crm_report.ods_open_class AS t1 WHERE ((t1.saas_blocked_flag IN (0)) AND ((t1.change_time >= TIMESTAMP('2024-03-28 00:00:00')) AND (t1.change_time < TIMESTAMP('2024-03-29 00:00:00')))) ) AS t1 LEFT JOIN ( SELECT t2.saas_blocked_flag AS saas_blocked_flag, t2.crm_chance_id AS crm_chance_id, t2.quote_amount AS quote_amount, t2.quote_time AS quote_time FROM crm_report.ods_quote_amount_log AS t2 WHERE (t2.saas_blocked_flag IN (0)) ) AS t2 ON (t1.crm_chance_id <=> t2.crm_chance_id) INNER JOIN ( select group_id,group_name,model,leader,center,dist,gas from ods_org where date = regexp_replace(to_date(date_sub(now(),1)),'-','') and saas_blocked_flag=0 ) AS customTable_3_156 ON (t1.group_id <=> customTable_3_156.group_id) LEFT JOIN ( SELECT t4.saas_blocked_flag AS saas_blocked_flag, t4.user_id AS user_id, t4.user_name AS user_name FROM crm_report.dim_baxian_user AS t4 WHERE (t4.saas_blocked_flag IN (0)) ) AS t4 ON (t1.user_id <=> t4.user_id) WHERE ((t2.saas_blocked_flag IN (0)) AND (t4.saas_blocked_flag IN (0))) GROUP BY TIMESTAMP(DATE_FORMAT(t1.change_time, '%Y-%m-%d %H:%i:%s')), t1.crm_chance_id, t1.user_name, t4.user_name, customTable_3_156.group_name, customTable_3_156.dist, customTable_3_156.center, t1.chance_source, TIMESTAMP(DATE_FORMAT(t2.quote_time, '%Y-%m-%d %H:%i:%s')) ORDER BY (TIMESTAMP(DATE_FORMAT(t1.change_time, '%Y-%m-%d %H:%i:%s')) IS NULL) ASC, TIMESTAMP(DATE_FORMAT(t1.change_time, '%Y-%m-%d %H:%i:%s')) DESC LIMIT 50|CpuTimeMS=0|SqlHash=0d79c1d0ed37cad20acdc81a37e2a01c|peakMemoryBytes=0|SqlDigest=|TraceId=|FuzzyVariables=