BE node crashes on startup after upgrading from doris-2.0.2 to doris-2.1.6


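I am upgrading from 2.0.2 to 2.1.6 and picked a single node for a canary upgrade. The BE crashes after startup, and the system log prints "abrt-hook-cpp: Process killed by SIGSEGV - dumping core". I start it with ulimit -c unlimited -n 65536 && sh start_be.sh --daemon; it crashes after about two minutes, and the core file it writes is 27 GB.
The query_id printed in be.out belongs to a perfectly ordinary query on an internal table. The BE crashes on every start, and the query_id maps to a different query statement each time.
be.out is as follows: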

StdoutLogger 2024-12-25 18:17:10,237 Start time: Wed Dec 25 18:17:10 CST 2024
INFO: java_cmd /usr/java/latest/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/tiankx/doris-2.1.6/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/tiankx/doris-2.1.6/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/tiankx/doris-2.1.6/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: bd77ff401989437a-8550d5bdd0d14507 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1735121953 (unix time) try "date -d @1735121953" if you are using GNU date ***
*** Current BE git commitID: 653e315ba5 ***
*** SIGSEGV address not mapped to object (@0x8) received by PID 23778 (TID 24195 OR 0x7f586cdf3700) from PID 8; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_release/doris/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo*, void*) in /usr/java/latest/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/java/latest/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /usr/java/latest/jre/lib/amd64/server/libjvm.so
 4# 0x00007F61839B4400 in /lib64/libc.so.6
 5# __memset_sse2 in /lib64/libc.so.6
 6# doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 15ul, 16ul>::resize_fill(unsigned long, unsigned char const&) at /home/zcp/repo_center/doris_release/doris/be/src/vec/common/pod_array.h:377
 7# doris::vectorized::ColumnFilterHelper::resize_fill(unsigned long, unsigned char) at /home/zcp/repo_center/doris_release/doris/be/src/vec/columns/column_filter_helper.cpp:28
 8# doris::vectorized::VNestedLoopJoinNode::_append_left_data_with_null(doris::vectorized::Block&) const in /home/tiankx/doris-2.1.6/be/lib/doris_be
 9# doris::Status doris::vectorized::VNestedLoopJoinNode::_generate_join_block_data<std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)5>, false, false>(doris::RuntimeState*, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)5>&) in /home/tiankx/doris-2.1.6/be/lib/doris_be
10# std::__detail::__variant::__gen_vtable_impl<std::__detail::__variant::_Multi_array<std::__detail::__variant::__deduce_visit_result<doris::Status> (*)(doris::vectorized::VNestedLoopJoinNode::push(doris::RuntimeState*, doris::vectorized::Block*, bool)::$_0&, std::variant<std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)0>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)2>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)8>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)1>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)4>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)3>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)5>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)7>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)9>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)10>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)11> >&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true> >&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true> >&&)>, std::integer_sequence<unsigned long, 6ul, 0ul, 0ul> >::__visit_invoke(doris::vectorized::VNestedLoopJoinNode::push(doris::RuntimeState*, doris::vectorized::Block*, bool)::$_0&, std::variant<std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)0>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)2>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)8>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)1>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)4>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)3>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)5>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)7>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)9>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)10>, std::integral_constant<doris::TJoinOp::type, (doris::TJoinOp::type)11> >&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true> >&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true> >&&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1013
11# doris::vectorized::VNestedLoopJoinNode::push(doris::RuntimeState*, doris::vectorized::Block*, bool) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/join/vnested_loop_join_node.cpp:245
12# doris::pipeline::StatefulOperator<doris::vectorized::VNestedLoopJoinNode>::get_block(doris::RuntimeState*, doris::vectorized::Block*, doris::pipeline::SourceState&) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/operator.h:431
13# doris::pipeline::StatefulOperator<doris::vectorized::HashJoinNode>::get_block(doris::RuntimeState*, doris::vectorized::Block*, doris::pipeline::SourceState&) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/operator.h:425
14# doris::pipeline::PipelineTask::execute(bool*) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_task.cpp:300
15# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/task_scheduler.cpp:347
16# doris::ThreadPool::dispatch_thread() in /home/tiankx/doris-2.1.6/be/lib/doris_be
17# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_release/doris/be/src/util/thread.cpp:499
18# start_thread in /lib64/libpthread.so.0
19# clone in /lib64/libc.so.6
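
For reference, the "Aborted at 1735121953" line can be checked against the "Start time" line at the top of be.out. The snippet below is only a rough sketch: it assumes the CST in the log means China Standard Time (UTC+8), and the variable names are made up for illustration. It does the same conversion as the date -d @1735121953 hint in the log and then subtracts the start time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in be.out is China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))

# "Start time: Wed Dec 25 18:17:10 CST 2024" from the first line of be.out.
start = datetime(2024, 12, 25, 18, 17, 10, tzinfo=CST)

# "Aborted at 1735121953 (unix time)" from the crash banner.
abort = datetime.fromtimestamp(1735121953, tz=CST)

# Time the BE stayed up before the SIGSEGV.
uptime = abort - start

print(abort.isoformat())       # 2024-12-25T18:19:13+08:00
print(uptime.total_seconds())  # 123.0
```

Under that assumption the BE ran for about 123 seconds before aborting, which matches the roughly two minutes described above.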
1 Answer

I have sent you a private message. As a first step, try upgrading directly from 2.0.2 to 2.1.7.