BE node crashes intermittently


Doris version: self-compiled from the official 1.2.8 release. The official build still hits the same error (see the image).

doris-1.2.8-rc01(AVX2) RELEASE (build file://app1@Unknown)
Built on Wed, 14 Aug 2024 09:19:26 CST by app1

Java version

openjdk version "1.8.0_332"
OpenJDK Runtime Environment (Temurin)(build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (Temurin)(build 25.332-b09, mixed mode)

Error output in be.out

start time: Tue Aug 20 10:12:12 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: e4afd0f56fde4c8f-a4bf09a82949d065 ***
*** Aborted at 1724123375 (unix time) try "date -d @1724123375" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 24584 (TID 0x7fe4a625f700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /opt/module/doris/be/src/common/signal_handler.h:420
 1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 4# 0x00007FE5B5BAC400 in /lib64/libc.so.6
 5# je_arena_dalloc_promoted at ../src/arena.c:1604
 6# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
 7# __free_stacks in /lib64/libpthread.so.0
 8# __deallocate_stack in /lib64/libpthread.so.0
 9# pthread_join in /lib64/libpthread.so.0
10# std::thread::join() in /opt/module/be/lib/doris_be
11# doris::KafkaDataConsumerGroup::start_all(doris::StreamLoadContext*) at /opt/module/doris/be/src/runtime/routine_load/data_consumer_group.cpp:141
12# doris::RoutineLoadTaskExecutor::exec_task(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>) at /opt/module/doris/be/src/runtime/routine_load/routine_load_task_executor.cpp:306
13# std::_Function_handler<void (), std::_Bind_result<void, void (doris::RoutineLoadTaskExecutor::*(doris::RoutineLoadTaskExecutor*, doris::StreamLoadContext*, doris::DataConsumerPool*, doris::RoutineLoadTaskExecutor::submit_task(doris::TRoutineLoadTask const&)::{lambda(doris::StreamLoadContext*)#1}))(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>)> >::_M_invoke(std::_Any_data const&) at /opt/module/ldb_toolchain/include/c++/11/bits/std_function.h:291
14# doris::PriorityThreadPool::work_thread(int) at /opt/module/doris/be/src/util/priority_thread_pool.hpp:146
15# execute_native_thread_routine in /opt/module/be/lib/doris_be
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6

start time: Tue Aug 20 11:10:27 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: 7c4167abfa7543f0-a2d55920cca1cf25 ***
*** Aborted at 1724126974 (unix time) try "date -d @1724126974" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 20269 (TID 0x7f6e09cc8700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /opt/module/doris/be/src/common/signal_handler.h:420
 1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 4# 0x00007F6F200B2400 in /lib64/libc.so.6
 5# je_arena_dalloc_promoted at ../src/arena.c:1604
 6# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
 7# __free_stacks in /lib64/libpthread.so.0
 8# __deallocate_stack in /lib64/libpthread.so.0
 9# pthread_join in /lib64/libpthread.so.0
10# std::thread::join() in /opt/module/be/lib/doris_be
11# doris::KafkaDataConsumerGroup::start_all(doris::StreamLoadContext*) at /opt/module/doris/be/src/runtime/routine_load/data_consumer_group.cpp:141
12# doris::RoutineLoadTaskExecutor::exec_task(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>) at /opt/module/doris/be/src/runtime/routine_load/routine_load_task_executor.cpp:306
13# std::_Function_handler<void (), std::_Bind_result<void, void (doris::RoutineLoadTaskExecutor::*(doris::RoutineLoadTaskExecutor*, doris::StreamLoadContext*, doris::DataConsumerPool*, doris::RoutineLoadTaskExecutor::submit_task(doris::TRoutineLoadTask const&)::{lambda(doris::StreamLoadContext*)#1}))(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>)> >::_M_invoke(std::_Any_data const&) at /opt/module/ldb_toolchain/include/c++/11/bits/std_function.h:291
14# doris::PriorityThreadPool::work_thread(int) at /opt/module/doris/be/src/util/priority_thread_pool.hpp:146
15# execute_native_thread_routine in /opt/module/be/lib/doris_be
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6

start time: Tue Aug 20 12:10:25 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: 0-0 ***
*** Aborted at 1724130832 (unix time) try "date -d @1724130832" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 80805 (TID 0x7fe6d8ffe700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /opt/module/doris/be/src/common/signal_handler.h:420
 1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 4# 0x00007FE9A3A36400 in /lib64/libc.so.6
 5# je_arena_dalloc_promoted at ../src/arena.c:1604
 6# arena_dalloc_no_tcache at ../include/jemalloc/internal/arena_inlines_b.h:265
 7# je_free_default at ../src/jemalloc.c:2799
 8# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
 9# __free_stacks in /lib64/libpthread.so.0
10# __deallocate_stack in /lib64/libpthread.so.0
11# start_thread in /lib64/libpthread.so.0
12# clone in /lib64/libc.so.6

start time: Tue Aug 20 13:13:56 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: d637eb0835b54819-b9f9851440dcdca0 ***
*** Aborted at 1724137733 (unix time) try "date -d @1724137733" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 44745 (TID 0x7f2a687ff700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /opt/module/doris/be/src/common/signal_handler.h:420
 1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
 4# 0x00007F396CBFE400 in /lib64/libc.so.6
 5# jemalloc_usable_size at ../src/jemalloc.c:3740
 6# free at /opt/module/doris/be/src/runtime/memory/jemalloc_hook.cpp:43
 7# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
 8# __free_stacks in /lib64/libpthread.so.0
 9# __deallocate_stack in /lib64/libpthread.so.0
10# start_thread in /lib64/libpthread.so.0
11# clone in /lib64/libc.so.6

start time: Tue Aug 20 15:08:58 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
1 Answer

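Issues on 1.2 are hard to track down at this point. I'd suggest testing with the latest official 2.0 or 2.1 release. Several of these stack traces involve routine load, and that area has had fixes since then. You could upgrade and see whether the crash still occurs.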
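To check how many of the crashes actually go through routine load, a small script can group the stack traces in be.out. The following is only a rough sketch: the function name `summarize_crashes`, the regular expressions, and the sample text are illustrative and not part of Doris, and "CST" in the log is assumed to mean UTC+8.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in be.out is China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))

# Matches the "*** Aborted at <unix time> ..." line that opens each crash report.
ABORT_RE = re.compile(r"\*\*\* Aborted at (\d+) \(unix time\)")
# Matches stack frame lines such as " 5# je_arena_dalloc_promoted at ...".
FRAME_RE = re.compile(r"^\s*\d+# (.*)$")


def summarize_crashes(text):
    """Return one dict per crash report found in be.out-style text."""
    crashes = []
    current = None
    for line in text.splitlines():
        abort = ABORT_RE.search(line)
        if abort:
            current = {
                "time": datetime.fromtimestamp(int(abort.group(1)), CST),
                "frames": [],
            }
            crashes.append(current)
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
    for crash in crashes:
        # Flag crashes whose stack contains a routine load frame.
        crash["routine_load"] = any(
            "RoutineLoad" in f or "routine_load" in f for f in crash["frames"]
        )
    return crashes


# Shortened excerpt of the first crash above, used only as sample input.
SAMPLE = """\
*** Aborted at 1724123375 (unix time) try "date -d @1724123375" if you are using GNU date ***
 5# je_arena_dalloc_promoted at ../src/arena.c:1604
 9# pthread_join in /lib64/libpthread.so.0
11# doris::KafkaDataConsumerGroup::start_all(doris::StreamLoadContext*) at /opt/module/doris/be/src/runtime/routine_load/data_consumer_group.cpp:141
"""

for crash in summarize_crashes(SAMPLE):
    print(crash["time"].isoformat(), "routine_load =", crash["routine_load"])
```

To run it against a real log, pass the contents of be.out to `summarize_crashes` instead of `SAMPLE`. In the traces posted above, the first two crashes go through `KafkaDataConsumerGroup::start_all`, while the last two fail during thread exit with no routine load frame left on the stack, so the flag alone does not settle the root cause.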