BE runs out of memory and exits automatically


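When a user runs a large SQL statement, memory blows up and the BE exits outright. The behavior we want is that when a large query leaves the BE short of memory, the query is cancelled or fails with an error, instead of the BE process exiting.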

Doris version: 2.0.4

The error log is as follows:
start time: Wed Jun 12 14:54:04 CST 2024
INFO: java_cmd /usr/java/jdk1.8.0_291-amd64/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/apps/doris-2.0.4-bin-x64/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/apps/doris-2.0.4-bin-x64/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/apps/doris-2.0.4-bin-x64/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/apps/doris-2.0.4-bin-x64/be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c ', or link it with '-z noexecstack'.
terminate called after throwing an instance of 'doris::Exception'
what(): [E11] Allocator sys memory check failed: Cannot alloc:131072, consuming tracker:<Load#Id=319afad1df4f438d-b32f415f3c5fe515>, peak used 18678376119, current used -19582129439, exec node:, process memory used 93.90 GB exceed limit 88.00 GB or sys available memory 24.36 GB less than low water mark 1.60 GB.

0#  doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> >) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:173
1#  Allocator<false, false, false>::sys_memory_check(unsigned long) const at /opt/src/be/src/vec/common/allocator.cpp:0
2#  Allocator<false, false, false>::alloc_impl(unsigned long, unsigned long) at /opt/src/be/src/vec/common/allocator.cpp:153
3#  doris::JavaNativeMethods::resizeStringColumn(JNIEnv_*, _jclass*, long, int) at /opt/src/be/src/vec/common/pod_array.h:128
4#  ?
5#  ?

*** Query id: 319afad1df4f438d-b32f415f3c5fe515 ***
*** tablet id: 0 ***
*** Aborted at 1718245575 (unix time) try "date -d @1718245575" if you are using GNU date ***
*** Current BE git commitID: 096a33a ***
*** SIGABRT unknown detail explain (@0x23788) received by PID 145288 (TID 73185 OR 0x7f0003046700) from PID 145288; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /opt/src/be/src/common/signal_handler.h:417
1# 0x00007F00A19EE400 in /lib64/libc.so.6
2# __GI_raise in /lib64/libc.so.6
3# abort in /lib64/libc.so.6
4# __gnu_cxx::__verbose_terminate_handler() [clone .cold] at ../../../../libstdc++-v3/libsupc++/vterminate.cc:75
5# __cxxabiv1::_terminate(void (*)()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
6# 0x0000558874029101 in /data/apps/doris-2.0.4-bin-x64/be/lib/doris_be
7# 0x0000558874029254 in /data/apps/doris-2.0.4-bin-x64/be/lib/doris_be
8# Allocator<false, false, false>::sys_memory_check(unsigned long) const at /opt/src/be/src/vec/common/allocator.cpp:67
9# Allocator<false, false, false>::alloc_impl(unsigned long, unsigned long) at /opt/src/be/src/vec/common/allocator.h:100
10# doris::JavaNativeMethods::resizeStringColumn(JNIEnv_*, _jclass*, long, int) at /opt/src/be/src/util/jni_native_method.cpp:30
11# 0x00007F008CEBE5FE

1 Answer

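For users on 2.0: when you run into a problem, we recommend first upgrading to the latest three-digit (patch) release. The issue may well have been fixed long ago.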

For controlling SQL resource usage, you can take a look at the workload group feature in 2.1.
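
As a side note on reading the `what()` line in the log: the message names two conditions joined by "or", and it can help to check which one actually holds before tuning anything. The following is a minimal sketch in Python, assuming the exact wording shown in this log; the regular expression and the function name `failed_checks` are hypothetical helpers written for this post, not part of Doris.

```python
import re

# Tail of the what() line copied from the log in the question.
MESSAGE = (
    "process memory used 93.90 GB exceed limit 88.00 GB or "
    "sys available memory 24.36 GB less than low water mark 1.60 GB."
)

# Hypothetical pattern written for this exact wording and for GB units only;
# other Doris versions may phrase the message differently.
PATTERN = re.compile(
    r"process memory used (?P<used>[\d.]+) GB exceed limit (?P<limit>[\d.]+) GB or "
    r"sys available memory (?P<avail>[\d.]+) GB less than low water mark (?P<mark>[\d.]+) GB"
)


def failed_checks(message):
    """Return the conditions named in the message that actually hold."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message does not match the expected wording")
    values = {name: float(text) for name, text in match.groupdict().items()}

    failed = []
    # Condition 1: the BE process uses more memory than its configured limit.
    if values["used"] > values["limit"]:
        failed.append("process memory limit exceeded")
    # Condition 2: the machine as a whole is nearly out of free memory.
    if values["avail"] < values["mark"]:
        failed.append("system available memory below low water mark")
    return failed


print(failed_checks(MESSAGE))
```

For the numbers in this log the script prints `['process memory limit exceeded']`: the process was at 93.90 GB against an 88.00 GB limit, while the system still had 24.36 GB available, well above the 1.60 GB low water mark.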