Doris BE restarts frequently


Doris version: 2.0.3

The be.out log from the affected Doris BE is as follows:

start time: Thu Oct 24 11:54:01 CST 2024
INFO: java_cmd /usr/java/jdk1.8.0_181/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/database/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/database/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/database/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/database/doris/be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Thu Oct 24 11:54:08 CST 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Thu Oct 24 11:54:08 CST 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Thu Oct 24 12:18:17 CST 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Thu Oct 24 12:37:53 CST 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
F1024 12:45:20.367848 18471 block.cpp:419] Column normal_sal_nos_amt in block is nullptr, in method bytes. All Columns are period_sdate organ_new_no normal_sal_nos_amt normal_sal_amt sal_amt_offline sal_amt_sy_a sal_amt_sy_b sal_amt_o2o shoes_sal_amt_sy clothes_sal_amt_sy pei_sal_amt_sy others_sal_amt_sy sal_amt_dy normal_sal_nos_amt 
*** Check failure stack trace: ***
F1024 12:45:20.367885 18482 block.cpp:419] Column normal_sal_nos_amt in block is nullptr, in method bytes. All Columns are period_sdate organ_new_no normal_sal_nos_amt normal_sal_amt sal_amt_offline sal_amt_sy_a sal_amt_sy_b sal_amt_o2o shoes_sal_amt_sy clothes_sal_amt_sy pei_sal_amt_sy others_sal_amt_sy sal_amt_dy normal_sal_nos_amt 
*** Check failure stack trace: ***
    @     0x55b2c7204219  google::LogMessageFatal::~LogMessageFatal()
    @     0x55b2c7204219  google::LogMessageFatal::~LogMessageFatal()
    @     0x55b2c1f318d9  doris::vectorized::Block::bytes()
    @     0x55b2c1f318d9  doris::vectorized::Block::bytes()
    @     0x55b2bf5c6cef  doris::segment_v2::SegmentIterator::_update_max_row()
    @     0x55b2bf5c6cef  doris::segment_v2::SegmentIterator::_update_max_row()
    @     0x55b2bf5c60b6  doris::segment_v2::SegmentIterator::_next_batch_internal()
    @     0x55b2bf5c60b6  doris::segment_v2::SegmentIterator::_next_batch_internal()
    @     0x55b2bf5c49c2  doris::segment_v2::SegmentIterator::next_batch()
    @     0x55b2bf5c49c2  doris::segment_v2::SegmentIterator::next_batch()
    @     0x55b2bf4e09b4  doris::BetaRowsetReader::next_block()
    @     0x55b2bf4e09b4  doris::BetaRowsetReader::next_block()
    @     0x55b2c6bc371d  doris::vectorized::VCollectIterator::Level0Iterator::refresh_current_row()
    @     0x55b2c6bc371d  doris::vectorized::VCollectIterator::Level0Iterator::refresh_current_row()
    @     0x55b2c6bc3995  doris::vectorized::VCollectIterator::Level0Iterator::ensure_first_row_ref()
    @     0x55b2c6bc3995  doris::vectorized::VCollectIterator::Level0Iterator::ensure_first_row_ref()
    @     0x55b2c6bc5d22  doris::vectorized::VCollectIterator::Level1Iterator::ensure_first_row_ref()
    @     0x55b2c6bc5d22  doris::vectorized::VCollectIterator::Level1Iterator::ensure_first_row_ref()
    @     0x55b2c6bc02d9  doris::vectorized::VCollectIterator::build_heap()
    @     0x55b2c6bc02d9  doris::vectorized::VCollectIterator::build_heap()
    @     0x55b2c6bb2638  doris::vectorized::BlockReader::_init_collect_iter()
    @     0x55b2c6bb2638  doris::vectorized::BlockReader::_init_collect_iter()
    @     0x55b2c6bb3619  doris::vectorized::BlockReader::init()
    @     0x55b2c6bb3619  doris::vectorized::BlockReader::init()
    @     0x55b2c36381e3  doris::vectorized::NewOlapScanner::open()
    @     0x55b2c36381e3  doris::vectorized::NewOlapScanner::open()
    @     0x55b2c364baf7  doris::vectorized::ScannerScheduler::_scanner_scan()
    @     0x55b2c364baf7  doris::vectorized::ScannerScheduler::_scanner_scan()
    @     0x55b2c364d2b1  _ZNSt17_Function_handlerIFvvEZZN5doris10vectorized16ScannerScheduler18_schedule_scannersEPNS2_14ScannerContextEENK3$_1clEvEUlvE1_E9_M_invokeERKSt9_Any_data
    @     0x55b2c364d2b1  _ZNSt17_Function_handlerIFvvEZZN5doris10vectorized16ScannerScheduler18_schedule_scannersEPNS2_14ScannerContextEENK3$_1clEvEUlvE1_E9_M_invokeERKSt9_Any_data
    @     0x55b2beec5332  doris::WorkThreadPool<>::work_thread()
    @     0x55b2beec5332  doris::WorkThreadPool<>::work_thread()
    @     0x55b2c9bde310  execute_native_thread_routine
    @     0x55b2c9bde310  execute_native_thread_routine
    @     0x7f96e2ae9ea5  start_thread
    @     0x7f96e2ae9ea5  start_thread
    @     0x7f96e351896d  __clone
    @     0x7f96e351896d  __clone
    @              (nil)  (unknown)
    @              (nil)  (unknown)
*** Query id: dd7a59626a1f4bdc-b6e753d2254f44e8 ***
*** tablet id: 0 ***
*** Aborted at 1729745120 (unix time) try "date -d @1729745120" if you are using GNU date ***
*** Current BE git commitID: 37d31a5 ***
*** SIGABRT unknown detail explain (@0x407a) received by PID 16506 (TID 18471 OR 0x7f93011be700) from PID 16506; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/src/doris-2.0/be/src/common/signal_handler.h:417
 1# 0x00007F96E3450400 in /lib64/libc.so.6
 2# __GI_raise in /lib64/libc.so.6
 3# abort in /lib64/libc.so.6
 4# 0x000055B2C720A1F9 in /data/database/doris/be/lib/doris_be
 5# google::LogMessage::SendToLog() in /data/database/doris/be/lib/doris_be
 6# google::LogMessage::Flush() in /data/database/doris/be/lib/doris_be
 7# google::LogMessageFatal::~LogMessageFatal() in /data/database/doris/be/lib/doris_be
 8# doris::vectorized::Block::bytes() const in /data/database/doris/be/lib/doris_be
 9# doris::segment_v2::SegmentIterator::_update_max_row(doris::vectorized::Block const*) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2263
10# doris::segment_v2::SegmentIterator::_next_batch_internal(doris::vectorized::Block*) in /data/database/doris/be/lib/doris_be
11# doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/segment_iterator.cpp:1876
12# doris::BetaRowsetReader::next_block(doris::vectorized::Block*) at /root/src/doris-2.0/be/src/olap/rowset/beta_rowset_reader.cpp:298
13# doris::vectorized::VCollectIterator::Level0Iterator::refresh_current_row() at /root/src/doris-2.0/be/src/vec/olap/vcollect_iterator.cpp:511
14# doris::vectorized::VCollectIterator::Level0Iterator::ensure_first_row_ref() at /root/src/doris-2.0/be/src/vec/olap/vcollect_iterator.cpp:490
15# doris::vectorized::VCollectIterator::Level1Iterator::ensure_first_row_ref() at /root/src/doris-2.0/be/src/vec/olap/vcollect_iterator.cpp:689
16# doris::vectorized::VCollectIterator::build_heap(std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > >&) at /root/src/doris-2.0/be/src/vec/olap/vcollect_iterator.cpp:184
17# doris::vectorized::BlockReader::_init_collect_iter(doris::TabletReader::ReaderParams const&) at /root/src/doris-2.0/be/src/vec/olap/block_reader.cpp:147
18# doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) at /root/src/doris-2.0/be/src/vec/olap/block_reader.cpp:226
19# doris::vectorized::NewOlapScanner::open(doris::RuntimeState*) at /root/src/doris-2.0/be/src/vec/exec/scan/new_olap_scanner.cpp:224
20# doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler*, doris::vectorized::ScannerContext*, std::shared_ptr<doris::vectorized::VScanner>) at /root/src/doris-2.0/be/src/vec/exec/scan/scanner_scheduler.cpp:345
21# std::_Function_handler<void (), doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::{lambda()#3}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
22# doris::WorkThreadPool<true>::work_thread(int) at /root/src/doris-2.0/be/src/util/work_thread_pool.hpp:160
23# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
24# start_thread in /lib64/libpthread.so.0
25# clone in /lib64/libc.so.6
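
For anyone triaging a similar be.out, the three pieces of the log that matter most (the glog fatal line, the Query id, and the abort timestamp) can be pulled out with a short script. The following is only a minimal sketch: the function name `summarize_be_out` and the default UTC+8 offset are my own assumptions (the log prints CST, and the decoded abort time lines up with the `F1024 12:45:20` fatal line), and the patterns are written against the lines shown above rather than any documented Doris log format.

```python
import re
from datetime import datetime, timedelta, timezone

# Patterns modelled on the lines in the be.out excerpt above (an assumption,
# not an official format description).
# Example fatal line: F1024 12:45:20.367848 18471 block.cpp:419] message
FATAL_RE = re.compile(r"^F(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([^\]]+)\] (.*)$")
QUERY_RE = re.compile(r"^\*\*\* Query id: (\S+) \*\*\*")
ABORT_RE = re.compile(r"^\*\*\* Aborted at (\d+) \(unix time\)")


def summarize_be_out(lines, utc_offset_hours=8):
    """Collect fatal check failures, query ids and abort times from be.out lines.

    utc_offset_hours is hypothetical: 8 matches the CST timestamps in this log.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    summary = {"fatal": [], "query_ids": [], "aborted_at": []}
    for raw in lines:
        line = raw.rstrip("\n")
        m = FATAL_RE.match(line)
        if m:
            summary["fatal"].append({
                "month_day": m.group(1),       # e.g. "1024"
                "time": m.group(2),            # e.g. "12:45:20.367848"
                "thread_id": int(m.group(3)),  # e.g. 18471
                "location": m.group(4),        # e.g. "block.cpp:419"
                "message": m.group(5).strip(),
            })
            continue
        m = QUERY_RE.match(line)
        if m:
            summary["query_ids"].append(m.group(1))
            continue
        m = ABORT_RE.match(line)
        if m:
            # Convert the unix time printed by the signal handler to local time.
            ts = datetime.fromtimestamp(int(m.group(1)), tz=tz)
            summary["aborted_at"].append(ts.isoformat())
    return summary
```

Passing the lines of be.out (for example `open("be.out", errors="replace")`) returns a dictionary whose `query_ids` entry can then be matched against the FE audit log to find the SQL that triggered the crash.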

The SQL statement corresponding to the Query id:

WITH t0 AS
    (SELECT sdate AS sdate_flag,
            sum(sal_amt) AS sal_amt
     FROM sal_kpi
     WHERE sdate BETWEEN '20231' AND '20233'
     GROUP BY sdate_flag),
t1 AS
    (SELECT concat(substring(sdate, 1, 4) + 1) AS sdate_flag,
            sum(sal_amt) AS sal_amt_yoy
     FROM sal_kpi
     WHERE sdate BETWEEN '20221' AND '20223'
     GROUP BY sdate_flag)
SELECT t0.sdate_flag,
       t0.sal_amt,
       t1.sal_amt_yoy
FROM t0
FULL JOIN t1 ON t0.sdate_flag = t1.sdate_flag
ORDER BY t0.sdate_flag
1 Answer

This issue has already been fixed. I recommend upgrading Doris to version 2.1.6.