[Resolved] Variant causes BE nodes to restart continuously. How should we tune it?

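We have 8 BEs, each machine with 32 cores and 100 GB of memory, and only one table. After we added a very wide, deeply nested JSON to it, all BEs keep restarting.
How should we tune this? What should we watch out for when using Variant?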

3 Answers
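  1. After investigation: the user had configured automatic restart, and 2.1.x was deployed on JDK 17, so the BE kept crashing and was then brought back up each time.
  2. It needs to be redeployed on JDK 8; then check the Flink TaskManager logs to draw further conclusions.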
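
As a quick check before redeploying, here is a minimal Python sketch that reads the major version from `java -version` output. The helper names `parse_java_major` and `installed_java_major` are hypothetical and not part of Doris. It also assumes the `java` found on the PATH is the same JVM the BE uses, which may not hold if the BE picks its JVM from `JAVA_HOME`.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the Java major version from `java -version` output.

    Handles both the legacy scheme ("1.8.0_292" -> 8) and the
    modern scheme ("17.0.2" -> 17).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in output")
    first = int(match.group(1))
    # Legacy versions are reported as 1.<major>, e.g. 1.8 for Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major() -> int:
    """Run `java -version` (assumes `java` is on the PATH) and parse it."""
    proc = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    # `java -version` writes its banner to stderr, not stdout.
    return parse_java_major(proc.stderr + proc.stdout)
```

Calling `installed_java_major()` on each BE host and comparing the result with 8 would show which machines still run on JDK 17.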

Check be.out or be.WARNING for any related stack traces.

W20240411 08:37:44.135708  2204 task_scheduler.cpp:353] Pipeline task failed. query_id: 164ae6dab4bb1a98-a0fbcce7b6630096|164ae6dab4bb1a98-a0fbcce7b6630095 reason: [INTERNAL_ERROR]failed to init reader for file , err: [INTERNAL_ERROR]unknown stream load id: 164ae6dab4bb1a98-a0fbcce7b6630095

        0#  doris::FileFactory::create_pipe_reader(doris::TUniqueId const&, std::shared_ptr<doris::io::FileReader>*, doris::RuntimeState*, bool) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
        1#  doris::vectorized::CsvReader::init_reader(bool) at /root/doris/be/src/common/status.h:452
        2#  doris::vectorized::VFileScanner::_get_next_reader() at /root/doris/be/src/common/status.h:348
        3#  doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/common/status.h:452
        4#  doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/vscanner.cpp:0
        5#  doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/vscanner.cpp:85
        6#  doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /root/doris/be/src/common/status.h:348
        7#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_3>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
        8#  doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:0
        9#  doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        10# ?
        11# __clone


        0#  doris::vectorized::VFileScanner::_get_next_reader() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
        1#  doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/common/status.h:452
        2#  doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/vscanner.cpp:0
        3#  doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/vscanner.cpp:85
        4#  doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /root/doris/be/src/common/status.h:348
        5#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_3>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
        6#  doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:0
        7#  doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        8#  ?
        9#  __clone
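
To pull header lines like the `W20240411 ...` entry above out of a large be.WARNING file, a minimal Python sketch follows. The regular expression is inferred only from the single log line quoted in this thread, so treat the format as an assumption. `parse_log_header` and `severe_entries` are hypothetical helper names, not Doris tooling.

```python
import re
from typing import Iterable, Iterator, Optional

# Assumed layout, based on the line quoted above:
# <level><yyyymmdd> <hh:mm:ss.micros> <thread id> <file>:<line>] <message>
HEADER = re.compile(
    r"^([IWEF])(\d{8}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^\s:]+):(\d+)\] (.*)$"
)


def parse_log_header(line: str) -> Optional[dict]:
    """Return the parsed fields of a log header line, or None if it is not one."""
    match = HEADER.match(line.rstrip("\n"))
    if match is None:
        return None  # e.g. a stack frame line such as "0#  doris::..."
    level, date, time, thread, source, lineno, message = match.groups()
    return {
        "level": level,
        "date": date,
        "time": time,
        "thread": int(thread),
        "source": source,
        "line": int(lineno),
        "message": message,
    }


def severe_entries(lines: Iterable[str], levels: str = "WEF") -> Iterator[dict]:
    """Yield parsed header lines whose level is one of `levels`."""
    for line in lines:
        entry = parse_log_header(line)
        if entry is not None and entry["level"] in levels:
            yield entry
```

Iterating `severe_entries(open(path))` over the log file, where `path` is wherever the BE writes be.WARNING on your hosts, and grouping the results by `source` gives a rough view of which code paths report failures most often.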