All 4 BE nodes in the cluster went down; initial assessment is that it was caused by Routine Load data import


Version: 2.0.13
The following is the dmesg -T log:

[一 8月 19 06:17:30 2024] brpc_light invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[一 8月 19 06:17:30 2024] brpc_light cpuset=/ mems_allowed=0
[一 8月 19 06:17:30 2024] CPU: 11 PID: 74600 Comm: brpc_light Kdump: loaded Not tainted 3.10.0-1160.el7.x86_64 #1
[一 8月 19 06:17:30 2024] Hardware name: Nutanix AHV, BIOS 1.11.0-2.el7 04/01/2014
[一 8月 19 06:17:30 2024] Call Trace:
[一 8月 19 06:17:30 2024]  [<ffffffffbd981340>] dump_stack+0x19/0x1b
[一 8月 19 06:17:30 2024]  [<ffffffffbd97bc60>] dump_header+0x90/0x229
[一 8月 19 06:17:30 2024]  [<ffffffffbd306362>] ? ktime_get_ts64+0x52/0xf0
[一 8月 19 06:17:30 2024]  [<ffffffffbd35db7f>] ? delayacct_end+0x8f/0xb0
[一 8月 19 06:17:30 2024]  [<ffffffffbd3c208d>] oom_kill_process+0x2cd/0x490
[一 8月 19 06:17:30 2024]  [<ffffffffbd3c1a7d>] ? oom_unkillable_task+0xcd/0x120
[一 8月 19 06:17:30 2024]  [<ffffffffbd3c277a>] out_of_memory+0x31a/0x500
[一 8月 19 06:17:30 2024]  [<ffffffffbd97c77d>] __alloc_pages_slowpath+0x5db/0x729
[一 8月 19 06:17:30 2024]  [<ffffffffbd3c8d76>] __alloc_pages_nodemask+0x436/0x450
[一 8月 19 06:17:30 2024]  [<ffffffffbd4189d8>] alloc_pages_current+0x98/0x110
[一 8月 19 06:17:30 2024]  [<ffffffffbd3bdb47>] __page_cache_alloc+0x97/0xb0
[一 8月 19 06:17:30 2024]  [<ffffffffbd3c0ae0>] filemap_fault+0x270/0x420
[一 8月 19 06:17:30 2024]  [<ffffffffc02c791e>] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
[一 8月 19 06:17:30 2024]  [<ffffffffc02c7b1c>] xfs_filemap_fault+0x2c/0x30 [xfs]
[一 8月 19 06:17:30 2024]  [<ffffffffbd3ede3a>] __do_fault.isra.61+0x8a/0x100
[一 8月 19 06:17:30 2024]  [<ffffffffbd3ee3ec>] do_read_fault.isra.63+0x4c/0x1b0
[一 8月 19 06:17:30 2024]  [<ffffffffbd3f5c30>] handle_mm_fault+0xa20/0xfb0
[一 8月 19 06:17:30 2024]  [<ffffffffbd312040>] ? futex_wake+0x90/0x180
[一 8月 19 06:17:30 2024]  [<ffffffffbd98e653>] __do_page_fault+0x213/0x500
[一 8月 19 06:17:30 2024]  [<ffffffffbd98ea26>] trace_do_page_fault+0x56/0x150
[一 8月 19 06:17:30 2024]  [<ffffffffbd98dfa2>] do_async_page_fault+0x22/0xf0
[一 8月 19 06:17:30 2024]  [<ffffffffbd98a7a8>] async_page_fault+0x28/0x30
[一 8月 19 06:17:30 2024] Mem-Info:
[一 8月 19 06:17:30 2024] active_anon:15669256 inactive_anon:51001 isolated_anon:0
 active_file:2354 inactive_file:3412 isolated_file:0
 unevictable:0 dirty:45 writeback:0 unstable:0
 slab_reclaimable:50116 slab_unreclaimable:34149
 mapped:2376 shmem:68201 pagetables:73795 bounce:0
 free:82204 free_pcp:1192 free_cma:0
[一 8月 19 06:17:30 2024] Node 0 DMA free:15876kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[一 8月 19 06:17:30 2024] lowmem_reserve[]: 0 2827 64212 64212
[一 8月 19 06:17:30 2024] Node 0 DMA32 free:248476kB min:2972kB low:3712kB high:4456kB active_anon:2569480kB inactive_anon:6576kB active_file:1048kB inactive_file:1680kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129188kB managed:2895684kB mlocked:0kB dirty:4kB writeback:0kB mapped:64kB shmem:15304kB slab_reclaimable:37576kB slab_unreclaimable:7568kB kernel_stack:2160kB pagetables:5564kB unstable:0kB bounce:0kB free_pcp:1040kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:4818 all_unreclaimable? yes
[一 8月 19 06:17:30 2024] lowmem_reserve[]: 0 0 61384 61384
[一 8月 19 06:17:30 2024] Node 0 Normal free:64464kB min:64592kB low:80740kB high:96888kB active_anon:60107544kB inactive_anon:197428kB active_file:8368kB inactive_file:11968kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63963136kB managed:62860792kB mlocked:0kB dirty:176kB writeback:0kB mapped:9440kB shmem:257500kB slab_reclaimable:162888kB slab_unreclaimable:128996kB kernel_stack:41328kB pagetables:289616kB unstable:0kB bounce:0kB free_pcp:3728kB local_pcp:36kB free_cma:0kB writeback_tmp:0kB pages_scanned:30829 all_unreclaimable? yes
[一 8月 19 06:17:30 2024] lowmem_reserve[]: 0 0 0 0
[一 8月 19 06:17:30 2024] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15876kB
[一 8月 19 06:17:30 2024] Node 0 DMA32: 1564*4kB (UE) 1906*8kB (UE) 1030*16kB (UEM) 3230*32kB (UEM) 1269*64kB (UEM) 200*128kB (UEM) 2*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 248672kB
[一 8月 19 06:17:30 2024] Node 0 Normal: 16072*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 64288kB
[一 8月 19 06:17:30 2024] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[一 8月 19 06:17:30 2024] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[一 8月 19 06:17:30 2024] 73623 total pagecache pages
[一 8月 19 06:17:30 2024] 0 pages in swap cache
[一 8月 19 06:17:30 2024] Swap cache stats: add 16089, delete 16089, find 33832/34067
[一 8月 19 06:17:30 2024] Free swap  = 0kB
[一 8月 19 06:17:30 2024] Total swap = 0kB
[一 8月 19 06:17:30 2024] 16777079 pages RAM
[一 8月 19 06:17:30 2024] 0 pages HighMem/MovableOnly
[一 8月 19 06:17:30 2024] 333983 pages reserved
[一 8月 19 06:17:30 2024] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[一 8月 19 06:17:30 2024] [  627]     0   627     9765     1565      27        0             0 systemd-journal
[一 8月 19 06:17:30 2024] [  649]     0   649    68076      641      30        0             0 lvmetad
[一 8月 19 06:17:30 2024] [  667]     0   667    11509      346      22        0         -1000 systemd-udevd
[一 8月 19 06:17:30 2024] [  876]     0   876    13883      173      27        0         -1000 auditd
[一 8月 19 06:17:30 2024] [  899]     0   899    13213      270      31        0             0 smartd
[一 8月 19 06:17:30 2024] [  904]   998   904     2145       82      10        0             0 lsmd
[一 8月 19 06:17:30 2024] [  905]     0   905    57104      550      66        0             0 abrtd
[一 8月 19 06:17:30 2024] [  906]     0   906     6596      143      18        0             0 systemd-logind
[一 8月 19 06:17:30 2024] [  907]    81   907    15086      269      33        0          -900 dbus-daemon
[一 8月 19 06:17:30 2024] [  909]     0   909    56479      426      64        0             0 abrt-watch-log
[一 8月 19 06:17:30 2024] [  910]     0   910    22642      260      45        0             0 rngd
[一 8月 19 06:17:30 2024] [  911]    38   911     6954      224      19        0             0 ntpd
[一 8月 19 06:17:30 2024] [  912]     0   912   155555      582      87        0             0 NetworkManager
[一 8月 19 06:17:30 2024] [  918]   999   918   153579     1634      63        0             0 polkitd
[一 8月 19 06:17:30 2024] [  923]     0   923     5420      138      15        0             0 irqbalance
[一 8月 19 06:17:30 2024] [  932]     0   932    31596      221      22        0             0 crond
[一 8月 19 06:17:30 2024] [  936]     0   936     6477       76      19        0             0 atd
[一 8月 19 06:17:30 2024] [  941]     0   941    27551       81      10        0             0 agetty
[一 8月 19 06:17:30 2024] [  942]    32   942    17314      144      37        0             0 rpcbind
[一 8月 19 06:17:30 2024] [ 1187]     0  1187    28225      311      56        0         -1000 sshd
[一 8月 19 06:17:30 2024] [ 1188]     0  1188   143571     2868      97        0             0 tuned
[一 8月 19 06:17:30 2024] [ 1191]     0  1191   125815     1568      97        0             0 rsyslogd
[一 8月 19 06:17:30 2024] [ 1192]   996  1192   181806     3609      33        0             0 node_exporter
[一 8月 19 06:17:30 2024] [ 1193]     0  1193   437635    10211     106        0             0 promtail
[一 8月 19 06:17:30 2024] [ 1377]     0  1377    22948      324      43        0             0 master
[一 8月 19 06:17:30 2024] [ 1448]    89  1448    23018      332      46        0             0 qmgr
[一 8月 19 06:17:30 2024] [73080]  1000 73080 52478457 15571409   72761        0             0 doris_be
[一 8月 19 06:17:30 2024] [ 9554]    89  9554    22974      300      43        0             0 pickup
[一 8月 19 06:17:30 2024] Out of memory: Kill process 73080 (doris_be) score 951 or sacrifice child
[一 8月 19 06:17:30 2024] Killed process 73080 (doris_be), UID 1000, total-vm:209913828kB, anon-rss:62285724kB, file-rss:0kB, shmem-rss:0kB
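
As a rough cross-check of the OOM report above: the per-process table lists `rss` in pages, so it can be converted and compared with the `anon-rss` value on the "Killed process" line. The sketch below assumes a 4 KiB page size (typical for x86_64, but not stated in the log); all constants are copied from the dmesg output, and the helper name `pages_to_gib` is only illustrative.

```python
PAGE_KB = 4  # assumption: 4 KiB pages, not stated in the log

# Values copied from the dmesg output above.
doris_be_rss_pages = 15571409   # rss column for pid 73080 (doris_be)
total_ram_pages = 16777079      # "pages RAM"
reserved_pages = 333983         # "pages reserved"
anon_rss_kb = 62285724          # anon-rss on the "Killed process" line


def pages_to_gib(pages, page_kb=PAGE_KB):
    """Convert a page count to GiB."""
    return pages * page_kb / (1024 * 1024)


usable_pages = total_ram_pages - reserved_pages
rss_gib = pages_to_gib(doris_be_rss_pages)
usable_gib = pages_to_gib(usable_pages)
share = doris_be_rss_pages / usable_pages

print(f"doris_be rss : {rss_gib:.2f} GiB")
print(f"usable RAM   : {usable_gib:.2f} GiB")
print(f"share of RAM : {share:.1%}")
print(f"rss vs anon-rss difference: {abs(doris_be_rss_pages * PAGE_KB - anon_rss_kb)} kB")
```

Under that assumption, doris_be alone held roughly 59 GiB of about 62.7 GiB of usable memory when it was killed, and with `Total swap = 0kB` there was no swap to fall back on.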

The following are some of the abnormal log entries from be.info:

Process Memory Summary:
    os physical memory 62.73 GB. process memory used 53.31 GB, limit 50.18 GB, soft limit 45.16 GB. sys available memory 5.34 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
Memory Tracker Summary:
    Type=experimental, Used=0(0 B), Peak=0(0 B)
    Type=clone, Used=0(0 B), Peak=148.22 KB(151776 B)
    Type=schema_change, Used=0(0 B), Peak=0(0 B)
    Type=compaction, Used=0(0 B), Peak=7.11 GB(7631602386 B)
    Type=load, Used=17.14 GB(18405021358 B), Peak=17.14 GB(18405196078 B)
    Type=query, Used=305.21 MB(320038123 B), Peak=31.74 GB(34085414696 B)
    Type=global, Used=49.36 GB(53000301568 B), Peak=49.36 GB(53000301568 B)
    Type=tc/jemalloc_free_memory, Used=2.87 GB(3076786496 B), Peak=-1.00 B(-1 B)
    Type=process, Used=69.66 GB(74802147545 B), Peak=-1.00 B(-1 B)
    MemTrackerLimiter Label=ObjLRUCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=Orphan, Type=global, Limit=-1.00 B(-1 B), Used=48.22 GB(51780410437 B), Peak=48.29 GB(51850101349 B)
    MemTracker Label=PageNoCache, Parent Label=Orphan, Used=0(0 B), Peak=123.46 MB(129452508 B)
    MemTracker Label=IOBufBlockMemory, Parent Label=Orphan, Used=81.54 MB(85499904 B), Peak=151.13 MB(158474240 B)
    MemTracker Label=InvertedIndexSearcherCache, Parent Label=Orphan, Used=14.81 KB(15168 B), Peak=14.81 KB(15168 B)
    MemTracker Label=SegCompaction, Parent Label=Orphan, Used=1.06 MB(1112000 B), Peak=9.58 MB(10047608 B)
        19# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        20# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        21# start_thread
        22# clone
W0819 06:23:25.689384 245269 thread_mem_tracker_mgr.h:182] malloc or new large memory: 10737418240, in query or load: a57ead15322847f3-9df77c157f536346, this is just a warning, not prevent memory alloc, stacktrace:

        0#  doris::ThreadMemTrackerMgr::consume(long, bool) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
        1#  realloc at /home/zcp/repo_center/doris_release/doris/be/src/runtime/thread_context.h:0
        2#  Allocator<true, true, false>::realloc_impl(void*, unsigned long, unsigned long, unsigned long) at /home/zcp/repo_center/doris_release/doris/be/src/vec/common/allocator.h:170
        6#  doris::vectorized::HashJoinNode::_process_build_block(doris::RuntimeState*, doris::vectorized::Block&, unsigned char) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:357
        7#  doris::vectorized::HashJoinNode::sink(doris::RuntimeState*, doris::vectorized::Block*, bool) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        8#  doris::vectorized::HashJoinNode::_materialize_build_side(doris::RuntimeState*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        9#  doris::vectorized::VJoinNodeBase::open(doris::RuntimeState*) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1147
        10# doris::vectorized::HashJoinNode::open(doris::RuntimeState*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        11# doris::PlanFragmentExecutor::open_vectorized_internal() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        12# doris::PlanFragmentExecutor::open() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:357
        13# doris::FragmentExecState::execute() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:180
        16# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        17# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        18# start_thread
        19# clone
W0819 06:23:27.134289 73123 mem_tracker_limiter.cpp:289]
Process Memory Summary:
    os physical memory 62.73 GB. process memory used 53.06 GB, limit 50.18 GB, soft limit 45.16 GB. sys available memory 5.57 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
Memory Tracker Summary:
    Type=experimental, Used=0(0 B), Peak=0(0 B)
    Type=clone, Used=0(0 B), Peak=148.22 KB(151776 B)
    Type=schema_change, Used=0(0 B), Peak=0(0 B)
    Type=compaction, Used=0(0 B), Peak=7.11 GB(7631602386 B)
    Type=load, Used=17.14 GB(18405196078 B), Peak=17.14 GB(18405196078 B)
    Type=query, Used=304.14 MB(318910715 B), Peak=31.74 GB(34085414696 B)
    Type=global, Used=49.27 GB(52903002877 B), Peak=49.34 GB(52973820098 B)
    Type=tc/jemalloc_free_memory, Used=2.91 GB(3122378624 B), Peak=-1.00 B(-1 B)
    Type=process, Used=69.62 GB(74749488294 B), Peak=-1.00 B(-1 B)
    MemTrackerLimiter Label=ObjLRUCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=Orphan, Type=global, Limit=-1.00 B(-1 B), Used=48.13 GB(51678398189 B), Peak=48.29 GB(51850101349 B)
    MemTracker Label=PageNoCache, Parent Label=Orphan, Used=0(0 B), Peak=123.46 MB(129452508 B)
    MemTracker Label=IOBufBlockMemory, Parent Label=Orphan, Used=82.04 MB(86024192 B), Peak=151.13 MB(158474240 B)
    MemTracker Label=InvertedIndexSearcherCache, Parent Label=Orphan, Used=14.81 KB(15168 B), Peak=14.81 KB(15168 B)
    MemTracker Label=SegCompaction, Parent Label=Orphan, Used=1.06 MB(1112000 B), Peak=9.58 MB(10047608 B)
    MemTracker Label=SegmentMeta, Parent Label=Orphan, Used=2.24 KB(2297 B), Peak=32.86 MB(34458520 B)
    MemTracker Label=SnapshotManager, Parent Label=Orphan, Used=48.84 MB(51208240 B), Peak=72.00 MB(75501472 B)
    MemTrackerLimiter Label=DataPageCache, Type=global, Limit=-1.00 B(-1 B), Used=839.20 MB(879963181 B), Peak=9.01 GB(9675602506 B)
    MemTrackerLimiter Label=IndexPageCache, Type=global, Limit=-1.00 B(-1 B), Used=86.51 MB(90713398 B), Peak=591.59 MB(620331232 B)
    MemTrackerLimiter Label=PKIndexPageCache, Type=global, Limit=-1.00 B(-1 B), Used=55.56 MB(58262002 B), Peak=120.15 MB(125984616 B)
    MemTrackerLimiter Label=RowCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=SegmentCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=SchemaCache, Type=global, Limit=-1.00 B(-1 B), Used=11.84 MB(12410552 B), Peak=204.22 MB(214142465 B)
    MemTrackerLimiter Label=LookupConnectionCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=InvertedIndexSearcherCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=InvertedIndexQueryCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=LastestSuccessChannelCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=TabletVersionCache, Type=global, Limit=-1.00 B(-1 B), Used=3.00 MB(3145917 B), Peak=3.00 MB(3145917 B)
    MemTrackerLimiter Label=CreateTabletRRIdxCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=DeleteBitmap AggCache, Type=global, Limit=-1.00 B(-1 B), Used=171.82 MB(180167494 B), Peak=171.82 MB(180167494 B)
    MemTrackerLimiter Label=Load#Id=a57ead15322847f3-9df77c157f5362bc, Type=load, Limit=8.00 GB(8589934592 B), Used=17.09 GB(18350461982 B), Peak=17.09 GB(18351576094 B)
    MemTrackerLimiter Label=Query#Id=9b640a3f4e644ef4-b4628f2935cadfe1, Type=query, Limit=2.00 GB(2147483648 B), Used=113.70 MB(119219475 B), Peak=113.70 MB(119219475 B)
    MemTrackerLimiter Label=Query#Id=dfa29facb31e4fc9-8ccd2f5f55d03c3d, Type=query, Limit=8.00 GB(8589934592 B), Used=81.65 MB(85615480 B), Peak=85.25 MB(89388216 B)
    MemTrackerLimiter Label=Query#Id=8bcc269784c4465d-8737c813bd6a31aa, Type=query, Limit=8.00 GB(8589934592 B), Used=40.81 MB(42795480 B), Peak=118.26 MB(124009416 B)
    MemTrackerLimiter Label=Load#Id=d8020c07df434f3f-88d40d7be053ae1a, Type=load, Limit=2.00 GB(2147483648 B), Used=25.50 MB(26741712 B), Peak=25.50 MB(26741712 B)
    MemTrackerLimiter Label=Load#Id=db452b6db54a489c-b92f7d660a474dda, Type=load, Limit=2.00 GB(2147483648 B), Used=25.34 MB(26566976 B), Peak=26.51 MB(27796896 B)
    MemTrackerLimiter Label=Query#Id=7cb50928fc8443d4-bc9ea98273e637ba, Type=query, Limit=8.00 GB(8589934592 B), Used=20.83 MB(21841368 B), Peak=29.96 MB(31411840 B)
    MemTrackerLimiter Label=Query#Id=1c596c74d24a4f2e-b2e2e128dbfecac7, Type=query, Limit=8.00 GB(8589934592 B), Used=20.15 MB(21130416 B), Peak=29.41 MB(30837824 B)
    MemTrackerLimiter Label=Query#Id=b92fdb9bdb70476c-a7885c12e76e8f80, Type=query, Limit=8.00 GB(8589934592 B), Used=11.27 MB(11822536 B), Peak=11.27 MB(11822536 B)
    MemTrackerLimiter Label=Query#Id=6eb60bbd89db49ab-bfe42ebeed2adb62, Type=query, Limit=8.00 GB(8589934592 B), Used=10.83 MB(11354080 B), Peak=10.83 MB(11354080 B)
    MemTrackerLimiter Label=Query#Id=b0ed3491fec1432e-a25e77aa71399dec, Type=query, Limit=8.00 GB(8589934592 B), Used=2.57 MB(2697568 B), Peak=3.31 MB(3472272 B)
    MemTrackerLimiter Label=LoadChannelMgr, Type=load, Limit=-1.00 B(-1 B), Used=1.36 MB(1425408 B), Peak=3.88 GB(4164400302 B)
    MemTrackerLimiter Label=Query#Id=50fd9b67d0b7436c-ba1db46d4bdf050c, Type=query, Limit=8.00 GB(8589934592 B), Used=1.35 MB(1418656 B), Peak=1.35 MB(1418656 B)
    MemTrackerLimiter Label=Query#Id=701d2ea5757a4859-a5aa3410a8c735c2, Type=query, Limit=8.00 GB(8589934592 B), Used=784.09 KB(802904 B), Peak=784.09 KB(802904 B)
    MemTrackerLimiter Label=Query#Id=5213076aac874065-88b8083ff8e8a3d1, Type=query, Limit=8.00 GB(8589934592 B), Used=221.50 KB(226816 B), Peak=221.50 KB(226816 B)
    MemTrackerLimiter Label=LoadChannelMgr, Type=load, Limit=-1.00 B(-1 B), Used=1.36 MB(1425408 B), Peak=3.88 GB(4164400302 B)
W0819 06:23:27.433091 73123 mem_tracker_limiter.cpp:289]
Process Memory Summary:
    os physical memory 62.73 GB. process memory used 53.31 GB, limit 50.18 GB, soft limit 45.16 GB. sys available memory 5.34 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
Memory Tracker Summary:
    Type=experimental, Used=0(0 B), Peak=0(0 B)
    Type=clone, Used=0(0 B), Peak=148.22 KB(151776 B)
    Type=schema_change, Used=0(0 B), Peak=0(0 B)
    Type=compaction, Used=0(0 B), Peak=7.11 GB(7631602386 B)
    Type=load, Used=17.14 GB(18405021358 B), Peak=17.14 GB(18405196078 B)
    Type=query, Used=305.21 MB(320038123 B), Peak=31.74 GB(34085414696 B)
    Type=global, Used=49.36 GB(53000301568 B), Peak=49.36 GB(53000301568 B)
    Type=tc/jemalloc_free_memory, Used=2.87 GB(3076786496 B), Peak=-1.00 B(-1 B)
    Type=process, Used=69.66 GB(74802147545 B), Peak=-1.00 B(-1 B)
    MemTrackerLimiter Label=ObjLRUCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=Orphan, Type=global, Limit=-1.00 B(-1 B), Used=48.22 GB(51780410437 B), Peak=48.29 GB(51850101349 B)
    MemTracker Label=PageNoCache, Parent Label=Orphan, Used=0(0 B), Peak=123.46 MB(129452508 B)
    MemTracker Label=IOBufBlockMemory, Parent Label=Orphan, Used=81.54 MB(85499904 B), Peak=151.13 MB(158474240 B)
    MemTracker Label=InvertedIndexSearcherCache, Parent Label=Orphan, Used=14.81 KB(15168 B), Peak=14.81 KB(15168 B)
    MemTracker Label=SegCompaction, Parent Label=Orphan, Used=1.06 MB(1112000 B), Peak=9.58 MB(10047608 B)
    MemTracker Label=SegmentMeta, Parent Label=Orphan, Used=2.24 KB(2297 B), Peak=32.86 MB(34458520 B)
    MemTracker Label=SnapshotManager, Parent Label=Orphan, Used=48.84 MB(51208240 B), Peak=72.00 MB(75501472 B)
    MemTrackerLimiter Label=DataPageCache, Type=global, Limit=-1.00 B(-1 B), Used=839.20 MB(879963181 B), Peak=9.01 GB(9675602506 B)
    MemTrackerLimiter Label=IndexPageCache, Type=global, Limit=-1.00 B(-1 B), Used=81.96 MB(85941985 B), Peak=591.59 MB(620331232 B)
    MemTrackerLimiter Label=PKIndexPageCache, Type=global, Limit=-1.00 B(-1 B), Used=55.56 MB(58262002 B), Peak=120.15 MB(125984616 B)
    MemTrackerLimiter Label=RowCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=SegmentCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=SchemaCache, Type=global, Limit=-1.00 B(-1 B), Used=11.84 MB(12410552 B), Peak=204.22 MB(214142465 B)
    MemTrackerLimiter Label=LookupConnectionCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=InvertedIndexSearcherCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=InvertedIndexQueryCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=LastestSuccessChannelCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=TabletVersionCache, Type=global, Limit=-1.00 B(-1 B), Used=3.00 MB(3145917 B), Peak=3.00 MB(3145917 B)
    MemTrackerLimiter Label=CreateTabletRRIdxCache, Type=global, Limit=-1.00 B(-1 B), Used=0(0 B), Peak=0(0 B)
    MemTrackerLimiter Label=DeleteBitmap AggCache, Type=global, Limit=-1.00 B(-1 B), Used=171.82 MB(180167494 B), Peak=171.82 MB(180167494 B)
    MemTrackerLimiter Label=Load#Id=a57ead15322847f3-9df77c157f5362bc, Type=load, Limit=8.00 GB(8589934592 B), Used=17.09 GB(18350461982 B), Peak=17.09 GB(18351576094 B)
    MemTrackerLimiter Label=Query#Id=9b640a3f4e644ef4-b4628f2935cadfe1, Type=query, Limit=2.00 GB(2147483648 B), Used=113.70 MB(119219475 B), Peak=113.70 MB(119219475 B)
    MemTrackerLimiter Label=Query#Id=dfa29facb31e4fc9-8ccd2f5f55d03c3d, Type=query, Limit=8.00 GB(8589934592 B), Used=82.71 MB(86728824 B), Peak=85.25 MB(89388216 B)
    MemTrackerLimiter Label=Query#Id=8bcc269784c4465d-8737c813bd6a31aa, Type=query, Limit=8.00 GB(8589934592 B), Used=40.81 MB(42795480 B), Peak=118.26 MB(124009416 B)
    MemTrackerLimiter Label=Load#Id=d8020c07df434f3f-88d40d7be053ae1a, Type=load, Limit=2.00 GB(2147483648 B), Used=25.34 MB(26566992 B), Peak=26.51 MB(27796912 B)
    MemTrackerLimiter Label=Load#Id=db452b6db54a489c-b92f7d660a474dda, Type=load, Limit=2.00 GB(2147483648 B), Used=25.34 MB(26566976 B), Peak=26.51 MB(27796896 B)
    MemTrackerLimiter Label=Query#Id=7cb50928fc8443d4-bc9ea98273e637ba, Type=query, Limit=8.00 GB(8589934592 B), Used=20.83 MB(21841368 B), Peak=29.96 MB(31411840 B)
    MemTrackerLimiter Label=Query#Id=1c596c74d24a4f2e-b2e2e128dbfecac7, Type=query, Limit=8.00 GB(8589934592 B), Used=20.15 MB(21130416 B), Peak=29.41 MB(30837824 B)
    MemTrackerLimiter Label=Query#Id=b92fdb9bdb70476c-a7885c12e76e8f80, Type=query, Limit=8.00 GB(8589934592 B), Used=11.27 MB(11822536 B), Peak=11.27 MB(11822536 B)
    MemTrackerLimiter Label=Query#Id=6eb60bbd89db49ab-bfe42ebeed2adb62, Type=query, Limit=8.00 GB(8589934592 B), Used=10.83 MB(11354080 B), Peak=10.83 MB(11354080 B)
    MemTrackerLimiter Label=Query#Id=b0ed3491fec1432e-a25e77aa71399dec, Type=query, Limit=8.00 GB(8589934592 B), Used=2.57 MB(2697568 B), Peak=3.31 MB(3472272 B)
    MemTrackerLimiter Label=LoadChannelMgr, Type=load, Limit=-1.00 B(-1 B), Used=1.36 MB(1425408 B), Peak=3.88 GB(4164400302 B)
    MemTrackerLimiter Label=Query#Id=50fd9b67d0b7436c-ba1db46d4bdf050c, Type=query, Limit=8.00 GB(8589934592 B), Used=1.35 MB(1418656 B), Peak=1.35 MB(1418656 B)
    MemTrackerLimiter Label=Query#Id=701d2ea5757a4859-a5aa3410a8c735c2, Type=query, Limit=8.00 GB(8589934592 B), Used=784.09 KB(802904 B), Peak=784.09 KB(802904 B)
    MemTrackerLimiter Label=Query#Id=5213076aac874065-88b8083ff8e8a3d1, Type=query, Limit=8.00 GB(8589934592 B), Used=221.50 KB(226816 B), Peak=221.50 KB(226816 B)
    MemTrackerLimiter Label=LoadChannelMgr, Type=load, Limit=-1.00 B(-1 B), Used=1.36 MB(1425408 B), Peak=3.88 GB(4164400302 B)
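
One way to scan a summary like the one above for trackers that exceed their own limit is to parse the `MemTrackerLimiter` lines and compare the byte counts in parentheses. This is a minimal sketch: the regular expression and the function name `over_limit` are illustrative and not part of Doris, and the three sample lines are copied from the log above.

```python
import re

# Matches e.g. "MemTrackerLimiter Label=X, Type=load, Limit=8.00 GB(8589934592 B), Used=17.09 GB(18350461982 B)"
LINE_RE = re.compile(
    r"MemTrackerLimiter Label=(?P<label>.+?), Type=(?P<type>\w+), "
    r"Limit=[^(]*\((?P<limit>-?\d+) B\), "
    r"Used=[^(]*\((?P<used>-?\d+) B\)"
)


def over_limit(lines):
    """Return (label, used_bytes, limit_bytes) for trackers whose Used exceeds a positive Limit."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        limit, used = int(m.group("limit")), int(m.group("used"))
        # Limit=-1 means "no limit" in this output, so skip those trackers.
        if limit > 0 and used > limit:
            hits.append((m.group("label"), used, limit))
    return hits


sample = [
    "MemTrackerLimiter Label=Orphan, Type=global, Limit=-1.00 B(-1 B), Used=48.22 GB(51780410437 B), Peak=48.29 GB(51850101349 B)",
    "MemTrackerLimiter Label=Load#Id=a57ead15322847f3-9df77c157f5362bc, Type=load, Limit=8.00 GB(8589934592 B), Used=17.09 GB(18350461982 B), Peak=17.09 GB(18351576094 B)",
    "MemTrackerLimiter Label=Query#Id=9b640a3f4e644ef4-b4628f2935cadfe1, Type=query, Limit=2.00 GB(2147483648 B), Used=113.70 MB(119219475 B), Peak=113.70 MB(119219475 B)",
]

result = over_limit(sample)
for label, used, limit in result:
    print(f"{label}: used {used / 2**30:.2f} GiB, limit {limit / 2**30:.2f} GiB")
```

On these sample lines only the `Load#Id=a57ead15322847f3-9df77c157f5362bc` tracker is flagged, at about 17.09 GB used against an 8.00 GB limit. Its ID shares the `a57ead15322847f3` prefix with the ID in the "malloc or new large memory" warnings.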
2 Answers

Additional logs

 thread_mem_tracker_mgr.h:182] malloc or new large memory: 2684354560, in query or load: a57ead15322847f3-9df77c157f536346, this is just a warning, not prevent memory alloc, stacktrace:

        0#  doris::ThreadMemTrackerMgr::consume(long, bool) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
        1#  realloc at /home/zcp/repo_center/doris_release/doris/be/src/runtime/thread_context.h:0
        2#  Allocator<true, true, false>::realloc_impl(void*, unsigned long, unsigned long, unsigned long) at /home/zcp/repo_center/doris_release/doris/be/src/vec/common/allocator.h:170
        3#  HashTable<doris::vectorized::UInt128, HashMapCell<doris::vectorized::UInt128, doris::vectorized::RowRefList, HashCRC32<doris::vectorized::UInt128>, HashTableNoState>, HashCRC32<doris::vectorized::UInt128>, PartitionedHashTableGrower<8ul>, Allocator<true, true, false> >::resize(unsigned long, unsigned long) at /home/zcp/repo_center/doris_release/doris/be/src/vec/common/hash_table/hash_table.h:0
        4#  doris::Status doris::vectorized::ProcessHashTableBuild<doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt128, false, doris::vectorized::RowRefList> >::run<false, false>(doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt128, false, doris::vectorized::RowRefList>&, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false>, 15ul, 16ul> const*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/common/hash_table/hash_table.h:0
        5#  _ZNSt8__detail9__variant17__gen_vtable_implINS0_12_Multi_arrayIPFNS0_21__deduce_visit_resultIN5doris6StatusEEEONS4_10vectorized8OverloadIJZNS7_12HashJoinNode20_process_build_blockEPNS4_12RuntimeStateERNS7_5BlockEhE3$_0ZNS9_20_process_build_blockESB_SD_hE3$_1EEERSt7variantIJSt9monostateNS7_26SerializedHashTableContextINS7_10RowRefListEEENS7_27PrimaryTypeHashTableContextIhSL_EENSN_ItSL_EENSN_IjSL_EENSN_ImSL_EENSN_INS7_7UInt128ESL_EENSN_INS7_7UInt256ESL_EENS7_24FixedKeyHashTableContextImLb1ESL_EENSW_ImLb0ESL_EENSW_ISS_Lb1ESL_EENSW_ISS_Lb0ESL_EENSW_ISU_Lb1ESL_EENSW_ISU_Lb0ESL_EENSK_INS7_18RowRefListWithFlagEEENSN_IhS13_EENSN_ItS13_EENSN_IjS13_EENSN_ImS13_EENSN_ISS_S13_EENSN_ISU_S13_EENSW_ImLb1ES13_EENSW_ImLb0ES13_EENSW_ISS_Lb1ES13_EENSW_ISS_Lb0ES13_EENSW_ISU_Lb1ES13_EENSW_ISU_Lb0ES13_EENSK_INS7_19RowRefListWithFlagsEEENSN_IhS1H_EENSN_ItS1H_EENSN_IjS1H_EENSN_ImS1H_EENSN_ISS_S1H_EENSN_ISU_S1H_EENSW_ImLb1ES1H_EENSW_ImLb0ES1H_EENSW_ISS_Lb1ES1H_EENSW_ISS_Lb0ES1H_EENSW_ISU_Lb1ES1H_EENSW_ISU_Lb0ES1H_EEEEOSI_IJSt17integral_constantIbLb0EES1X_IbLb1EEEES21_EJEEESt16integer_sequenceImJLm11ELm0ELm0EEEE14__visit_invokeESH_S1W_S21_S21_ at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:335
        6#  doris::vectorized::HashJoinNode::_process_build_block(doris::RuntimeState*, doris::vectorized::Block&, unsigned char) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:357
        7#  doris::vectorized::HashJoinNode::sink(doris::RuntimeState*, doris::vectorized::Block*, bool) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        8#  doris::vectorized::HashJoinNode::_materialize_build_side(doris::RuntimeState*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        9#  doris::vectorized::VJoinNodeBase::open(doris::RuntimeState*) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1147
        10# doris::vectorized::HashJoinNode::open(doris::RuntimeState*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        11# doris::PlanFragmentExecutor::open_vectorized_internal() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        12# doris::PlanFragmentExecutor::open() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:357
        13# doris::FragmentExecState::execute() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:180
        14# doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        15# std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
        16# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        17# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        18# start_thread
        19# clone

[Issue status] Under follow-up
[Issue handling] Root cause is being located; any progress will be posted to the forum