请教一个问题,试用了DorisV2.1.0的memtable前置功能,可以使INSERT INTO SELECT速度快1倍,在查阅官方文档以及源码的时候,只找到了在stream load导入文件数据到表中这一种方法(-H"memtable_on_sink_node:true"),但在测试2.1二进制arm64版本,插入tpch测试数据customer.tbl的内容的时候
第一次测试结果如下:
不设置memtable_on_sink_node配置,1副本平均为17s,4副本平均为41s
curl --location-trusted -u root: -H "column_separator:|" -H "columns: c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment, temp" -T customer.tbl http://192.168.9.103:38030/api/cjhtest/memtabletest/_stream_load
设置memtable_on_sink_node为false,1副本平均为17s,4副本平均为41s
curl --location-trusted -u root: -H"memtable_on_sink_node:false" -H "column_separator:|" -H "columns: c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment, temp" -T customer.tbl http://192.168.9.103:38030/api/cjhtest/memtabletest/_stream_load
设置memtable_on_sink_node为true,1副本平均为23s,4副本平均为51s
curl --location-trusted -u root: -H"memtable_on_sink_node:true" -H "column_separator:|" -H "columns: c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment, temp" -T customer.tbl http://192.168.9.103:38030/api/cjhtest/memtabletest/_stream_load
第二次测试时在mysql连接中设置:
SET GLOBAL enable_memtable_on_sink_node = true;
也差不多,不设true时1副本耗时22s,3副本耗时30s;设为true时也是1副本耗时22s,3副本耗时30s。
硬件配置:
CPU型号:HUAWEI kunpeng920 5250
内存:32G x 8(=256G)
硬盘:SATA 8T8 + SSD 960G2
系统版本:银河麒麟v10
这么看下来,设置了memtable前置之后,速度几乎不变甚至还变慢了,是我理解错了或是使用方法不对吗?
===================2024.3.26 11:05====================
在新的测试中发现,不管enable_memtable_on_sink_node是true是false,不管是否重启了集群,be.INFO的日志并不是前置设置true显示load_stream,设置false显示load_channel,而是每3-4次load_channel之后,出现一次load_stream。而且stream与channel的导入耗时是相同的。
具体load_stream日志如下:
I20240326 09:27:11.849537 3595846 stream_load.cpp:198] new income streaming load request.id=5945a84ef1fa6dcd-041277498bf1b09a, job_id=-1, txn_id=-1, label=aa283492-f0f5-4e12-9d98-77361a3eddf1, elapse(s)=0, db=cjhtest, tbl=memtabletest, group_commit=0
I20240326 09:27:12.066603 3595846 stream_load_executor.cpp:71] begin to execute stream load. label=aa283492-f0f5-4e12-9d98-77361a3eddf1, txn_id=2, query_id=0-0
I20240326 09:27:12.066700 3595846 fragment_mgr.cpp:623] query_id: 5945a84ef1fa6dcd-041277498bf1b09a coord_addr TNetworkAddress(hostname=192.168.99.92, port=39020) total fragment num on current host: 0 fe process uuid: 0
I20240326 09:27:12.066722 3595846 fragment_mgr.cpp:648] Query/load id: 5945a84ef1fa6dcd-41277498bf1b09a, use task group: TG[id = 1, name = normal, cpu_share = 1024, memory_limit = 68.60 GB, enable_memory_overcommit = true, version = 0, cpu_hard_limit = -1, scan_thread_num = 192, max_remote_scan_thread_num = 192, min_remote_scan_thread_num = 192], is pipeline: 1, enable cgroup soft limit: 1
I20240326 09:27:12.066756 3595846 fragment_mgr.cpp:661] Register query/load memory tracker, query/load id: 5945a84ef1fa6dcd-41277498bf1b09a limit: 0
I20240326 09:27:12.066777 3595846 pipeline_fragment_context.cpp:240] Preparing instance 5945a84ef1fa6dcd-41277498bf1b09a|5945a84ef1fa6dcd-41277498bf1b09b, backend_num 0
I20240326 09:27:12.067104 3595846 stream_load.cpp:204] finished to handle HTTP header, id=5945a84ef1fa6dcd-041277498bf1b09a, job_id=-1, txn_id=2, label=aa283492-f0f5-4e12-9d98-77361a3eddf1, elapse(s)=0
I20240326 09:27:12.068594 3595119 tablets_channel.cpp:134] open tablets channel: (load_id=5945a84ef1fa6dcd-041277498bf1b09a, index_id=10177), tablets num: 12, timeout(s): 259200
I20240326 09:27:12.598863 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 1.55 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 64.51 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
...
I20240326 09:27:20.736198 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 3.27 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 61.86 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:27:21.135390 3594502 load_channel_mgr.cpp:227] cleaning timed out load channels
I20240326 09:27:21.185364 3594603 storage_engine.cpp:1077] start to delete unused rowset, size: 0
I20240326 09:27:21.185497 3594603 storage_engine.cpp:1110] collected 0 unused rowsets to remove, skipped 0 rowsets due to use count > 1, skipped 0 rowsets due to don't need to delete file, skipped 0 rowsets due to delayed expired timestamp.
I20240326 09:27:21.185508 3594603 storage_engine.cpp:1126] removed all collected unused rowsets
I20240326 09:27:21.187103 3594641 olap_server.cpp:1104] cooldown producer get tablet num: 0
I20240326 09:27:21.350066 3594503 wal_manager.cpp:473] Scheduled(every 10s) WAL info: [/opt/software/apache-doris-2.1.0-bin-arm64/be/storage/wal: limit 54093053952 Bytes, used 0 Bytes, estimated wal bytes 0 Bytes, available 54093053952 Bytes.];
I20240326 09:27:22.142727 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 3.65 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 61.60 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
...
I20240326 09:27:39.731673 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 6.18 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 60.06 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:27:39.776759 3598885 topic_subscriber.cpp:43] begin handle topic info
I20240326 09:27:39.776834 3598885 workload_group_listener.cpp:55] update task group finish, tg info=TG[id = 1, name = normal, cpu_share = 1024, memory_limit = 68.60 GB, enable_memory_overcommit = true, version = 0, cpu_hard_limit = -1, scan_thread_num = 192, max_remote_scan_thread_num = 192, min_remote_scan_thread_num = 192], enable_cpu_hard_limit=false, cgroup cpu_shares=0, cgroup cpu_hard_limit=0, enable_cgroup_cpu_soft_limit=true, cgroup home path=
I20240326 09:27:39.776870 3598885 cgroup_cpu_ctl.cpp:32] doris cgroup cpu path is not specify, path=
I20240326 09:27:39.776883 3598885 task_group_manager.cpp:122] init task group mgr cpu ctl failed, [INTERNAL_ERROR]doris cgroup cpu path is not specify.
I20240326 09:27:39.776896 3598885 task_group_manager.cpp:133] finish clear unused task group, time cost: 0ms, deleted group size:0
I20240326 09:27:39.776906 3598885 topic_subscriber.cpp:48] handle topic WORKLOAD_GROUP successfully
I20240326 09:27:39.776912 3598885 workload_sched_policy_listener.cpp:73] [workload_schedule]finish update workload schedule policy, size=0
I20240326 09:27:39.776922 3598885 topic_subscriber.cpp:48] handle topic WORKLOAD_SCHED_POLICY successfully
I20240326 09:27:39.931135 3600597 vtablet_writer.cpp:961] VNodeChannel[10177-10009], load_id=5945a84ef1fa6dcd-41277498bf1b09a, txn_id=2, node=192.168.9.93:38060 mark closed, left pending batch size: 24
I20240326 09:27:39.931187 3600597 vtablet_writer.cpp:961] VNodeChannel[10177-10008], load_id=5945a84ef1fa6dcd-41277498bf1b09a, txn_id=2, node=192.168.9.92:38060 mark closed, left pending batch size: 24
I20240326 09:27:39.931200 3600597 vtablet_writer.cpp:961] VNodeChannel[10177-10007], load_id=5945a84ef1fa6dcd-41277498bf1b09a, txn_id=2, node=192.168.9.91:38060 mark closed, left pending batch size: 1
I20240326 09:27:39.940634 3594823 tablets_channel.cpp:226] close tablets channel: (load_id=5945a84ef1fa6dcd-041277498bf1b09a, index_id=10177), sender id: 0, backend id: 10007
I20240326 09:27:40.137478 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 6.48 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 59.79 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:27:40.432672 3595731 vtablet_writer.cpp:1006] All node channels are stopped(maybe finished/offending/cancelled), sender thread exit. 5945a84ef1fa6dcd-41277498bf1b09a
I20240326 09:27:41.044422 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 6.10 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 60.48 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:27:41.187397 3594641 olap_server.cpp:1104] cooldown producer get tablet num: 0
I20240326 09:27:41.381980 3594503 wal_manager.cpp:473] Scheduled(every 10s) WAL info: [/opt/software/apache-doris-2.1.0-bin-arm64/be/storage/wal: limit 53779223756 Bytes, used 0 Bytes, estimated wal bytes 0 Bytes, available 53779223756 Bytes.];
I20240326 09:27:41.559689 3594823 load_channel.cpp:55] load channel removed load_id=5945a84ef1fa6dcd-041277498bf1b09a, is high priority=0, sender_ip=192.168.9.91, index id: 10177, total_received_rows: 15000000, num_rows_filtered: 0
I20240326 09:27:41.950939 3595895 daemon.cpp:215] os physical memory 254.09 GB. process memory used 5.81 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 60.94 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:27:42.273818 3600597 vtablet_writer.cpp:1504] total mem_exceeded_block_ns=14547370890, total queue_push_lock_ns=92232080, total actual_consume_ns=19462089200, load id=5945a84ef1fa6dcd-41277498bf1b09a
I20240326 09:27:42.273862 3600597 vtablet_writer.cpp:1551] finished to close olap table sink. load_id=5945a84ef1fa6dcd-41277498bf1b09a, txn_id=2, node add batch time(ms)/wait execution time(ms)/close time(ms)/num: {10007:(8835)(31)(2342)(1231)} {10008:(7567)(20)(2342)(1231)} {10009:(7722)(23)(2258)(1231)}
I20240326 09:27:42.274003 3600177 fragment_mgr.cpp:570] Removing query 5945a84ef1fa6dcd-41277498bf1b09a instance 5945a84ef1fa6dcd-41277498bf1b09b, all done? true
I20240326 09:27:42.274056 3600177 fragment_mgr.cpp:576] Query 5945a84ef1fa6dcd-41277498bf1b09a finished
I20240326 09:27:42.280620 3593103 stream_load_executor.cpp:152] finished to execute stream load. label=aa283492-f0f5-4e12-9d98-77361a3eddf1, txn_id=2, query_id=0-0, receive_data_cost_ms=1095, read_data_cost_ms=26878, write_data_cost_ms=30214
具体load_channel日志如下:
I20240326 09:33:03.737669 3637924 tablets_channel.cpp:134] open tablets channel: (load_id=b34e00a0430ae32d-6219d209907d6a95, index_id=10381), tablets num: 12, timeout(s): 259200
I20240326 09:33:05.474531 3638686 daemon.cpp:215] os physical memory 254.09 GB. process memory used 1.58 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 64.41 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:33:05.861790 3637408 storage_engine.cpp:1077] start to delete unused rowset, size: 0
I20240326 09:33:05.861830 3637408 storage_engine.cpp:1110] collected 0 unused rowsets to remove, skipped 0 rowsets due to use count > 1, skipped 0 rowsets due to don't need to delete file, skipped 0 rowsets due to delayed expired timestamp.
I20240326 09:33:05.861840 3637408 storage_engine.cpp:1126] removed all collected unused rowsets
I20240326 09:33:05.952747 3637304 wal_manager.cpp:473] Scheduled(every 10s) WAL info: [/opt/software/apache-doris-2.1.0-bin-arm64/be/storage/wal: limit 53952324403 Bytes, used 0 Bytes, estimated wal bytes 0 Bytes, available 53952324403 Bytes.];
I20240326 09:33:08.082522 3638686 daemon.cpp:215] os physical memory 254.09 GB. process memory used 1.85 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 63.91 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:33:09.721789 3638865 backend_service.cpp:665] get_batch stream_load_record rocksdb successfully. records size: 0, last_stream_load_timestamp: -1
I20240326 09:33:09.811273 3638865 topic_subscriber.cpp:43] begin handle topic info
I20240326 09:33:09.811316 3638865 workload_group_listener.cpp:55] update task group finish, tg info=TG[id = 1, name = normal, cpu_share = 1024, memory_limit = 68.60 GB, enable_memory_overcommit = true, version = 0, cpu_hard_limit = -1, scan_thread_num = 192, max_remote_scan_thread_num = 192, min_remote_scan_thread_num = 192], enable_cpu_hard_limit=false, cgroup cpu_shares=0, cgroup cpu_hard_limit=0, enable_cgroup_cpu_soft_limit=true, cgroup home path=
I20240326 09:33:09.811360 3638865 cgroup_cpu_ctl.cpp:32] doris cgroup cpu path is not specify, path=
I20240326 09:33:09.811372 3638865 task_group_manager.cpp:122] init task group mgr cpu ctl failed, [INTERNAL_ERROR]doris cgroup cpu path is not specify.
I20240326 09:33:09.811383 3638865 task_group_manager.cpp:133] finish clear unused task group, time cost: 0ms, deleted group size:0
I20240326 09:33:09.811389 3638865 topic_subscriber.cpp:48] handle topic WORKLOAD_GROUP successfully
I20240326 09:33:09.811399 3638865 workload_sched_policy_listener.cpp:73] [workload_schedule]finish update workload schedule policy, size=0
I20240326 09:33:09.811407 3638865 topic_subscriber.cpp:48] handle topic WORKLOAD_SCHED_POLICY successfully
I20240326 09:33:10.390019 3638686 daemon.cpp:215] os physical memory 254.09 GB. process memory used 2.13 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 63.38 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
...
I20240326 09:33:15.863443 3637446 olap_server.cpp:1104] cooldown producer get tablet num: 0
I20240326 09:33:15.964967 3637304 wal_manager.cpp:473] Scheduled(every 10s) WAL info: [/opt/software/apache-doris-2.1.0-bin-arm64/be/storage/wal: limit 53952336691 Bytes, used 0 Bytes, estimated wal bytes 0 Bytes, available 53952336691 Bytes.];
I20240326 09:33:16.815279 3638686 daemon.cpp:215] os physical memory 254.09 GB. process memory used 3.37 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 61.90 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
...
I20240326 09:33:34.297698 3637618 tablets_channel.cpp:226] close tablets channel: (load_id=b34e00a0430ae32d-6219d209907d6a95, index_id=10381), sender id: 0, backend id: 10007
I20240326 09:33:34.988812 3638686 daemon.cpp:215] os physical memory 254.09 GB. process memory used 5.34 GB, limit 228.68 GB, soft limit 205.81 GB. sys available memory 61.55 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
I20240326 09:33:35.804965 3637618 load_channel.cpp:55] load channel removed load_id=b34e00a0430ae32d-6219d209907d6a95, is high priority=0, sender_ip=192.168.9.92, index id: 10381, total_received_rows: 15000000, num_rows_filtered: 0
I20240326 09:33:35.816023 3638865 task_worker_pool.cpp:328] successfully submit task|type=PUBLISH_VERSION|signature=3
I20240326 09:33:35.821591 3637473 engine_publish_version_task.cpp:395] publish version successfully on tablet, table_id=10380, tablet=10386, transaction_id=3, version=2, num_rows=1250558, res=[OK], cost: 5507(us)
...
I20240326 09:33:35.827634 3637474 engine_publish_version_task.cpp:395] publish version successfully on tablet, table_id=10380, tablet=10414, transaction_id=3, version=2, num_rows=1249754, res=[OK], cost: 11493(us)
I20240326 09:33:35.827831 3637516 engine_publish_version_task.cpp:319] finish to publish version on transaction.transaction_id=3, cost(us): 11824, error_tablet_size=0, res=[OK]
I20240326 09:33:35.827873 3637516 task_worker_pool.cpp:1558] successfully publish version|signature=3|transaction_id=3|tablets_num=12|cost(s)=0
I20240326 09:33:35.840996 3637303 load_channel_mgr.cpp:227] cleaning timed out load channels
I20240326 09:33:35.861904 3637408 storage_engine.cpp:1077] start to delete unused rowset, size: 0
I20240326 09:33:35.861924 3637408 storage_engine.cpp:1110] collected 0 unused rowsets to remove, skipped 0 rowsets due to use count > 1, skipped 0 rowsets due to don't need to delete file, skipped 0 rowsets due to delayed expired timestamp.
I20240326 09:33:35.861932 3637408 storage_engine.cpp:1126] removed all collected unused rowsets
I20240326 09:33:35.863641 3637446 olap_server.cpp:1104] cooldown producer get tablet num: 0