聚合模型下 不同的select 表达式,性能差异大

Viewed 53

建表语句:

CREATE TABLE `dws_mediago_bidder_dsp_multi_measures_hourly` (
  `campaign_id` int(11) NULL,
  `asset_id` int(11) NULL,
  `ssp` int(11) NULL,
  `ad_id` int(11) NULL,
  `crid` varchar(50) NULL,
  `domain` varchar(2000) NULL,
  `ip_country` varchar(200) NULL,
  `account_id` varchar(32) NULL,
  `account_name` varchar(500) NULL,
  `company_id` varchar(32) NULL,
  `company_name` varchar(500) NULL,
  `am_name` varchar(50) NULL,
  `platform_type` varchar(10) NULL,
  `account_category` varchar(100) NULL,
  `company_region` varchar(20) NULL,
  `charge_type` varchar(20) NULL,
  `target_cpa` DECIMAL(15, 9) NULL,
  `d_s` date NULL,
  `h_s` int(11) NULL,
  `ad_count` bigint(20) SUM NULL,
  `all_req_num` bigint(20) SUM NULL,
  `account_gross_click_cost` double SUM NULL,
  `click` bigint(20) SUM NULL,
  `click_cost` double SUM NULL,
  `conversion` bigint(20) SUM NULL,
  `cv` bigint(20) SUM NULL,
  `imp` bigint(20) SUM NULL,
  `imp_cost` double SUM NULL,
  `vimp` bigint(20) SUM NULL,
  `mcv` bigint(20) SUM NULL,
  `flr_sum_fix` double SUM NULL,
  `bid_price_sum` double SUM NULL,
  `req_num` double SUM NULL,
  `prctr` double SUM NULL,
  `pclick` double SUM NULL,
  `req_ad_num` bigint(20) SUM NULL
) ENGINE=OLAP
AGGREGATE KEY(`campaign_id`, `asset_id`, `ssp`, `ad_id`, `crid`, `domain`, `ip_country`, `account_id`, `account_name`, `company_id`, `company_name`, `am_name`, `platform_type`, `account_category`, `company_region`, `charge_type`, `target_cpa`, `d_s`, `h_s`)
COMMENT 'OLAP'
PARTITION BY RANGE(`d_s`, `h_s`)()
DISTRIBUTED BY HASH(`campaign_id`) BUCKETS 36
PROPERTIES (
"replication_allocation" = "tag.location.default: 3",
"bloom_filter_columns" = "ssp, domain",
"is_being_synced" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);

当表达式为sum(xx)时,性能好
image.png

当表达式为sum(if(platform_type='MediaGo',conversion,cv))时,性能差了10倍以上
image.png

两个sqlexplain的结果:
性能差的:

explain select sum(if(platform_type='MediaGo',conversion,cv)) from miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly where d_s='2024-11-08' and h_s='05' and platform_type='MediaGo';
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                                                                                                                                                                                                                                     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                                                                                                                                                                                                                     |
|   OUTPUT EXPRS:                                                                                                                                                                                                                                                                                                     |
|     sum(if((platform_type = 'MediaGo'), conversion, cv))[#40]                                                                                                                                                                                                                                                       |
|   PARTITION: UNPARTITIONED                                                                                                                                                                                                                                                                                          |
|                                                                                                                                                                                                                                                                                                                     |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                                                                                                                                         |
|                                                                                                                                                                                                                                                                                                                     |
|   VRESULT SINK                                                                                                                                                                                                                                                                                                      |
|                                                                                                                                                                                                                                                                                                                     |
|   7098:VAGGREGATE (merge finalize)                                                                                                                                                                                                                                                                                  |
|   |  output: sum(partial_sum(if((platform_type = 'MediaGo'), conversion, cv))[#39])[#40]                                                                                                                                                                                                                            |
|   |  group by:                                                                                                                                                                                                                                                                                                      |
|   |  cardinality=1                                                                                                                                                                                                                                                                                                  |
|   |                                                                                                                                                                                                                                                                                                                 |
|   7095:VEXCHANGE                                                                                                                                                                                                                                                                                                    |
|      offset: 0                                                                                                                                                                                                                                                                                                      |
|                                                                                                                                                                                                                                                                                                                     |
| PLAN FRAGMENT 1                                                                                                                                                                                                                                                                                                     |
|                                                                                                                                                                                                                                                                                                                     |
|   PARTITION: HASH_PARTITIONED: campaign_id[#0]                                                                                                                                                                                                                                                                      |
|                                                                                                                                                                                                                                                                                                                     |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                                                                                                                                         |
|                                                                                                                                                                                                                                                                                                                     |
|   STREAM DATA SINK                                                                                                                                                                                                                                                                                                  |
|     EXCHANGE ID: 7095                                                                                                                                                                                                                                                                                               |
|     UNPARTITIONED                                                                                                                                                                                                                                                                                                   |
|                                                                                                                                                                                                                                                                                                                     |
|   7092:VAGGREGATE (update serialize)                                                                                                                                                                                                                                                                                |
|   |  output: partial_sum(if(platform_type[#36] = 'MediaGo', conversion[#37], cv[#38]))[#39]                                                                                                                                                                                                                         |
|   |  group by:                                                                                                                                                                                                                                                                                                      |
|   |  cardinality=1                                                                                                                                                                                                                                                                                                  |
|   |                                                                                                                                                                                                                                                                                                                 |
|   7081:VOlapScanNode                                                                                                                                                                                                                                                                                                |
|      TABLE: default_cluster:miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly(dws_mediago_bidder_dsp_multi_measures_hourly), PREAGGREGATION: OFF. Reason: Slot(if((platform_type = 'mediago'), conversion, cv)) in sum(if((platform_type = 'MediaGo'), conversion, cv)) is neither key column nor value column. |
|      PREDICATES: platform_type[#12] = 'MediaGo' AND d_s[#17] = '2024-11-08' AND h_s[#18] = 5                                                                                                                                                                                                                        |
|      partitions=1/829 (p2024110805), tablets=36/36, tabletList=22008862,22008866,22008870 ...                                                                                                                                                                                                                       |
|      cardinality=16120362, avgRowSize=0.0, numNodes=1                                                                                                                                                                                                                                                               |
|      pushAggOp=NONE                                                                                                                                                                                                                                                                                                 |
|      projections: platform_type[#12], conversion[#24], cv[#25]                                                                                                                                                                                                                                                      |
|      project output tuple id: 1                                                                                                                                                                                                                                                                                     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

性能好的:

mysql> explain select sum(conversion),sum(cv) from miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly where d_s='2024-11-08' and h_s='05' and platform_type='MediaGo';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                                                                     |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                                                     |
|   OUTPUT EXPRS:                                                                                                                                     |
|     sum(conversion)[#40]                                                                                                                            |
|     sum(cv)[#41]                                                                                                                                    |
|   PARTITION: UNPARTITIONED                                                                                                                          |
|                                                                                                                                                     |
|   HAS_COLO_PLAN_NODE: false                                                                                                                         |
|                                                                                                                                                     |
|   VRESULT SINK                                                                                                                                      |
|                                                                                                                                                     |
|   7084:VAGGREGATE (merge finalize)                                                                                                                  |
|   |  output: sum(partial_sum(conversion)[#38])[#40], sum(partial_sum(cv)[#39])[#41]                                                                 |
|   |  group by:                                                                                                                                      |
|   |  cardinality=1                                                                                                                                  |
|   |                                                                                                                                                 |
|   7081:VEXCHANGE                                                                                                                                    |
|      offset: 0                                                                                                                                      |
|                                                                                                                                                     |
| PLAN FRAGMENT 1                                                                                                                                     |
|                                                                                                                                                     |
|   PARTITION: HASH_PARTITIONED: campaign_id[#0]                                                                                                      |
|                                                                                                                                                     |
|   HAS_COLO_PLAN_NODE: false                                                                                                                         |
|                                                                                                                                                     |
|   STREAM DATA SINK                                                                                                                                  |
|     EXCHANGE ID: 7081                                                                                                                               |
|     UNPARTITIONED                                                                                                                                   |
|                                                                                                                                                     |
|   7078:VAGGREGATE (update serialize)                                                                                                                |
|   |  output: partial_sum(conversion[#36])[#38], partial_sum(cv[#37])[#39]                                                                           |
|   |  group by:                                                                                                                                      |
|   |  cardinality=1                                                                                                                                  |
|   |                                                                                                                                                 |
|   7067:VOlapScanNode                                                                                                                                |
|      TABLE: default_cluster:miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly(dws_mediago_bidder_dsp_multi_measures_hourly), PREAGGREGATION: ON |
|      PREDICATES: platform_type[#12] = 'MediaGo' AND d_s[#17] = '2024-11-08' AND h_s[#18] = 5                                                        |
|      partitions=1/829 (p2024110805), tablets=36/36, tabletList=22008862,22008866,22008870 ...                                                       |
|      cardinality=16120362, avgRowSize=0.0, numNodes=1                                                                                               |
|      pushAggOp=NONE                                                                                                                                 |
|      projections: conversion[#24], cv[#25]                                                                                                          |
|      project output tuple id: 1                                                                                                                     |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+

主要的差异是 VOlapScanNode PREAGGREGATION,性能好的sql是开启的,性能查的是off。

问题:
1.不太明白为什么性能查的sql,无法开启PREAGGREGATION?
2.性能的差异是不是就是PREAGGREGATION的差异导致的?
3.有什么优化的办法?

1 Answers

方便加我微信(hhj_0530)私发profile看一下吗