建表语句:
CREATE TABLE `dws_mediago_bidder_dsp_multi_measures_hourly` (
`campaign_id` int(11) NULL,
`asset_id` int(11) NULL,
`ssp` int(11) NULL,
`ad_id` int(11) NULL,
`crid` varchar(50) NULL,
`domain` varchar(2000) NULL,
`ip_country` varchar(200) NULL,
`account_id` varchar(32) NULL,
`account_name` varchar(500) NULL,
`company_id` varchar(32) NULL,
`company_name` varchar(500) NULL,
`am_name` varchar(50) NULL,
`platform_type` varchar(10) NULL,
`account_category` varchar(100) NULL,
`company_region` varchar(20) NULL,
`charge_type` varchar(20) NULL,
`target_cpa` DECIMAL(15, 9) NULL,
`d_s` date NULL,
`h_s` int(11) NULL,
`ad_count` bigint(20) SUM NULL,
`all_req_num` bigint(20) SUM NULL,
`account_gross_click_cost` double SUM NULL,
`click` bigint(20) SUM NULL,
`click_cost` double SUM NULL,
`conversion` bigint(20) SUM NULL,
`cv` bigint(20) SUM NULL,
`imp` bigint(20) SUM NULL,
`imp_cost` double SUM NULL,
`vimp` bigint(20) SUM NULL,
`mcv` bigint(20) SUM NULL,
`flr_sum_fix` double SUM NULL,
`bid_price_sum` double SUM NULL,
`req_num` double SUM NULL,
`prctr` double SUM NULL,
`pclick` double SUM NULL,
`req_ad_num` bigint(20) SUM NULL
) ENGINE=OLAP
AGGREGATE KEY(`campaign_id`, `asset_id`, `ssp`, `ad_id`, `crid`, `domain`, `ip_country`, `account_id`, `account_name`, `company_id`, `company_name`, `am_name`, `platform_type`, `account_category`, `company_region`, `charge_type`, `target_cpa`, `d_s`, `h_s`)
COMMENT 'OLAP'
PARTITION BY RANGE(`d_s`, `h_s`)()
DISTRIBUTED BY HASH(`campaign_id`) BUCKETS 36
PROPERTIES (
"replication_allocation" = "tag.location.default: 3",
"bloom_filter_columns" = "ssp, domain",
"is_being_synced" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);
当表达式为sum(xx)时,性能好
当表达式为sum(if(platform_type='MediaGo',conversion,cv))时,性能差了10倍以上
两个sqlexplain的结果:
性能差的:
explain select sum(if(platform_type='MediaGo',conversion,cv)) from miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly where d_s='2024-11-08' and h_s='05' and platform_type='MediaGo';
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| sum(if((platform_type = 'MediaGo'), conversion, cv))[#40] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| |
| 7098:VAGGREGATE (merge finalize) |
| | output: sum(partial_sum(if((platform_type = 'MediaGo'), conversion, cv))[#39])[#40] |
| | group by: |
| | cardinality=1 |
| | |
| 7095:VEXCHANGE |
| offset: 0 |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: campaign_id[#0] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 7095 |
| UNPARTITIONED |
| |
| 7092:VAGGREGATE (update serialize) |
| | output: partial_sum(if(platform_type[#36] = 'MediaGo', conversion[#37], cv[#38]))[#39] |
| | group by: |
| | cardinality=1 |
| | |
| 7081:VOlapScanNode |
| TABLE: default_cluster:miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly(dws_mediago_bidder_dsp_multi_measures_hourly), PREAGGREGATION: OFF. Reason: Slot(if((platform_type = 'mediago'), conversion, cv)) in sum(if((platform_type = 'MediaGo'), conversion, cv)) is neither key column nor value column. |
| PREDICATES: platform_type[#12] = 'MediaGo' AND d_s[#17] = '2024-11-08' AND h_s[#18] = 5 |
| partitions=1/829 (p2024110805), tablets=36/36, tabletList=22008862,22008866,22008870 ... |
| cardinality=16120362, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| projections: platform_type[#12], conversion[#24], cv[#25] |
| project output tuple id: 1 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
性能好的:
mysql> explain select sum(conversion),sum(cv) from miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly where d_s='2024-11-08' and h_s='05' and platform_type='MediaGo';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| sum(conversion)[#40] |
| sum(cv)[#41] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| |
| 7084:VAGGREGATE (merge finalize) |
| | output: sum(partial_sum(conversion)[#38])[#40], sum(partial_sum(cv)[#39])[#41] |
| | group by: |
| | cardinality=1 |
| | |
| 7081:VEXCHANGE |
| offset: 0 |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: campaign_id[#0] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 7081 |
| UNPARTITIONED |
| |
| 7078:VAGGREGATE (update serialize) |
| | output: partial_sum(conversion[#36])[#38], partial_sum(cv[#37])[#39] |
| | group by: |
| | cardinality=1 |
| | |
| 7067:VOlapScanNode |
| TABLE: default_cluster:miaoduan.dws_mediago_bidder_dsp_multi_measures_hourly(dws_mediago_bidder_dsp_multi_measures_hourly), PREAGGREGATION: ON |
| PREDICATES: platform_type[#12] = 'MediaGo' AND d_s[#17] = '2024-11-08' AND h_s[#18] = 5 |
| partitions=1/829 (p2024110805), tablets=36/36, tabletList=22008862,22008866,22008870 ... |
| cardinality=16120362, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| projections: conversion[#24], cv[#25] |
| project output tuple id: 1 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
主要的差异是 VOlapScanNode PREAGGREGATION,性能好的sql是开启的,性能查的是off。
问题:
1.不太明白为什么性能查的sql,无法开启PREAGGREGATION?
2.性能的差异是不是就是PREAGGREGATION的差异导致的?
3.有什么优化的办法?