聚合表的bitmap列如何创建物化视图来加速查询

Viewed 21

版本:Server version: 5.7.99 Doris version doris-2.1.5-rc02-d5a02e095d

需求:广告业务需要对每天的几十亿访问量按照国家和操作系统算uv.pv
建表:

CREATE TABLE IF NOT EXISTS access_pre_agg
(
    `date` DATE NOT NULL COMMENT "活跃日历日期",
    `channel` VARCHAR(20) COMMENT "adx/ams",
    `country` VARCHAR(20) COMMENT "设备所属国家",
    `os` VARCHAR(40) COMMENT "操作系统",
    `dev_suffix4` VARCHAR(10) COMMENT "设备ID的后4位字符,用于分桶",
    `daily_active` INT SUM DEFAULT "0" COMMENT "每日活跃次数",
    `dev_hll` hll HLL_UNION COMMENT "设备hll用于模糊聚合",
    `dev_bitmap` bitmap BITMAP_UNION COMMENT "设备hll用于精确去重"
)
AGGREGATE KEY(`date`, `channel`, `country`, `os`,dev_suffix4)
PARTITION BY RANGE(`date`) ()
DISTRIBUTED BY HASH(`dev_suffix4`) BUCKETS 100
......

使用 stream load 做数据导入,
dev_hll=HLL_HASH(...)
dev_bitmap=bitmap_hash64(...)

mysql> create materialized view mv_dev_agg as
    -> select `date`,`channel`,`country`,`os`,sum(daily_active) pv, count(distinct dev_bitmap) from access_pre_agg group by `date`,`channel`,`country`,`os`;
ERROR 1105 (HY000): errCode = 2, detailMessage = Aggregate function require single slot argument, invalid argument is: CASE WHEN `dev_bitmap` IS NULL THEN 0 ELSE 1 END

如上,创建 bitmap的预聚合时报错,

请教:bitmap是否支持这种用法,若支持应该怎么使用?

目的是加速:

select date,count(distinct dev_bitmap) uv from access_pre_agg group by date order by date;

查询

1 Answers

这样试试:count(distinct dev_bitmap) =》 bitmap_union(dev_bitmap)

create materialized view mv_dev_agg as select date,channel,country,os,sum(daily_active) pv, bitmap_union(dev_bitmap)from access_pre_agg group by date,channel,country,os;

explain select date,count(distinct dev_bitmap) uv from access_pre_agg group by date order by date;
查看是否走了物化视图