版本:Server version: 5.7.99 Doris version doris-2.1.5-rc02-d5a02e095d
需求:广告业务需要对每天的几十亿访问量按照国家和操作系统算uv.pv
建表:
CREATE TABLE IF NOT EXISTS access_pre_agg
(
`date` DATE NOT NULL COMMENT "活跃日历日期",
`channel` VARCHAR(20) COMMENT "adx/ams",
`country` VARCHAR(20) COMMENT "设备所属国家",
`os` VARCHAR(40) COMMENT "操作系统",
`dev_suffix4` VARCHAR(10) COMMENT "设备ID的后4位字符,用于分桶",
`daily_active` INT SUM DEFAULT "0" COMMENT "每日活跃次数",
`dev_hll` hll HLL_UNION COMMENT "设备hll用于模糊聚合",
`dev_bitmap` bitmap BITMAP_UNION COMMENT "设备hll用于精确去重"
)
AGGREGATE KEY(`date`, `channel`, `country`, `os`,dev_suffix4)
PARTITION BY RANGE(`date`) ()
DISTRIBUTED BY HASH(`dev_suffix4`) BUCKETS 100
......
使用 stream load 做数据导入,
dev_hll=HLL_HASH(...)
dev_bitmap=bitmap_hash64(...)
mysql> create materialized view mv_dev_agg as
-> select `date`,`channel`,`country`,`os`,sum(daily_active) pv, count(distinct dev_bitmap) from access_pre_agg group by `date`,`channel`,`country`,`os`;
ERROR 1105 (HY000): errCode = 2, detailMessage = Aggregate function require single slot argument, invalid argument is: CASE WHEN `dev_bitmap` IS NULL THEN 0 ELSE 1 END
如上,创建 bitmap的预聚合时报错,
请教:bitmap是否支持这种用法,若支持应该怎么使用?
目的是加速:
select date,count(distinct dev_bitmap) uv from access_pre_agg group by date order by date;
查询