版本:1.2.7
表模型duplicate,里面有个distinct_id字段,字段类型为varchar。这个表按天分区,每天数据量1.5亿左右。执行如下SQL耗时3秒(where条件过滤后的数据是1亿条),业务要求1秒之内返回结果,请问有什么优化的方法吗?
select
pt,
COUNT(case when event = 'page_load' then 1 else null end),
COUNT(distinct case when event = 'page_load' then distinct_id else null end)
from
xxx
where
pt between '2024-04-21' and '2024-05-20'
and event = 'page_load'
group by
pt