Doris版本2.1.5
BE服务器配置:16核32GB
我们使用了workload group机制,且开启了CPU和内存硬限制
workload group默认组(normal)
memory_limit: 34%
cpu_hard_limit: 20%
workload group定时任务组(ad_group)
memory_limit: 45%
cpu_hard_limit: 65%
说明:定时任务使用的都是ad_group组下的资源查询;个人账号查询都用normal组下的资源查询
现象:定时任务SQL正常情况下都是4/5s就出结果,当个人账号有大查询SQL时,如果有出现超时的任务,必然是包含count(distinct)的定时任务
查询SQL如下
# 定时任务SQL包含count(distinct)(5分钟执行一次,超时时间30s)
select count(1) as imp, count(DISTINCT mac) as uv
from ad_monitor.huan_ad_monitor_impression_log
where dt = '2024-08-07' and platform in ('huanunion_direct','huanunion_adx')
# 个人大查询SQL
SELECT SUBSTRING(m1.city_code, 5, 4) as cityCode, m1.mac_md5, m1.isp_mark FROM ( (SELECT city_code, mac_md5,isp_mark FROM hive.data_analysis.dim_newtv_gitv_mac ) as m1 JOIN (SELECT a.mac FROM ( SELECT t1.provinceName, t1.mac, t2.mac_md5, t1.pv FROM ( (SELECT provinceName, mac, COUNT(1) as pv FROM hive.ad_monitor.huan_ad_monitor_impression_log_orc WHERE dt BETWEEN '2024-07-15' AND '2024-08-06' AND hpid = '6639' AND platform = 'huanunion_adx' AND provinceName IN ('河北省','安徽省','四川省','广东省','湖北省','江苏省','江西省','重庆市','贵州省','浙江省','山东省','广西壮族自治区') GROUP BY provinceName,mac ) as t1 LEFT JOIN (SELECT mac_md5 FROM hive.huan_union.huan_union_dmp_log_orc WHERE dmp_deal_id = '1148936' AND is_ta = '1' GROUP BY mac_md5) as t2 ON t1.mac = t2.mac_md5 ) ) as a WHERE a.mac_md5 is null AND pv = 1) as m2 ON m1.mac_md5 = m2.mac )
现象:定时任务SQL正常情况下都是4/5s就出结果,当个人账号有大查询SQL时,如果有出现超时的任务,必然是包含count(distinct)的定时任务
资源监控:下图高点为个人账号有大查询SQL