【已解决】2.1版本,物化视图使用 hll_union 导致routine load 报错

Viewed 132

创建一个聚合表,包含HLL列用于算大概的排重数,而后从kafka中通过 routine load 方式导入数据。在没有创建物化视图时,一切工作正常,查询也正常。用物化视图进行查询加速后,routine load报错。


但是创建物化视图加速 hll_union_agg后,routine load 报错

物化视图中是否支持:hll_union,如果支持,我这个应该怎么修复呢。

PS. 单分区按照设备id为key做聚合表,预计数据会有20亿行数据。可以使用hll_hash(设备id)来做大概的排重数统计吧

4 Answers

2.1.3 已fix :https://github.com/apache/doris/pull/33373

升级到2.1.1完成后,问题并没有解决

保留 conf/be.conf后,直接用2.1.1的文件进行启动。确认版本升级完成

show backends;
+-----------+---------------+---------------+--------+----------+----------+--------------------+---------------------+---------------------+-------+----------------------+-----------+------------------+--------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+----------+
| BackendId | Host          | HeartbeatPort | BePort | HttpPort | BrpcPort | ArrowFlightSqlPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | TabletNum | DataUsedCapacity | TrashUsedCapcacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | RemoteUsedCapacity | Tag                      | ErrMsg | Version                     | Status                                                                                                                        | HeartbeatFailureCounter | NodeRole |
+-----------+---------------+---------------+--------+----------+----------+--------------------+---------------------+---------------------+-------+----------------------+-----------+------------------+--------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+----------+
| 10236     | 172.31.21.21  | 9050          | 9060   | 8040     | 8060     | -1                 | 2024-04-03 13:49:15 | 2024-04-05 08:48:49 | true  | false                | 422       | 341.756 GB       | 0.000              | 158.363 GB    | 549.919 GB    | 71.20 % | 71.20 %        | 322.203 GB         | {"location" : "default"} |        | doris-2.1.1-rc05-9f2520537f | {"lastSuccessReportTabletsTime":"2024-04-05 08:48:57","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix      |
| 10312     | 172.31.21.229 | 9050          | 9060   | 8040     | 8060     | -1                 | 2024-04-03 13:46:36 | 2024-04-05 08:48:49 | true  | false                | 422       | 343.931 GB       | 0.000              | 161.469 GB    | 549.919 GB    | 70.64 % | 70.64 %        | 320.873 GB         | {"location" : "default"} |        | doris-2.1.1-rc05-9f2520537f | {"lastSuccessReportTabletsTime":"2024-04-05 08:48:29","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix      |
| 10313     | 172.31.22.135 | 9050          | 9060   | 8040     | 8060     | -1                 | 2024-04-03 13:44:41 | 2024-04-05 08:48:49 | true  | false                | 422       | 344.733 GB       | 0.000              | 101.333 GB    | 549.919 GB    | 81.57 % | 81.57 %        | 320.942 GB         | {"location" : "default"} |        | doris-2.1.1-rc05-9f2520537f | {"lastSuccessReportTabletsTime":"2024-04-05 08:48:56","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix      |
+-----------+---------------+---------------+--------+----------+----------+--------------------+---------------------+---------------------+-------+----------------------+-----------+------------------+--------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+----------+

物化视图添加 hll_union前工作正常

show routine load \G
*************************** 1. row ***************************
                  Id: 45939
                Name: ld_dev_info_agg
          CreateTime: 2024-04-05 03:40:58
           PauseTime: NULL
             EndTime: NULL
              DbName: 
           TableName: dev_info_agg
        IsMultiTable: false
               State: RUNNING
      DataSourceType: KAFKA
      CurrentTaskNum: 2
       JobProperties: {"max_batch_rows":"300000","timezone":"Europe/London","send_batch_parallelism":"1","load_to_single_tablet":"false","current_concurrent_number":"2","delete":"*","partial_columns":"false","merge_type":"APPEND","exec_mem_limit":"2147483648","strict_mode":"false","jsonpaths":"","max_batch_interval":"20","max_batch_size":"209715200","fuzzy_parse":"false","partitions":"*","columnToColumnExpr":"date,os,source,country,region,city,ifa,last_ip,user_agent,finger=md5sum(`last_ip`, `user_agent`),active_cnt=1,ifa_bitmap=bitmap_hash64(`ifa`),ifa_hll=hll_hash(`ifa`),finger_bitmap=bitmap_hash64(md5sum(`last_ip`, `user_agent`)),finger_hll=hll_hash(md5sum(`last_ip`, `user_agent`))","whereExpr":"*","desired_concurrent_number":"2","precedingFilter":"*","format":"json","max_error_number":"200000","max_filter_ratio":"1.0","json_root":"","strip_outer_array":"false","num_as_string":"false"}
DataSourceProperties: {"topic":"device_filter","currentKafkaPartitions":"0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15","brokerList":""}
    CustomProperties: {"kafka_default_offsets":"OFFSET_END","group.id":"sync_device_filter_dev_info_agg"}
           Statistic: {"receivedBytes":580679800634,"runningTxns":[645100,645104],"errorRows":0,"committedTaskNum":20370,"loadedRows":628151571,"loadRowsRate":33727,"abortedTaskNum":326,"errorRowsAfterResumed":0,"totalRows":628151571,"unselectedRows":0,"receivedBytesRate":31178346,"taskExecuteTimeMs":18624458}
            Progress: {"0":"18475832357","1":"19273184653","2":"17473136443","3":"18495459771","4":"19268701156","5":"17493451315","6":"18454284822","7":"19269903553","8":"15746949576","9":"15879971953","10":"15780875090","11":"15745187525","12":"15782310075","13":"15770092292","14":"15767085404","15":"15804488428"}
                 Lag: {"0":4431,"1":2911,"2":14485,"3":27425,"4":1971,"5":1536,"6":4452,"7":3273,"8":17719,"9":23550,"10":1940,"11":1449,"12":4528,"13":2703,"14":13486,"15":26357}
ReasonOfStateChanged:
        ErrorLogUrls:
            OtherMsg:
                User: root
             Comment:

添加物化视图后再次异常

MySQL []> create materialized view dev_info_agg_src_hll_view as select date, os, source, country, sum(active_cnt) as pv, HLL_UNION(ifa_hll) from  dev_info_agg group by date, os,source, country;
Query OK, 0 rows affected (0.02 sec)

MySQL []> show routine load for ld_dev_info_agg\G
*************************** 1. row ***************************
                  Id: 45939
                Name: ld_dev_info_agg
          CreateTime: 2024-04-05 03:40:58
           PauseTime: 2024-04-05 08:52:48
             EndTime: NULL
              DbName: 
           TableName: dev_info_agg
        IsMultiTable: false
               State: PAUSED
      DataSourceType: KAFKA
      CurrentTaskNum: 0
       JobProperties: {"max_batch_rows":"300000","timezone":"Europe/London","send_batch_parallelism":"1","load_to_single_tablet":"false","current_concurrent_number":"2","delete":"*","partial_columns":"false","merge_type":"APPEND","exec_mem_limit":"2147483648","strict_mode":"false","jsonpaths":"","max_batch_interval":"20","max_batch_size":"209715200","fuzzy_parse":"false","partitions":"*","columnToColumnExpr":"date,os,source,country,region,city,ifa,last_ip,user_agent,finger=md5sum(`last_ip`, `user_agent`),active_cnt=1,ifa_bitmap=bitmap_hash64(`ifa`),ifa_hll=hll_hash(`ifa`),finger_bitmap=bitmap_hash64(md5sum(`last_ip`, `user_agent`)),finger_hll=hll_hash(md5sum(`last_ip`, `user_agent`))","whereExpr":"*","desired_concurrent_number":"2","precedingFilter":"*","format":"json","max_error_number":"200000","max_filter_ratio":"1.0","json_root":"","strip_outer_array":"false","num_as_string":"false"}
DataSourceProperties: {"topic":"device_filter","currentKafkaPartitions":"0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15","brokerList":""}
    CustomProperties: {"kafka_default_offsets":"OFFSET_END","group.id":"sync_device_filter_dev_info_agg"}
           Statistic: {"receivedBytes":582687090873,"runningTxns":[645562,645508,645463],"errorRows":0,"committedTaskNum":20380,"loadedRows":630337030,"loadRowsRate":33722,"abortedTaskNum":326,"errorRowsAfterResumed":0,"totalRows":630337030,"unselectedRows":0,"receivedBytesRate":31173528,"taskExecuteTimeMs":18691727}
            Progress: {"0":"18475978854","1":"19273285904","2":"17473271842","3":"18495635849","4":"19268845258","5":"17493558560","6":"18454431821","7":"19270007949","8":"15747088463","9":"15880138486","10":"15781021129","11":"15745292671","12":"15782458746","13":"15770200136","14":"15767221166","15":"15804663038"}
                 Lag: {"0":1670,"1":44336,"2":23370,"3":1125,"4":1507,"5":45541,"6":1797,"7":44274,"8":23002,"9":1313,"10":1908,"11":44759,"12":1776,"13":46959,"14":23266,"15":1058}
ReasonOfStateChanged: ErrorReason{code=errCode = 2, msg='failed to create task: errCode = 2, detailMessage = HLL column must use hll_hash function, like mva_HLL_UNION__`ifa_hll`=hll_hash(xxx)'}
        ErrorLogUrls:
            OtherMsg:
                User: root
             Comment:
1 row in set (0.01 sec)