SUBSTRING_INDEX 函数不兼容问题(或为BUG)

Viewed 15

MySQL 与 Doris 环境中返回结果不一致问题

SQL 语句

SELECT t1.no
      ,t1.sub_str AS '分隔符字符串'
      ,t1.str AS '需要截取的字符串'
      ,substring_index(t1.str, t1.sub_str, -1) AS 'substring_index错误的返回结果'
      ,t2.rst2 AS '正确的返回结果'
FROM (
       SELECT 1 AS no, 'BBB' AS sub_str, 'AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06' AS str UNION ALL
       SELECT 2 AS no, 'ccc' AS sub_str, 'zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06' AS str UNION ALL
       SELECT 3 AS no, 'DDD' AS sub_str, 'AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06' AS str UNION ALL
       SELECT 4 AS no, 'DDD' AS sub_str, 'sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06' AS str UNION ALL
       SELECT 5 AS no, 'eee' AS sub_str, 'cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06' AS str UNION ALL
       SELECT 6 AS no, 'A_01' AS sub_str, 'AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06' AS str 
     ) t1
LEFT JOIN (
       -- 手动指定查询
       SELECT 1 AS no, 'BBB' AS sub_str,  substring_index('AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06', 'BBB', -1) AS rst2 UNION ALL
       SELECT 2 AS no, 'ccc' AS sub_str,  substring_index('zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06', 'ccc', -1) AS rst2 UNION ALL
       SELECT 3 AS no, 'DDD' AS sub_str,  substring_index('AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06', 'DDD', -1) AS rst2 UNION ALL
       SELECT 4 AS no, 'DDD' AS sub_str,  substring_index('sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06', 'DDD', -1) AS rst2 UNION ALL
       SELECT 5 AS no, 'eee' AS sub_str,  substring_index('cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06', 'eee', -1) AS rst2 UNION ALL
       SELECT 6 AS no, 'A_01' AS sub_str, substring_index('AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06', 'A_01', -1) AS rst2 
     ) t2
     ON t1.no = t2.no AND t1.sub_str = t2.sub_str
ORDER BY t1.no
;

MySQL 中执行返回结果正确

image.png

Doris 中执行返回结果错误

image.png

原因分析

DOris 官网: substring_index 函数 https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/scalar-functions/string-functions/substring-index?_highlight=substring_index

image.png

分析

针对与 SUBSTRING_INDEX(<content>, <delimiter>, <field>) 函数中 delimiter 分隔符不能是动态的问题, 理论上在使用的过程中, 这里的分隔符可以是动态的吧, 如果不是这样, 不兼容标准的SQL语法(如: MySQL 环境中查询的结果与Doris中的返回结果就是不一致的)

1 Answers

跟进中,有进展会更新回帖。