【已记录】doris2.0.5设置变量查询结果不一致

Viewed 149

doris2.0.5在查询hive并写入doris时在指定变量或者直接写死变量值时,查询、写入结果不一致
查询hive写入doris语句:

drop table  ec_ads_bi.Dim_Card;
CREATE TABLE IF NOT EXISTS ec_ads_bi.Dim_Card(
   PAN	                    varchar(32)      comment	    '卡号'
  ,CardLevel	            varchar(8)       comment	    '卡等级'
  ,CardProduct	            varchar(8)       comment	    '卡产品'
  ,AmexCardIndicator        varchar(8)       comment	    '是否美运卡'
  ,AccountAttribute         varchar(8)       comment	    '账户属性描述'
  ,BINNumber	            varchar(8)       comment	    '卡Bin号, TODO'
  ,CardBin		            varchar(8)	     comment        'CardBIN, TODO'
  ,AccountHolderType	    varchar(10)      comment	    '持卡人账户类型'
  ,AccountLevel             varchar(2)       comment        '账户等级'
  ,CheckStatus              varchar(2)       comment        '是否柜面核身'
  ,PrimaryCardIndicator     varchar(2)       comment        '主附卡标识'
  ,PrimaryCardNumber        varchar(19)      comment        '主卡卡号'
  ,BinRange                 varchar(6)       comment        '六位BIN号, TODO'
  ,CreatedTime              datetime         COMMENT        '创建时间'
  ,UpdatedTime              datetime         COMMENT        '更新时间'
)
UNIQUE KEY(PAN)
COMMENT "Dim_Card"
DISTRIBUTED BY HASH(PAN) BUCKETS 12
PROPERTIES
(
    "replication_allocation" = "tag.location.default: 3"
);


set @DateOfFile='20231212';
set enable_insert_strict=false;
explain
insert into ec_ads_bi.Dim_Card(
   PAN	                
  ,CardLevel	        
  ,CardProduct	        
  ,AmexCardIndicator    
  ,AccountAttribute     
  ,BINNumber	        
  ,CardBin		        
  ,AccountHolderType	
  ,AccountLevel         
  ,CheckStatus          
  ,PrimaryCardIndicator 
  ,PrimaryCardNumber    
  ,BinRange        
  ,CreatedTime
  ,UpdatedTime
  )
select 
   pan	                   as PAN	                
  ,crd_lvl	               as CardLevel	        
  ,crd_prod	               as CardProduct	        
  ,amex_crd_ind	           as AmexCardIndicator    
  ,acct_attr	           as AccountAttribute     
  ,bin_nbr	               as BINNumber	        
  ,crd_bin		           as CardBin		        
  ,acct_hldr_typ           as AccountHolderType	
  ,acct_lvl                as AccountLevel         
  ,check_sts               as CheckStatus          
  ,prim_crd_ind            as PrimaryCardIndicator 
  ,prim_crd_nbr            as PrimaryCardNumber    
  ,bin_range               as BinRange    
  ,current_timestamp()     as  CreatedTime
  ,current_timestamp()     as  UpdatedTime  
from hive.ec_dim.dim_card
where upd_dtm >= str_to_date(@DateOfFile,'yyyyMMdd');

查询Doris结果写入doris,也发现这个现象:
image.png

查询和插入:
image.png

使用变量
---------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| PAN[#17] |
| CardLevel[#18] |
| CardProduct[#19] |
| AmexCardIndicator[#20] |
| AccountAttribute[#21] |
| BINNumber[#22] |
| CardBin[#23] |
| AccountHolderType[#24] |
| AccountLevel[#25] |
| CheckStatus[#26] |
| PrimaryCardIndicator[#27] |
| PrimaryCardNumber[#28] |
| BinRange[#29] |
| CreatedTime[#30] |
| UpdatedTime[#31] |
| DORIS_DELETE_SIGN[#32] |
| DORIS_VERSION_COL[#33] |
| PARTITION: HASH_PARTITIONED: PAN[#0] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| OLAP TABLE SINK |
| TUPLE ID: 2 |
| RANDOM |
| |
| 263:VOlapScanNode |
| TABLE: default_cluster:ec_ads_bi.Dim_Card(Dim_Card), PREAGGREGATION: OFF. Reason: DORIS_DELETE_SIGN is used as conjuncts. |
| PREDICATES: UpdatedTime[#14] >= CAST(str_to_date('20231212', 'yyyyMMdd') AS DATETIMEV2(0)) AND DORIS_DELETE_SIGN[#15] = 0 |
| partitions=1/1 (Dim_Card), tablets=12/12, tabletList=3506800,3506804,3506808 ... |
| cardinality=7274, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| projections: PAN[#0], CardLevel[#1], CardProduct[#2], AmexCardIndicator[#3], AccountAttribute[#4], BINNumber[#5], CardBin[#6], AccountHolderType[#7], AccountLevel[#8], CheckStatus[#9], PrimaryCardIndicator[#10], PrimaryCardNumber[#11], BinRange[#12], CreatedTime[#13], UpdatedTime[#14], 0, 0 |
| project output tuple id: 1 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
不用变量
---------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| PAN[#17] |
| CardLevel[#18] |
| CardProduct[#19] |
| AmexCardIndicator[#20] |
| AccountAttribute[#21] |
| BINNumber[#22] |
| CardBin[#23] |
| AccountHolderType[#24] |
| AccountLevel[#25] |
| CheckStatus[#26] |
| PrimaryCardIndicator[#27] |
| PrimaryCardNumber[#28] |
| BinRange[#29] |
| CreatedTime[#30] |
| UpdatedTime[#31] |
| DORIS_DELETE_SIGN[#32] |
| DORIS_VERSION_COL[#33] |
| PARTITION: HASH_PARTITIONED: PAN[#0] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| OLAP TABLE SINK |
| TUPLE ID: 2 |
| RANDOM |
| |
| 254:VOlapScanNode |
| TABLE: default_cluster:ec_ads_bi.Dim_Card(Dim_Card), PREAGGREGATION: OFF. Reason: DORIS_DELETE_SIGN is used as conjuncts. |
| PREDICATES: UpdatedTime[#14] >= CAST(str_to_date('20231212', 'yyyyMMdd') AS DATETIMEV2(0)) AND DORIS_DELETE_SIGN[#15] = 0 |
| partitions=1/1 (Dim_Card), tablets=12/12, tabletList=3506800,3506804,3506808 ... |
| cardinality=7274, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| projections: PAN[#0], CardLevel[#1], CardProduct[#2], AmexCardIndicator[#3], AccountAttribute[#4], BINNumber[#5], CardBin[#6], AccountHolderType[#7], AccountLevel[#8], CheckStatus[#9], PrimaryCardIndicator[#10], PrimaryCardNumber[#11], BinRange[#12], CreatedTime[#13], UpdatedTime[#14], 0, 0 |
| project output tuple id: 1 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

4 Answers

可以 explain 一下 insert 语句看下结果吗

使用变量时,执行计划:

Explain String(Old Planner)                                                             |
----------------------------------------------------------------------------------------|
PLAN FRAGMENT 0                                                                         |
  OUTPUT EXPRS:                                                                         |
    `pan`                                                                               |
    `crd_lvl`                                                                           |
    `crd_prod`                                                                          |
    `amex_crd_ind`                                                                      |
    `acct_attr`                                                                         |
    `bin_nbr`                                                                           |
    `crd_bin`                                                                           |
    `acct_hldr_typ`                                                                     |
    `acct_lvl`                                                                          |
    `check_sts`                                                                         |
    `prim_crd_ind`                                                                      |
    `prim_crd_nbr`                                                                      |
    `bin_range`                                                                         |
    '2024-03-15 15:36:57'                                                               |
    '2024-03-15 15:36:57'                                                               |
    0                                                                                   |
    0                                                                                   |
  PARTITION: RANDOM                                                                     |
                                                                                        |
  HAS_COLO_PLAN_NODE: false                                                             |
                                                                                        |
  OLAP TABLE SINK                                                                       |
    TUPLE ID: 0                                                                         |
    RANDOM                                                                              |
                                                                                        |
  0:VHIVE_SCAN_NODE                                                                     |
     table: dim_card                                                                    |
     predicates: `upd_dtm` >= CAST(str_to_date('20231212', 'yyyyMMdd') AS DATETIMEV2(6))|
     inputSplitNum=1, totalFileSize=830238, scanRanges=1                                |
     partition=1/1                                                                      |
     numNodes=5                                                                         |
     pushdown agg=NONE                                                                  |
     limit: 2000                                                                        |