with关键词的sql每次得到的结果条数不一样

Viewed 43

基本信息:

  1. doris版本: 2.1.7;
  2. 内表t4是一张duplicate表;
  3. 当前doris没有任何数据写入;

问题描述

情况一: 分别统计每个临时表的结果记录数
分别定义三张临时表,tmp3的结果来自于tmp1和tmp2的join,sql表达如下:

with tmp1 as (select * from t1 ....)
with tmp2 as (select * from t2 ....)
with tmp3 as (select * from t3 join tmp1 , tmp2  ....)

使用一条sql对三张临时表的条数进行统计:

select count(*) from tmp1
union select count(*) from tmp2
union select count(*) from tmp3 

情况二: 基本情况与一保持一致,最后将tmp3 的结果写入到内表t4, sql表达如下:

with tmp1 as (select * from t1 ....)
with tmp2 as (select * from t2 ....)
with tmp3 as (select * from t3 join tmp1 , tmp2  ....)

结果保存:
insert into t4 select * from tmp3

情况三
将临时表 tmp1, tmp2, 和tmp3 变换成物理表,则不会出现数据不一致的情况。

问题和现象:

  1. 一中的统计每次返回的结果都是一样的;
  2. 二中每次插入到 t4 的结果都是不一样的;

理论上,每次执行sql t4的记录数是固定的,并且和一中的tmp3输出的条数一致。

下面是执行过程记录:
企业微信截图_17416600254871.png

听社区同学建议,开启严格模式后重试,仍然得不到正确结果:
企业微信截图_17416602907862.png

补充其他的信息:

  1. 环境变量:
    企业微信截图_17416621707665.png

  2. 当前环境已经禁用sql cache,所以情况一中不存在cache命中问题。每次的查询结果都是重新执行的。

  3. 尝试开启profile,分析看那个步骤返回的数据有问题,但是没在profile中找到有用的信息,profile如下:

Summary:
      -  Profile  ID:  5b72d0b79a4f4353-93a6459353c46546
      -  Task  Type:  LOAD
      -  Start  Time:  2025-03-11  10:54:13
      -  End  Time:  2025-03-11  10:54:29
      -  Total:  16s273ms
      -  Task  State:  OK
      -  User:  root
      -  Default  Catalog:  internal
      -  Default  Db:  tmp
      -  Sql  Statement:  with  tmp_rawdata_drp_disnotice  AS  
(SELECT
a1.bill_no  as  tz_bill_no,
a1.qty  as  tzqty,
b1.bill_no  AS  dd_bill_no,
b1.qty  AS  ddqty,
c1.bill_no  AS  ph_bill_no,
c1.qty  AS  phqty,
a1.goods_code,
a1.pur_warehouse_code,
a1.sale_warehouse_code
FROM
	dws.dws_rawdata_drp_disnotice  AS  a1  --分销发货通知单
	LEFT  JOIN  dws.dws_rawdata_drp_disorderbill  AS  b1  --分销订单
	ON  a1.source_bill_no  =  b1.bill_no  
	AND  a1.goods_code  =  b1.goods_code  
	AND  a1.pur_warehouse_code  =  b1.pur_warehouse_code
	and  b1.status  in  ('01','02','03','04','05')
	LEFT  JOIN  dws.dws_rawdata_drp_disbu_stk_bill  AS  c1  --铺货单
	ON  a1.goods_code  =  c1.goods_code  
	AND  a1.source_bill_no  =  c1.bill_no  	
	AND  a1.pur_warehouse_code  =  c1.pur_warehouse_code
	and  c1.status  in  ('01','02','03')
where  a1.status  in  ('01','02','03','04','05','15')  ),

tmp_rawdata_drp_directsend  AS  
(SELECT
	a2.bill_no  as  zf_bill_no,
	a2.qty  as  zfqty,
	b2.bill_no  AS  ph_bill_no,
	b2.qty  AS  phqty,
	a2.goods_code,
		a2.pur_warehouse_code,
	a2.sale_warehouse_code	  
FROM
	dws.dws_rawdata_drp_directsend  AS  a2  --直发单
	LEFT  JOIN  dws.dws_rawdata_drp_disbu_stk_bill  AS  b2  --铺货单
	ON  b2.bill_no  =  a2.source_bill_no  
	AND  a2.goods_code  =  b2.goods_code
		and  a2.sale_warehouse_code=b2.sale_warehouse_code
	and  a2.pur_warehouse_code=b2.pur_warehouse_code
	AND  a2.goods_code  =  b2.goods_code
		and  b2.status  in  ('01','02','03')
where  a2.status  in  ('01','02','03','04','05','15')),

tmp_rawdata  AS  
(SELECT  A.bill_no,
	A.status_name,
	a.send_time,a.complete_time,
	b.warehouse_code  sale_warehouse_code,
	b.warehouse_name  sale_warehouse_name,
	C.warehouse_code  pur_warehouse_code,
	C.warehouse_name  pur_warehouse_name,
	C.business_nature,
	d.goods_category1,
	d.total_brand  ,
	d.goods_brand  ,
	d.goods_category2  ,
	d.goods_year  ,
	d.goods_season  ,
	d.goods_spu  ,
	d.goods_band  ,
	d.goods_size  ,
	d.goods_code  ,
	A.qty  ,
	e.tz_bill_no  ,
	e.tzqty  ,
	e.dd_bill_no  ,
	e.ddqty  ,
	e.ph_bill_no  disnotice_ph_bill_no,
	e.phqty  disnotice_phqty,
	h.zf_bill_no  ,
	h.zfqty  ,
	h.ph_bill_no  directsend_ph_bill_no,
	h.phqty    directsend_phqty,
	w.zf_bill_no  order_zf_bill_no,
    w.zfqty  order_zfqty,
	a.bill_date
FROM
	dws.dws_rawdata_drp_disbill  AS  A  --分销发货单
	LEFT  JOIN  dim.dim_warehouse_info  AS  b  ON  A.sale_warehouse_code  =  b.warehouse_code  --仓库档案发货仓
    LEFT  JOIN  dim.dim_warehouse_info  AS  C  ON  A.pur_warehouse_code  =  C.warehouse_code  --仓库档案收货仓
	LEFT  JOIN  dim.dim_goods_info  AS  d  ON  A.goods_code  =  d.goods_code  --商品档案
	LEFT  JOIN  tmp_rawdata_drp_disnotice  AS  e  ON  e.tz_bill_no  =  A.source_bill_no  
	AND  e.goods_code  =  A.goods_code  
	and  a.sale_warehouse_code=e.sale_warehouse_code
    and  a.pur_warehouse_code=e.pur_warehouse_code--分销发货通知单
	LEFT  JOIN  tmp_rawdata_drp_directsend  AS  h  ON  h.zf_bill_no  =  A.source_bill_no  
	AND  h.goods_code  =  A.goods_code  
	and  a.sale_warehouse_code=h.sale_warehouse_code
    and  a.pur_warehouse_code=h.pur_warehouse_code--直发单来源单号
	LEFT  JOIN  tmp_rawdata_drp_directsend  AS  w  ON  w.zf_bill_no  =  A.source_order_bill_no  
	AND  w.goods_code  =  A.goods_code  
	and  a.sale_warehouse_code=w.sale_warehouse_code
    and  a.pur_warehouse_code=w.pur_warehouse_code--直发单来源订单号
WHERE  a.status  in  ('01','02','05','07','08'))


INSERT  INTO    ads.ads_rpt_distribution_variance_issue(bill_no,status_name,send_time,complete_time,sale_warehouse_code,sale_warehouse_name,pur_warehouse_code,pur_warehouse_name,business_nature,goods_category1,total_brand,goods_brand,goods_category2,goods_year,goods_season,goods_spu,goods_band,goods_size,goods_code,qty,tz_bill_no,tzqty,dd_bill_no,ddqty,disnotice_ph_bill_no,disnotice_phqty,zf_bill_no,zfqty,directsend_ph_bill_no,directsend_phqty,order_zf_bill_no,order_zfqty,bill_date)
SELECT  bill_no,status_name,send_time,complete_time,sale_warehouse_code,sale_warehouse_name,pur_warehouse_code,pur_warehouse_name,business_nature,goods_category1,total_brand,goods_brand,goods_category2,goods_year,goods_season,goods_spu,goods_band,goods_size,goods_code,qty,tz_bill_no,tzqty,dd_bill_no,ddqty,disnotice_ph_bill_no,disnotice_phqty,zf_bill_no,zfqty,directsend_ph_bill_no,directsend_phqty,order_zf_bill_no,order_zfqty,bill_date  FROM  tmp_rawdata
Execution  Summary:
      -  Parse  SQL  Time:  2ms
      -  Nereids  Analysis  Time:  N/A
      -  Nereids  Rewrite  Time:  N/A
      -  Nereids  Optimize  Time:  N/A
      -  Nereids  Translate  Time:  N/A
      -  Workload  Group:  offline_etl
      -  Analysis  Time:  N/A
      -  Plan  Time:  N/A
          -  JoinReorder  Time:  N/A
          -  CreateSingleNode  Time:  N/A
          -  QueryDistributed  Time:  N/A
          -  Init  Scan  Node  Time:  N/A
          -  Finalize  Scan  Node  Time:  N/A
              -  Get  Splits  Time:  N/A
                  -  Get  Partitions  Time:  N/A
                  -  Get  Partition  Files  Time:  N/A
              -  Create  Scan  Range  Time:  N/A
      -  Schedule  Time:  N/A
          -  Fragment  Assign  Time:  N/A
          -  Fragment  Serialize  Time:  N/A
          -  Fragment  RPC  Phase1  Time:  N/A
          -  Fragment  RPC  Phase2  Time:  N/A
          -  Fragment  Compressed  Size:  0.00  
          -  Fragment  RPC  Count:  0
      -  Schedule  Time  Of  BE:  {}
      -  Wait  and  Fetch  Result  Time:  N/A
          -  Fetch  Result  Time:  0ms
          -  Write  Result  Time:  0ms
      -  Doris  Version:  doris-2.1.7-rc03-443e87e203
      -  Is  Nereids:  Yes
      -  Is  Pipeline:  Yes
      -  Is  Cached:  No
      -  Total  Instances  Num:  0
      -  Instances  Num  Per  BE:  
      -  Parallel  Fragment  Exec  Instance  Num:  8
      -  Trace  ID:  
      -  Transaction  Commit  Time:  N/A
      -  Executed  By  Frontend:  N/A
      -  Nereids  GarbageCollect  Time:  -1ms
      -  Nereids  BeFoldConst  Time:  0ms

Changed  Session  Variables:
VarName                                                  |  CurrentValue  |  DefaultValue
--------------------------------|--------------|-------------
enable_profile                                    |  true                  |  false              
variable_version                                |  0                        |  100                  
lower_case_table_names                    |  1                        |  0                      
query_timeout                                      |  9000                  |  900                  
enable_sync_runtime_filter_size  |  false                |  true                


1 Answers

大佬,这个能整理个复现的case吗?我们本地复现下看看