关于Oracle full outer join 的bug问题分析及处理
full (outer) join是用来全连接两个表的语法。即希望将A表和B表关联,能够得到A表中有而B表中没有的记录,或者B表中有而A表中没有的记录。
如何判断是否有该记录,则通过on子句来关联。
下面是一个例子:
SQL> with
2 A as(select 1 a, 2 b from dual),
3 B as(select 2 a, 3 b from dual)
4 select * from A full join B
5 on A.a=B.a
6 /
A B A B
---------- ---------- ---------- ----------
1 2
2 3
了解了以上基本原理后,我们应该知道,理论上讲,A表和B表的在from子句中的顺序是没有关系的,也就是不影响结果。但是,实际上,却出现了这样的问题,下面是对这种情况的描述:
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12791 | 349K| 82 (3)| 00:00:01 |
| 1 | VIEW | | 12791 | 349K| 82 (3)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | FILTER | | | | | |
|* 4 | HASH JOIN RIGHT OUTER| | 12790 | 1124K| 41 (3)| 00:00:01 |
| 5 | TABLE ACCESS FULL | JXC_RISHARESUM | 1735 | 78075 | 7 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | JXC_ALLTRADEDAY | 12790 | 562K| 33 (0)| 00:00:01 |
|* 7 | HASH JOIN ANTI | | 1 | 76 | 41 (3)| 00:00:01 |
|* 8 | TABLE ACCESS FULL | JXC_RISHARESUM | 1 | 45 | 7 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | JXC_ALLTRADEDAY | 12790 | 387K| 33 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
从以上执行计划来看,在第四步骤,使用的是hash join rigth outer连接方式。而通过改变两表的摆放顺序,得到如下的执行计划:
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1876 | 52528 | 82 (3)| 00:00:01 |
| 1 | VIEW | | 1876 | 52528 | 82 (3)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | FILTER | | | | | |
|* 4 | HASH JOIN OUTER | | 1874 | 164K| 41 (3)| 00:00:01 |
| 5 | TABLE ACCESS FULL| JXC_RISHARESUM | 1735 | 78075 | 7 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL| JXC_ALLTRADEDAY | 12790 | 562K| 33 (0)| 00:00:01 |
|* 7 | HASH JOIN ANTI | | 2 | 152 | 41 (3)| 00:00:01 |
|* 8 | TABLE ACCESS FULL | JXC_ALLTRADEDAY | 2 | 90 | 33 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | JXC_RISHARESUM | 1735 | 53785 | 7 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
注意,此时,执行计划中的第四个步骤,变成了:hash join outer方式。这个才是我们所期望的方式。那究竟是什么导致了这个变化呢?查看他们的谓词连接逻辑:
hash join right outer的:
3 - filter(NVL("T1"."C_SHARENO","T2"."C_SHARENO")='26200703200004969020')
4 - access("T2"."D_TRADEDATE"="T1"."D_TRADEDATE"(+) AND
"T2"."D_REGDATE"="T1"."D_REGDATE"(+) AND "T2"."C_FUNDCODE"="T1"."C_FUNDCODE"(+) AND
"T2"."C_FUNDACCO"="T1"."C_FUNDACCO"(+))
7 - access("T2"."C_FUNDACCO"="T1"."C_FUNDACCO" AND
"T2"."C_FUNDCODE"="T1"."C_FUNDCODE" AND "T2"."D_REGDATE"="T1"."D_REGDATE" AND
"T2"."D_TRADEDATE"="T1"."D_TRADEDATE")
8 - filter(NVL("T1"."C_SHARENO",NULL)='26200703200004969020')
hash join outer的:
3 - filter(NVL("T1"."C_SHARENO","T2"."C_SHARENO")='26200703200004969020')
4 - access("T2"."D_TRADEDATE"(+)="T1"."D_TRADEDATE" AND
"T2"."D_REGDATE"(+)="T1"."D_REGDATE" AND "T2"."C_FUNDCODE"(+)="T1"."C_FUNDCODE"
AND "T2"."C_FUNDACCO"(+)="T1"."C_FUNDACCO")
7 - access("T2"."C_FUNDACCO"="T1"."C_FUNDACCO" AND
"T2"."C_FUNDCODE"="T1"."C_FUNDCODE" AND "T2"."D_REGDATE"="T1"."D_REGDATE" AND
"T2"."D_TRADEDATE"="T1"."D_TRADEDATE")
8 - filter("T2"."C_SHARENO"='26200703200004969020')
还是没有发现明显的区别。但是实际却导致了结果的不同。
还原到原始的表连接顺序,然后对两表进行分析,再查看执行计划:
SQL> call dbms_stats.gather_table_stats(user, 'JXC_ALLTRADEDAY');
调用完成。
SQL> call dbms_stats.gather_table_stats(user, 'JXC_RISHARESUM');
.....
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1738 | 48664 | 82 (3)| 00:00:01 |
| 1 | VIEW | | 1738 | 48664 | 82 (3)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | FILTER | | | | | |
|* 4 | HASH JOIN OUTER | | 1735 | 191K| 41 (3)| 00:00:01 |
| 5 | TABLE ACCESS FULL| JXC_RISHARESUM | 1735 | 98895 | 7 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL| JXC_ALLTRADEDAY | 12775 | 698K| 33 (0)| 00:00:01 |
|* 7 | HASH JOIN ANTI | | 3 | 276 | 41 (3)| 00:00:01 |
|* 8 | TABLE ACCESS FULL | JXC_ALLTRADEDAY | 3 | 168 | 33 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | JXC_RISHARESUM | 1735 | 62460 | 7 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(NVL("T1"."C_SHARENO","T2"."C_SHARENO")='26200703200004969020')
4 - access("T2"."D_TRADEDATE"(+)="T1"."D_TRADEDATE" AND
"T2"."D_REGDATE"(+)="T1"."D_REGDATE" AND "T2"."C_FUNDCODE"(+)="T1"."C_FUNDCODE"
AND "T2"."C_FUNDACCO"(+)="T1"."C_FUNDACCO")
7 - access("T2"."C_FUNDACCO"="T1"."C_FUNDACCO" AND
"T2"."C_FUNDCODE"="T1"."C_FUNDCODE" AND "T2"."D_REGDATE"="T1"."D_REGDATE" AND
"T2"."D_TRADEDATE"="T1"."D_TRADEDATE")
8 - filter("T2"."C_SHARENO"='26200703200004969020')
可发��,这时原来连接方式的错误执行计划被修正了,改为hash join outer连接。