Oracle 18C新特性:内联外部表
介绍
Oracle 数据库18c 允许您使用 SELECT 语句中定义的内联外部表访问平面文件中的数据。
什么是内联外部表?
内联外部表将外部表的定义直接放在SQL语句中,不需要额外在数据字典中创建外部表对象。当使用内联外部表的时候,与创建外部表(CREATE TABLE)相同的语法可以用在一个select语句上。可以在一个字句的 from 关键字后面指定内联外部表。含有内联外部表的查询也可以包含常规的表的关联(joins),聚合(aggregation)等等。
这比 Oracle 数据库12c 版本2(12.2)中引入的查询中重写外部表参数的能力更进一步。
实践证明
1、生成测试数据
为了演示内联外部表,我们需要一些平面文件中的数据。 下面的代码生成4个 CSV 文件,每个文件有1000行。
[:/home/oracle]$more test.sql <<=============创建脚本,生成数据 ALTER SESSION SET CONTAINER = pdb1; set echo off set heading off set term off SET MARKUP CSV ON QUOTE ON SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 0 SPOOL /tmp/gbr1.txt SELECT ‘GBR‘, object_id, owner, object_name FROM dba_objects WHERE object_id <= 2000 AND rownum <= 1000; SPOOL OFF SPOOL /tmp/gbr2.txt SELECT ‘GBR‘, object_id, owner, object_name FROM dba_objects WHERE object_id BETWEEN 2000 AND 3999 AND rownum <= 1000; SPOOL OFF SPOOL /tmp/ire1.txt SELECT ‘IRE‘, object_id, owner, object_name FROM dba_objects WHERE object_id <= 2000 AND rownum <= 1000; SPOOL OFF SPOOL /tmp/ire2.txt SELECT ‘IRE‘, object_id, owner, object_name FROM dba_objects WHERE object_id BETWEEN 2000 AND 3999 AND rownum <= 1000; SPOOL OFF SET MARKUP CSV OFF SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 14 [:/home/oracle]$ss SQL> @test <=============执行脚本 Session altered. [:/home/oracle]$ll /tmp/*.txt <=============查看生成的数据 -rw-r--r-- 1 oracle oinstall 33529 3月 5 11:40 /tmp/gbr1.txt -rw-r--r-- 1 oracle oinstall 38554 3月 5 11:40 /tmp/gbr2.txt -rw-r--r-- 1 oracle oinstall 33529 3月 5 11:40 /tmp/ire1.txt -rw-r--r-- 1 oracle oinstall 38554 3月 5 11:40 /tmp/ire2.txt
创建两个目录对象来访问这些文件。 在这种情况下,两者都指向同一个目录。
[:/home/oracle]$ss SQL> ALTER SESSION SET CONTAINER = pdb1; Session altered. SQL> CREATE OR REPLACE DIRECTORY tmp_dir1 AS ‘/tmp/‘; Directory created. SQL> GRANT READ, WRITE ON DIRECTORY tmp_dir1 TO test; Grant succeeded. CREATE OR REPLACE DIRECTORY tmp_dir2 AS ‘/tmp/‘; Directory created. SQL> GRANT READ, WRITE ON DIRECTORY tmp_dir2 TO test; Grant succeeded.
2、内联外部表
内联外部表允许使用 EXTERNAL 子句将外部表定义放在 SQL 语句的 FROM 子句中,因此不需要显式创建外部表。
SELECT country_code, COUNT(*) AS amount FROM EXTERNAL ( ( country_code VARCHAR2(3), object_id NUMBER, owner VARCHAR2(128), object_name VARCHAR2(128) ) TYPE oracle_loader DEFAULT DIRECTORY tmp_dir2 ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE BADFILE tmp_dir2 LOGFILE tmp_dir2:‘inline_ext_tab_%a_%p.log‘ DISCARDFILE tmp_dir2 FIELDS CSV WITH EMBEDDED TERMINATED BY ‘,‘ OPTIONALLY ENCLOSED BY ‘"‘ MISSING FIELD VALUES ARE NULL ( country_code, object_id, owner, object_name ) ) LOCATION (‘gbr1.txt‘, ‘gbr2.txt‘) REJECT LIMIT UNLIMITED ) inline_ext_tab GROUP BY country_code ORDER BY 1; COUNTRY_C AMOUNT --------- ---------- GBR 2000 IRE 2000 SQL>
在下面的示例中,我们使用不同的目录对象,并在 LOCATION 子句中指定不同的文件列表。 毫不奇怪,这给了我们一个不同的结果。
SELECT country_code, COUNT(*) AS amount FROM EXTERNAL ( ( country_code VARCHAR2(3), object_id NUMBER, owner VARCHAR2(128), object_name VARCHAR2(128) ) TYPE oracle_loader DEFAULT DIRECTORY tmp_dir2 ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE BADFILE tmp_dir2 LOGFILE tmp_dir2:‘inline_ext_tab_%a_%p.log‘ DISCARDFILE tmp_dir2 FIELDS CSV WITH EMBEDDED TERMINATED BY ‘,‘ OPTIONALLY ENCLOSED BY ‘"‘ MISSING FIELD VALUES ARE NULL ( country_code, object_id, owner, object_name ) ) LOCATION (‘gbr1.txt‘, ‘gbr2.txt‘) REJECT LIMIT UNLIMITED ) inline_ext_tab GROUP BY country_code ORDER BY 1; COUNTRY_C AMOUNT --------- ---------- GBR 2000
内联外部表定义有点难看,因此,如果计划将其连接到其他表,可能更喜欢将其放入 WITH 子句中。
WITH inline_ext_tab AS ( SELECT * FROM EXTERNAL ( ( country_code VARCHAR2(3), object_id NUMBER, owner VARCHAR2(128), object_name VARCHAR2(128) ) TYPE oracle_loader DEFAULT DIRECTORY tmp_dir2 ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE BADFILE tmp_dir2 LOGFILE tmp_dir2:‘inline_ext_tab_%a_%p.log‘ DISCARDFILE tmp_dir2 FIELDS CSV WITH EMBEDDED TERMINATED BY ‘,‘ OPTIONALLY ENCLOSED BY ‘"‘ MISSING FIELD VALUES ARE NULL ( country_code, object_id, owner, object_name ) ) LOCATION (‘gbr1.txt‘, ‘gbr2.txt‘) REJECT LIMIT UNLIMITED ) ) SELECT country_code, COUNT(*) AS amount FROM inline_ext_tab GROUP BY country_code ORDER BY 1; COUNTRY_C AMOUNT --------- ---------- GBR 2000
3、对安全的影响
无论是否使用外部表,对目录对象的访问都应该受到严格控制。
4、其他
1) 正如文档中指出的,这个功能不支持分区的外部表,但是这是不相关的,因为您可以完全控制所访问的文件,所以不需要考虑分区
2)它导致了非常难看的 SQL
3)在没有特权创建元数据对象(例如只读数据库)的情况下,它可能很有用
4)如上所述,需要考虑与此功能相关的安全因素