Oracle 18C新特性:内联外部表

介绍

Oracle 数据库18c 允许您使用 SELECT 语句中定义的内联外部表访问平面文件中的数据。

什么是内联外部表?

内联外部表将外部表的定义直接放在SQL语句中,不需要额外在数据字典中创建外部表对象。当使用内联外部表的时候,与创建外部表(CREATE TABLE)相同的语法可以用在一个select语句上。可以在一个字句的 from 关键字后面指定内联外部表。含有内联外部表的查询也可以包含常规的表的关联(joins),聚合(aggregation)等等。

这比 Oracle 数据库12c 版本2(12.2)中引入的查询中重写外部表参数的能力更进一步。

实践证明

1、生成测试数据

为了演示内联外部表,我们需要一些平面文件中的数据。 下面的代码生成4个 CSV 文件,每个文件有1000行。

[:/home/oracle]$more test.sql  <<=============创建脚本,生成数据
ALTER SESSION SET CONTAINER = pdb1;
set echo off
set heading off
set term off
SET MARKUP CSV ON QUOTE ON
SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 0
SPOOL /tmp/gbr1.txt
SELECT ‘GBR‘,
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id <= 2000
AND    rownum <= 1000;
SPOOL OFF

SPOOL /tmp/gbr2.txt
SELECT ‘GBR‘,
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id BETWEEN 2000 AND 3999
AND    rownum <= 1000;
SPOOL OFF

SPOOL /tmp/ire1.txt
SELECT ‘IRE‘,
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id <= 2000
AND    rownum <= 1000;
SPOOL OFF

SPOOL /tmp/ire2.txt
SELECT ‘IRE‘,
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id BETWEEN 2000 AND 3999
AND    rownum <= 1000;
SPOOL OFF
SET MARKUP CSV OFF
SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 14

[:/home/oracle]$ss

SQL> @test                                       <=============执行脚本

Session altered.

[:/home/oracle]$ll /tmp/*.txt      <=============查看生成的数据
-rw-r--r-- 1 oracle oinstall 33529 3月   5 11:40 /tmp/gbr1.txt
-rw-r--r-- 1 oracle oinstall 38554 3月   5 11:40 /tmp/gbr2.txt
-rw-r--r-- 1 oracle oinstall 33529 3月   5 11:40 /tmp/ire1.txt
-rw-r--r-- 1 oracle oinstall 38554 3月   5 11:40 /tmp/ire2.txt

创建两个目录对象来访问这些文件。 在这种情况下,两者都指向同一个目录。

[:/home/oracle]$ss

SQL> ALTER SESSION SET CONTAINER = pdb1;

Session altered.

SQL> CREATE OR REPLACE DIRECTORY tmp_dir1 AS ‘/tmp/‘;

Directory created.

SQL> GRANT READ, WRITE ON DIRECTORY tmp_dir1 TO test;

Grant succeeded.

CREATE OR REPLACE DIRECTORY tmp_dir2 AS ‘/tmp/‘;

Directory created.

SQL> GRANT READ, WRITE ON DIRECTORY tmp_dir2 TO test;

Grant succeeded.

2、内联外部表

内联外部表允许使用 EXTERNAL 子句将外部表定义放在 SQL 语句的 FROM 子句中,因此不需要显式创建外部表。

SELECT country_code, COUNT(*) AS amount
FROM   EXTERNAL (
         (
           country_code  VARCHAR2(3),
           object_id     NUMBER,
           owner         VARCHAR2(128),
           object_name   VARCHAR2(128)
         )
         TYPE oracle_loader
         DEFAULT DIRECTORY tmp_dir2
         ACCESS PARAMETERS (
           RECORDS DELIMITED BY NEWLINE
           BADFILE tmp_dir2
           LOGFILE tmp_dir2:‘inline_ext_tab_%a_%p.log‘
           DISCARDFILE tmp_dir2
           FIELDS CSV WITH EMBEDDED TERMINATED BY ‘,‘ OPTIONALLY ENCLOSED BY ‘"‘
           MISSING FIELD VALUES ARE NULL (
             country_code,
             object_id,
             owner,
             object_name 
           )
        )
        LOCATION (‘gbr1.txt‘, ‘gbr2.txt‘)
        REJECT LIMIT UNLIMITED
      ) inline_ext_tab
GROUP BY country_code
ORDER BY 1;
COUNTRY_C     AMOUNT
--------- ----------
GBR             2000
IRE             2000
SQL>

在下面的示例中,我们使用不同的目录对象,并在 LOCATION 子句中指定不同的文件列表。 毫不奇怪,这给了我们一个不同的结果。

SELECT country_code, COUNT(*) AS amount
FROM   EXTERNAL (
         (
           country_code  VARCHAR2(3),
           object_id     NUMBER,
           owner         VARCHAR2(128),
           object_name   VARCHAR2(128)
         )
         TYPE oracle_loader
         DEFAULT DIRECTORY tmp_dir2
         ACCESS PARAMETERS (
           RECORDS DELIMITED BY NEWLINE
           BADFILE tmp_dir2
           LOGFILE tmp_dir2:‘inline_ext_tab_%a_%p.log‘
           DISCARDFILE tmp_dir2
           FIELDS CSV WITH EMBEDDED TERMINATED BY ‘,‘ OPTIONALLY ENCLOSED BY ‘"‘
           MISSING FIELD VALUES ARE NULL (
             country_code,
             object_id,
             owner,
             object_name 
           )
        )
        LOCATION (‘gbr1.txt‘, ‘gbr2.txt‘)
        REJECT LIMIT UNLIMITED
      ) inline_ext_tab
GROUP BY country_code
ORDER BY 1;

COUNTRY_C     AMOUNT
--------- ----------
GBR             2000

内联外部表定义有点难看,因此,如果计划将其连接到其他表,可能更喜欢将其放入 WITH 子句中。

WITH inline_ext_tab AS (
  SELECT *
  FROM   EXTERNAL (
           (
             country_code  VARCHAR2(3),
             object_id     NUMBER,
             owner         VARCHAR2(128),
             object_name   VARCHAR2(128)
           )
           TYPE oracle_loader
           DEFAULT DIRECTORY tmp_dir2
           ACCESS PARAMETERS (
             RECORDS DELIMITED BY NEWLINE
             BADFILE tmp_dir2
             LOGFILE tmp_dir2:‘inline_ext_tab_%a_%p.log‘
             DISCARDFILE tmp_dir2
             FIELDS CSV WITH EMBEDDED TERMINATED BY ‘,‘ OPTIONALLY ENCLOSED BY ‘"‘
             MISSING FIELD VALUES ARE NULL (
               country_code,
               object_id,
               owner,
               object_name 
             )
          )
          LOCATION (‘gbr1.txt‘, ‘gbr2.txt‘)
          REJECT LIMIT UNLIMITED
        )
)
SELECT country_code, COUNT(*) AS amount
FROM   inline_ext_tab 
GROUP BY country_code
ORDER BY 1;

COUNTRY_C     AMOUNT
--------- ----------
GBR             2000

3、对安全的影响

无论是否使用外部表,对目录对象的访问都应该受到严格控制。

  4、其他

1) 正如文档中指出的,这个功能不支持分区的外部表,但是这是不相关的,因为您可以完全控制所访问的文件,所以不需要考虑分区

2)它导致了非常难看的 SQL

3)在没有特权创建元数据对象(例如只读数据库)的情况下,它可能很有用

4)如上所述,需要考虑与此功能相关的安全因素

相关推荐