Exporting a CLOB

I’m in the final phase of a clear down process to remove data that is no longer legally required from a production database. Unfortunately for me this involves writing the data directly to disc in csv format for reference later, not my preferred option for this piece of work but it pays the bills – so who am I to complain? Anyway the process has been relatively painless up to now – I’m hitting tables with clobs and large ones at that – ranging from 10,000 characters up to just over a 1,000,000 in length. How the devil do I write these to a CSV?

From the start of this piece of work the best method appeared to be to export the data in a format file that would make it easy to use as reference via an external table due to a potential requirement to have to audit from the data even though the legal retention requirement for the data had passed.
To do this I have written a three procedures and a driving table. The driving table stores the table names, sql conditions for the tables and processing information for the run, the first procedure generates the external table build command for the table and the second generates a SQL script to be used to export the data and calls the first procedure to generate the external table script and the third runs the delete of the data. The export script is basically the same format for each table:-

1. CREATE TABLE as SELECT * FROM table.
2. Format all table columns for fixed length field format.
3. Call procedure to generate external table.
4. Spool data as CSV by replacing the column separator with a comma

Below is example output from the export data script:-

CREATE TABLE exp_tab_name AS
  SELECT *
  FROM tab_name
  WHERE ;

COLUMN  FORMAT a20
COLUMN  FORMAT 999999999
 
SET PAGES 0
SET LINES 30
SET COLSEP ','
SPOOL tab_name.desc
EXEC proc_GenExtTableFile("", "TABLE_NAME");
SPOOL off 
SELECT * FROM exp_tab_name

SPOOL taba.out
/
SPOOL off

For none clob tables this is working like a dream, but for a million+ characters potentially this wasn’t going to work. Well after a little reading I found that oracle allows you to reference a clob as an actual file, which meant that I no longer has to worry about multline clobs ruining my csv, but I still had the issue of getting the clob into the file. The next step seemed obvious but painful – I was going to have to use UTL_FILE to output clob. This would mean that I would need to update the generate script to routine to name all columns individually and write a new routine to output the clob. I decided to implement this a function that writes the clob to a file and outputs the clob file location. Lets generate some test data and see

create table tbl_testclob
(idcol number
,clobcol clob
)
;

DECLARE
bigtext   clob;
BEGIN

  EXECUTE IMMEDIATE 'truncate table tbl_testclob';

  FOR i IN 1 .. 11 LOOP
    bigtext := dbms_random.string('A',4000);
    INSERT INTO tbl_testclob VALUES (i, bigtext);
    
    FOR j in 1 .. i LOOP
      IF i = j THEN
        null;
      ELSE
        UPDATE tbl_testclob set clobcol = clobcol || clobcol where idcol=i;
      END IF;
    END LOOP;
  
  END LOOP;
  commit;

END;
/

SELECT idcol,dbms_lob.getlength(clobcol)
FROM tbl_testclob;

     IDCOL DBMS_LOB.GETLENGTH(CLOBCOL)
---------- ---------------------------
         1                        4000
         2                        8000
         3                       16000
         4                       32000
         5                       64000
         6                      128000
         7                      256000
         8                      512000
         9                     1024000
        10                     2048000
        11                     4096000

You should see the same output as above. Now lets try to output the data via utl_file. The first step is to create a directory to output the data too.

CREATE DIRECTORY test_dir AS '/home/oracle/extdata/test';

Now lets create the procedure to output the clob using utl file

CREATE or REPLACE PROCEDURE proc_utlclob
AS

CURSOR cur_tab
IS SELECT * FROM tbl_testclob;

lv_clobsize number(8);
lv_clobPos number(8);
csvHandler  UTL_FILE.FILE_TYPE;

BEGIN
csvHandler := UTL_FILE.FOPEN('TEST_DIR', 'utlclob.csv', 'W',32001);
      UTL_FILE.FCLOSE(csvHandler); 

  -- Open Cursor for tbl_testclob
  FOR c in cur_tab LOOP
    -- Get clob size
    lv_clobsize:= dbms_lob.getLength(c.clobcol);
    -- reset clob position
    lv_ClobPos:=1;
    -- open csv file Write Mode
    csvHandler := UTL_FILE.FOPEN('TEST_DIR', 'utlclob_'||to_char(c.idcol)||'.csv', 'W',32001);

    WHILE lv_ClobPos < lv_clobsize LOOP
      -- Write 32000 char to file
      UTL_FILE.put_line(csvHandler, dbms_lob.substr(c.clobcol,32000,lv_ClobPos));
      utl_file.fflush(csvHandler);
      -- Move column position
      lv_ClobPos := lv_ClobPos + 32000;
    END LOOP;
      -- Close file
      UTL_FILE.FCLOSE(csvHandler); 

  END LOOP;

END proc_utlclob;
/

Lets execute the procedure and see what look at the results:-

o64-12c-n01:oracle@PROD12C1 > ls –ltr /home/oracle/extdata/test
total 16024
-rw-r--r-- 1 oracle asmadmin    4001 Oct 10 13:28 utlclob_1.csv
-rw-r--r-- 1 oracle asmadmin    8001 Oct 10 13:28 utlclob_2.csv
-rw-r--r-- 1 oracle asmadmin   16001 Oct 10 13:28 utlclob_3.csv
-rw-r--r-- 1 oracle asmadmin   32001 Oct 10 13:28 utlclob_4.csv
-rw-r--r-- 1 oracle asmadmin   64002 Oct 10 13:28 utlclob_5.csv
-rw-r--r-- 1 oracle asmadmin  128004 Oct 10 13:28 utlclob_6.csv
-rw-r--r-- 1 oracle asmadmin  256008 Oct 10 13:28 utlclob_7.csv
-rw-r--r-- 1 oracle asmadmin  512016 Oct 10 13:28 utlclob_8.csv
-rw-r--r-- 1 oracle asmadmin 1024032 Oct 10 13:28 utlclob_9.csv
-rw-r--r-- 1 oracle asmadmin 2048064 Oct 10 13:28 utlclob_10.csv
-rw-r--r-- 1 oracle asmadmin 4096128 Oct 10 13:28 utlclob_11.csv

Not quite what I was expecting. Looking at the files it appears that utl_file is adding a newline character to the end of each line added as each line contains the length of the string/32000 lines in it. I played around with this for some time and was unable to find the answer – If you know please inform me.

So although I was getting the data out and the client wasn’t really all that bothered by the extra lines I was unhappy with the end result so hit the Google hard and strayed upon the following little gem dbms_xslprocessor. The little beauty has an in built procedure called clob2file which conveniently outputs a clob to a file!

SQL> desc dbms_xslprocessor
PROCEDURE CLOB2FILE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 CL                             CLOB                    IN
 FLOCATION                      VARCHAR2                IN
 FNAME                          VARCHAR2                IN
 CSID                           NUMBER                  IN     DEFAULT

Well blow me oracle hits the jackpot again! Lets drop it into a procedure and output out test clobs.

CREATE or REPLACE PROCEDURE proc_xmlclob
AS

CURSOR cur_tab
IS SELECT * FROM tbl_testclob;

BEGIN

  -- Open Cursor for tbl_testclob
  FOR c in cur_tab LOOP
    dbms_xslprocessor.clob2file(c.clobcol, 'TEST_DIR', 'xmlclob_'||to_char(c.idcol)||'.csv');
  END LOOP;

END proc_xmlclob;
/

Lets run and compare

o64-12c-n01:oracle@PROD12C1 > ls –ltr /home/oracle/extdata/test
total 16016
-rw-r--r-- 1 oracle asmadmin    4001 Oct 10 13:28 utlclob_1.csv
-rw-r--r-- 1 oracle asmadmin    8001 Oct 10 13:28 utlclob_2.csv
-rw-r--r-- 1 oracle asmadmin   16001 Oct 10 13:28 utlclob_3.csv
-rw-r--r-- 1 oracle asmadmin   32001 Oct 10 13:28 utlclob_4.csv
-rw-r--r-- 1 oracle asmadmin   64002 Oct 10 13:28 utlclob_5.csv
-rw-r--r-- 1 oracle asmadmin  128004 Oct 10 13:28 utlclob_6.csv
-rw-r--r-- 1 oracle asmadmin  256008 Oct 10 13:28 utlclob_7.csv
-rw-r--r-- 1 oracle asmadmin  512016 Oct 10 13:28 utlclob_8.csv
-rw-r--r-- 1 oracle asmadmin 1024032 Oct 10 13:28 utlclob_9.csv
-rw-r--r-- 1 oracle asmadmin 2048064 Oct 10 13:28 utlclob_10.csv
-rw-r--r-- 1 oracle asmadmin 4096128 Oct 10 13:28 utlclob_11.csv
-rw-r--r-- 1 oracle asmadmin    4000 Oct 10 13:28 xmlclob_1.csv
-rw-r--r-- 1 oracle asmadmin    8000 Oct 10 13:28 xmlclob_2.csv
-rw-r--r-- 1 oracle asmadmin   16000 Oct 10 13:28 xmlclob_3.csv
-rw-r--r-- 1 oracle asmadmin   32000 Oct 10 13:28 xmlclob_4.csv
-rw-r--r-- 1 oracle asmadmin   64000 Oct 10 13:28 xmlclob_5.csv
-rw-r--r-- 1 oracle asmadmin  128000 Oct 10 13:28 xmlclob_6.csv
-rw-r--r-- 1 oracle asmadmin  256000 Oct 10 13:28 xmlclob_7.csv
-rw-r--r-- 1 oracle asmadmin  512000 Oct 10 13:28 xmlclob_8.csv
-rw-r--r-- 1 oracle asmadmin 1024000 Oct 10 13:28 xmlclob_9.csv
-rw-r--r-- 1 oracle asmadmin 2048000 Oct 10 13:28 xmlclob_10.csv
-rw-r--r-- 1 oracle asmadmin 4096000 Oct 10 13:28 xmlclob_11.csv

So we can see that xslprocessor has produced the clobs as files with the exact bytes count s character count for single line clobs, but what about multi line one. Below is a data generator.

DECLARE
bigtext   clob;
BEGIN

  EXECUTE IMMEDIATE 'truncate table tbl_testclob';

  FOR i IN 1 .. 5 LOOP
    bigtext := dbms_random.string('A',4000);
    INSERT INTO tbl_testclob VALUES (i, bigtext);
    
    FOR j in 1 .. i LOOP
      IF i = j THEN
        null;
      ELSE
        UPDATE tbl_testclob set clobcol = clobcol ||chr(10)|| clobcol where idcol=i;
      END IF;
    END LOOP;
  
  END LOOP;
  commit;

END;
/

SELECT idcol,dbms_lob.getlength(clobcol)
FROM tbl_testclob;
  
     IDCOL DBMS_LOB.GETLENGTH(CLOBCOL)
---------- ---------------------------
         1                        4000
         2                        8001
         3                       16003
         4                       32007
         5                       64015

Remove the old testing files and run both routines again

o64-12c-n01:oracle@PROD12C1 > ls -ltr
total 248
-rw-r--r-- 1 oracle asmadmin  4001 Oct 10 13:37 utlclob_1.csv
-rw-r--r-- 1 oracle asmadmin  8002 Oct 10 13:37 utlclob_2.csv
-rw-r--r-- 1 oracle asmadmin 16004 Oct 10 13:37 utlclob_3.csv
-rw-r--r-- 1 oracle asmadmin 32009 Oct 10 13:37 utlclob_4.csv
-rw-r--r-- 1 oracle asmadmin 64018 Oct 10 13:37 utlclob_5.csv
-rw-r--r-- 1 oracle asmadmin  4000 Oct 10 13:38 xmlclob_1.csv
-rw-r--r-- 1 oracle asmadmin  8001 Oct 10 13:38 xmlclob_2.csv
-rw-r--r-- 1 oracle asmadmin 16003 Oct 10 13:38 xmlclob_3.csv
-rw-r--r-- 1 oracle asmadmin 32007 Oct 10 13:38 xmlclob_4.csv
-rw-r--r-- 1 oracle asmadmin 64015 Oct 10 13:38 xmlclob_5.csv

Again the xsl files are the exact byte size of the relevant clob, but the utl files contain extra bytes/characters from the utl_file export routine.

Advertisements

I am a DBA with over 10 years experience specifing, installing, configuring, managing and trouble shooting Oracle Databases. I found my first grey hairs last week and started to have a mini mid life crisis. So in an effort not to feel so old i thought "what do all the cool kids do?" and after dabbling with onezies, thinking tattoos would hurt too much i arrived at the thought that would start a blog about the interesting black art that is being an oracle dba. So here it is, myoracledbablog.wordpress.com - My little corner of the internet. Why don't you come in, pull up a pew and stay a while - You never know you might leave a little better off than when you arrived!

Tagged with: , , , ,
Posted in DBA Admin
One comment on “Exporting a CLOB

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: