Parallel execution of SQLULDR2 or DataCopy for huge tables

Links: http://www.dbatools.net/mytools/parallel-sqluldr2-datacopy.html

    If we want to process huge tables (few hunderds of gigabytes), it will take long time for a single SQLULDR2 or DataCopy process, single process can makde single CPU exhausted. To speedup the process, we need maually partitioned the huge tables, and start multiple SQLULDR2 or DataCopy processes. But manuall partition is too complex for us, we need automate the partition method.

    The key is how to split huge tables into multiple pieces, if we split it manually, then we can only perform manually parallel process, if we can split it automatically, then we can perform automatically parallel process. I wrote a program (TabSplit) to handle the split work by rowid range.

Usage: tabsplit user=... owner=... table=... [part=...] [degree=...]

    There is a big table (EMP_HIS) on my local database, I need to unload it text file with parallel degree 4 to save time. Let's split it with TabSplit utility.

D:\>tabsplit owner=scott table=emp_his degree=4
ROWID>='AAADAUAAEAAAE0xAAA' AND ROWID < 'AAADAUAAEAAAFUIEAA'
ROWID>='AAADAUAAEAAAFUJAAA' AND ROWID < 'AAADAUAAEAAAHEIEAA'
ROWID>='AAADAUAAEAAAHEJAAA' AND ROWID < 'AAADAUAAEAAAHiIEAA'
ROWID>='AAADAUAAEAAAHiJAAA' AND ROWID < 'AAADAUAAEAAAH8IEAA'

    Now we will check the record count of each piece, and then summarize the total record count.

SQL> select count(*) from scott.emp_his where
  2  ROWID>='AAADAUAAEAAAE0xAAA' AND ROWID < 'AAADAUAAEAAAFUIEAA';

    307200

SQL> select count(*) from scott.emp_his where
  2  ROWID>='AAADAUAAEAAAFUJAAA' AND ROWID < 'AAADAUAAEAAAHEIEAA';

    309421

SQL> select count(*) from scott.emp_his where
  2  ROWID>='AAADAUAAEAAAHEJAAA' AND ROWID < 'AAADAUAAEAAAHiIEAA';

    309606

SQL> select count(*) from scott.emp_his where
  2  ROWID>='AAADAUAAEAAAHiJAAA' AND ROWID < 'AAADAUAAEAAAH8IEAA';

    253469

SQL> select 307200+309421+309606+253469 from DUAL;

                    1179696

    And we query the record count by a full table scan in Oracle.

SQL> select count(*) from scott.emp_his;

  1179696

    We found that the record count matched, it's good news for us. Now we can write packer scripts to start multiple SQLULDR2 or DataCopy processes quickly for huge tables.

Post a comment

Remember Me?

« Previous | Main | Next »

Powered by
Movable Type 5.01