If we want to process huge tables (few hunderds of gigabytes), it will take long time for a single SQLULDR2 or DataCopy process, single process can makde single CPU exhausted. To speedup the process, we need maually partitioned the huge tables, and start multiple SQLULDR2 or DataCopy processes. But manuall partition is too complex for us, we need automate the partition method.
The key is how to split huge tables into multiple pieces, if we split it manually, then we can only perform manually parallel process, if we can split it automatically, then we can perform automatically parallel process. I wrote a program (TabSplit) to handle the split work by rowid range.
Usage: tabsplit user=... owner=... table=... [part=...] [degree=...]
There is a big table (EMP_HIS) on my local database, I need to unload it text file with parallel degree 4 to save time. Let's split it with TabSplit utility.
D:\>tabsplit owner=scott table=emp_his degree=4
ROWID>='AAADAUAAEAAAE0xAAA' AND ROWID < 'AAADAUAAEAAAFUIEAA'
ROWID>='AAADAUAAEAAAFUJAAA' AND ROWID < 'AAADAUAAEAAAHEIEAA'
ROWID>='AAADAUAAEAAAHEJAAA' AND ROWID < 'AAADAUAAEAAAHiIEAA'
ROWID>='AAADAUAAEAAAHiJAAA' AND ROWID < 'AAADAUAAEAAAH8IEAA'
Now we will check the record count of each piece, and then summarize the total record count.
SQL> select count(*) from scott.emp_his where
2 ROWID>='AAADAUAAEAAAE0xAAA' AND ROWID < 'AAADAUAAEAAAFUIEAA';
307200
SQL> select count(*) from scott.emp_his where
2 ROWID>='AAADAUAAEAAAFUJAAA' AND ROWID < 'AAADAUAAEAAAHEIEAA';
309421
SQL> select count(*) from scott.emp_his where
2 ROWID>='AAADAUAAEAAAHEJAAA' AND ROWID < 'AAADAUAAEAAAHiIEAA';
309606
SQL> select count(*) from scott.emp_his where
2 ROWID>='AAADAUAAEAAAHiJAAA' AND ROWID < 'AAADAUAAEAAAH8IEAA';
253469
SQL> select 307200+309421+309606+253469 from DUAL;
1179696
And we query the record count by a full table scan in Oracle.
SQL> select count(*) from scott.emp_his;
1179696
We found that the record count matched, it's good news for us. Now we can write packer scripts to start multiple SQLULDR2 or DataCopy processes quickly for huge tables.
