Sunday, April 21, 2013

DB2 Export and Load Using Compression & Named Pipes

Have you ever tried to export a very large table from your DB2 database, only to have the dump grow so huge that it filled up the filesystem? Then you dug through the DB2 Information Center just to find there's no such thing as a compress option for the db2 export command. Damn it!
Well, all is not lost: you should try a little gem called the named pipe.

Instead of a conventional, unnamed shell pipeline, a named pipe makes use of the filesystem. It is explicitly created using mkfifo, and two separate processes can then access the pipe by name: one process opens it as a reader, the other as a writer.
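
If you just want to see the mechanics in isolation, here's a tiny throwaway demo (the pipe name and the message are arbitrary):

mkfifo /tmp/demo
cat /tmp/demo &              # reader: blocks until data shows up on the pipe
echo "hello" > /tmp/demo     # writer: the background cat prints "hello" and exits
rm /tmp/demo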

For example, one can create a pipe and set up gzip to compress things piped to it:

mkfifo /tmp/mypipe 
gzip -c < /tmp/mypipe > mytable.del.gz & 

In a separate process, independently, one could send the data to be compressed:

db2 "export to /tmp/mypipe of del select * from myschema.mytable" 

The dump will be compressed on the fly by the background process you created and written to the compressed file mytable.del.gz. Afterwards the named pipe can be deleted just like any regular file:

rm /tmp/mypipe
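
Putting the export steps together, a small script might look like this (the pipe name, schema, table and file names are just placeholders, and it assumes you already have an active db2 connection in the session):

PIPE=/tmp/export_pipe
mkfifo "$PIPE"
gzip -c < "$PIPE" > mytable.del.gz &                               # background reader compresses whatever the export writes
db2 "export to $PIPE of del select * from myschema.mytable"
wait                                                               # let gzip finish draining the pipe
rm "$PIPE"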

The other way around is also possible: you can pipe the output of gzip into a named pipe like so:

mkfifo /tmp/mypipe 
gzip -dc mytable.del.gz > /tmp/mypipe & 

Then load the uncompressed data into your DB2 table:

db2 "load from /tmp/mypipe of del replace into myschema.mytable nonrecoverable" 

Remove the pipe after the load is completed:

rm /tmp/mypipe
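
The load direction fits into a similar sketch (again, the names are placeholders and an existing db2 connection is assumed):

PIPE=/tmp/load_pipe
mkfifo "$PIPE"
gzip -dc mytable.del.gz > "$PIPE" &                                # background writer decompresses into the pipe
db2 "load from $PIPE of del replace into myschema.mytable nonrecoverable"
rm "$PIPE"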

Without the named pipe you would need to write the entire uncompressed version of mytable.del.gz to disk before loading it into DB2 (and, on the export side, write the full uncompressed dump before compressing it). Writing that temporary file is time consuming, causes more I/O, and eats free space on the disk. I found this really useful for big tables. Who's with me??