Optimizing BLOBs for SQL*Loader

All posts relating to Oracle database administration.

Moderator: Tim...

Locked
username391
Member
Posts: 3
Joined: Wed Sep 21, 2005 4:41 pm
Location: Villingen/Germany

Optimizing BLOBs for SQL*Loader

Post by username391 » Thu Sep 22, 2005 7:15 am

Hi,

I have to load a mass of data (some million records) on a regular basis into Oracle 10g release 2. While I am sure that sqlldr can achieve the needed performance for the "text" part of the data I am unsure about how to optimally organize the binary data (pictures) that is associated.
Currently I am using the control file option

Code: Select all

ext_fname FILLER CHAR(40),
"DATA" LOBFILE(ext_fname) TERMINATED BY EOF
What I would prefer (because it would avoid opening millions of small picture files) is having all pictures within the text data files or at least having multiple pictures (with preceding byte length) in one file.

Can sqlldr be used like that? Or is there an even better way?

Best regards from sunny Black Forrest,
Mattin

Tim...
Site Admin
Posts: 18437
Joined: Mon Nov 01, 2004 5:56 pm
Location: England, UK
Contact:

Post by Tim... » Thu Sep 22, 2005 8:15 am

Hi.

In truth my answer to this is I don't know. I've loaded pictures into databases on a number of projects, but always on a one-by-one basis using interMedia or plain BLOBs. The closest I've come to a mass load is a drag an drop into an IFS filesystem.

I'm going to put a post on my blog about this and see if any of the readers there have any suggestions.

Here are some articles relating to loading images into the database, but none of them are concerned with mass loads:

https://oracle-base.com/articles/8i/import-blob.php
https://oracle-base.com/articles/8i/export-blob.php
https://oracle-base.com/articles/9i/export-blob-9i.php
https://oracle-base.com/articles/8i/int ... images.php
https://oracle-base.com/articles/9i/int ... ges-9i.php
https://oracle-base.com/articles/9i/ima ... ttp-9i.php
https://oracle-base.com/articles/9i/fil ... res-9i.php

Cheers

Tim...
Tim...
About Me

username391
Member
Posts: 3
Joined: Wed Sep 21, 2005 4:41 pm
Location: Villingen/Germany

Post by username391 » Thu Sep 22, 2005 9:10 am

Hi Tim,
thanks for your answer.

Maybe those articles will be needed if I figure out that loading single image files is really too slow. Because then it might become an option to use a BFILE (which should be much faster on import) and update the BLOB on first request or even at night when the system will have CPU cycles for background tasks.

Currently I am still lacking real masses of data to test such things, therefor I can only speculate...

Best regards from the sunny Black Forrest, Germany,
Mattin

Tim...
Site Admin
Posts: 18437
Joined: Mon Nov 01, 2004 5:56 pm
Location: England, UK
Contact:

Post by Tim... » Thu Sep 22, 2005 9:14 am

Hi.

I've posted a reference to this question on my blog:

http://oracle-base.blogspot.com/2005/09 ... stion.html

Hopefully someone will be able to give some more insight...

Cheers

Tim...
Tim...
About Me

username391
Member
Posts: 3
Joined: Wed Sep 21, 2005 4:41 pm
Location: Villingen/Germany

Solution :D

Post by username391 » Thu Oct 13, 2005 3:37 pm

Hello,

figured out the solution myself - probably noone answered because it was too easy? ;-) I thought I should post it here, because you were very friendly, Tim! Thanks for your help and keep up this good forum!

:arrow: Solution:
Simply use the following control file syntax:

Code: Select all

...
  data LOBFILE(CONSTANT 'abc/vd-img.lob') VARRAWC(4,4096)
...
The file 'abc/vd-img.lob then needs to contain a (as specified by the first number in the varraw datatype) four-digit number specifying the length of the following image. Then the image follows (which, according to the control file, may not be longer than 4096 bytes). Then the size of the next image follows, followed, by the image, ... (and so on).

With a test on a Solaris9 two-processor machine removing the I/O overhead by putting 22000 image files into the single lob file I saved approx. 20% of load time. This increases even more if starting multiple loads in parallel. (I know, this depends very much on the used disks, the size of the images, ... and lots of other factors, but those 20% are already very nice).

Instead of using a constant file containing all the images it is (according to the documentation) also possible to specify the file in each row of the input data.

Best regards from sunny Black Forrest, Germany,
Mattin

Tim...
Site Admin
Posts: 18437
Joined: Mon Nov 01, 2004 5:56 pm
Location: England, UK
Contact:

Post by Tim... » Thu Oct 13, 2005 4:07 pm

Cool!
Tim...
About Me

Locked