8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
Parse XML Documents
Related articles.
The use of XML for data transfer has increased dramatically over the last few years. Since most applications still rely on
structured relational data it is necessary to explode the non-structured XML data into relations tables. Oracle9i Release 2 includes
the XDK for PL/SQL packages loaded into the SYS schema by default, but these have been superseded by a set of integrated DBMS_%
packages within the XDB schema. I shall present a simple example of using the new XDB packages to load employee records into
the EMP
table.
First we create a directory object pointing to the source XML file and the EMP
table which is the final
destination of the data.
-- As SYS CREATE OR REPLACE DIRECTORY xml_dir AS '/tmp'; GRANT READ ON DIRECTORY xml_dir TO <user-name>; -- As schema owner CREATE TABLE emp ( empno NUMBER(4), ename VARCHAR2(10), job VARCHAR2(9), mgr NUMBER(4), hiredate DATE, sal NUMBER(7, 2), comm NUMBER(7, 2), deptno NUMBER(2) );
The source of the data is the emp.xml document. Extracting the data from the document involves several steps. First the XML document must be parsed and a DOMDocument created from it. Once the DOMDocument is created the parser is no longer needed so it's resources can be freed.
l_parser := DBMS_XMLPARSER.newparser; DBMS_XMLPARSER.parseclob(l_parser, l_clob); l_doc := DBMS_XMLPARSER.getdocument(l_parser); DBMS_XMLPARSER.freeparser(l_parser);
Next the XPATH syntax is used to get a DOMNodeList containing all the EMP nodes.
l_nl := DBMS_XSLPROCESSOR.selectnodes(DBMS_XMLDOM.makenode(l_doc),'/EMPLOYEES/EMP');
Once we have the DOMNodeList we can loop through it getting the values for each node in turn. The values are returned using the XPATH sytax and placed in a table collection.
l_n := DBMS_XMLDOM.item(l_nl, cur_emp); DBMS_XSLPROCESSOR.valueof(l_n,'EMPNO/text()',t_tab(t_tab.last).empno);
With all the data retrieved into the table collection the inserts can be performed. Putting it all together we get this.
DECLARE l_bfile BFILE; l_clob CLOB; l_parser DBMS_XMLPARSER.parser; l_doc DBMS_XMLDOM.domdocument; l_nl DBMS_XMLDOM.domnodelist; l_n DBMS_XMLDOM.domnode; l_temp VARCHAR2(1000); l_dest_offset INTEGER := 1; l_src_offset INTEGER := 1; l_bfile_csid NUMBER := 0; l_lang_context INTEGER := 0; l_warning INTEGER := 0; TYPE tab_type IS TABLE OF emp%ROWTYPE; t_tab tab_type := tab_type(); BEGIN l_bfile := BFILENAME('XML_DIR', 'emp.xml'); DBMS_LOB.createtemporary(l_clob, cache=>FALSE); DBMS_LOB.open(l_bfile, dbms_lob.lob_readonly); -- loadfromlob deprecated. -- DBMS_LOB.loadfromfile(l_clob, l_bfile, dbms_lob.getLength(l_bfile)); DBMS_LOB.loadclobfromfile ( dest_lob => l_data, src_bfile => l_bfile, amount => DBMS_LOB.lobmaxsize, dest_offset => l_dest_offset, src_offset => l_src_offset, bfile_csid => l_bfile_csid , lang_context => l_lang_context, warning => l_warning); DBMS_LOB.close(l_bfile); -- make sure implicit date conversions are performed correctly DBMS_SESSION.set_nls('NLS_DATE_FORMAT','''DD-MON-YYYY'''); -- Create a parser. l_parser := DBMS_XMLPARSER.newparser; -- Parse the document and create a new DOM document. DBMS_XMLPARSER.parseclob(l_parser, l_clob); l_doc := DBMS_XMLPARSER.getdocument(l_parser); -- Free resources associated with the CLOB and Parser now they are no longer needed. DBMS_LOB.freetemporary(l_clob); DBMS_XMLPARSER.freeparser(l_parser); -- Get a list of all the EMP nodes in the document using the XPATH syntax. l_nl := DBMS_XSLPROCESSOR.selectnodes(dbms_xmldom.makeNode(l_doc),'/EMPLOYEES/EMP'); -- Loop through the list and create a new record in a tble collection -- for each EMP record. FOR cur_emp IN 0 .. dbms_xmldom.getLength(l_nl) - 1 LOOP l_n := DBMS_XMLDOM.item(l_nl, cur_emp); t_tab.extend; -- Use XPATH syntax to assign values to he elements of the collection. DBMS_XSLPROCESSOR.valueof(l_n,'EMPNO/text()',t_tab(t_tab.last).empno); DBMS_XSLPROCESSOR.valueof(l_n,'ENAME/text()',t_tab(t_tab.last).ename); DBMS_XSLPROCESSOR.valueof(l_n,'JOB/text()',t_tab(t_tab.last).job); DBMS_XSLPROCESSOR.valueof(l_n,'MGR/text()',t_tab(t_tab.last).mgr); DBMS_XSLPROCESSOR.valueof(l_n,'HIREDATE/text()',t_tab(t_tab.last).hiredate); DBMS_XSLPROCESSOR.valueof(l_n,'SAL/text()',t_tab(t_tab.last).sal); DBMS_XSLPROCESSOR.valueof(l_n,'COMM/text()',t_tab(t_tab.last).comm); DBMS_XSLPROCESSOR.valueof(l_n,'DEPTNO/text()',t_tab(t_tab.last).deptno); END LOOP; -- Insert data into the real EMP table from the table collection. FORALL i IN t_tab.first .. t_tab.last INSERT INTO emp VALUES t_tab(i); COMMIT; -- Free any resources associated with the document now it -- is no longer needed. DBMS_XMLDOM.freedocument(l_doc); EXCEPTION WHEN OTHERS THEN DBMS_LOB.freetemporary(l_clob); DBMS_XMLPARSER.freeparser(l_parser); DBMS_XMLDOM.freedocument(l_doc); RAISE; END; /
Issues
At the time of writing this article there are a few bugs/issues relating to the new integrated XML implementation on NT/2000.
- The "/text()" reference is necessary when using the
dbms_xslprocessor.valueOf
procedure. This bug has been fixed in 9.2.0.3.0 so these references can be removed. - When using the "/text()" reference, if a tag contains no text value an "ORA-03113: end-of-file on communication channel" error is reported and the session is killed. This reference is not necessary in 9.2.0.3.0.
- In 9.2.0.3.0 the
SRC_BFILE
parameter inDBMS_LOB.LOADFROMFILE
should be replaced bySRC_LOB
. - Parsing a document with an NT/2000 style path gives a "ORA-29280: invalid directory path" error. If all "" references are replaced by
"/" the document is parsed corretly. Since the
UTL_FILE
package works properly I can only assume the parse procedure is splitting the document path incorrectly. To avoid this issue I've used a directory object to specify the path and loaded the XML document via a CLOB.
Check your current platform for these issues before launching into any development. Until these issues are resolved I would suggest sticking with the Oracle8i method described on this site.
For more information see:
- Oracle9i XML Database Developer's Guide - Oracle XML DB
- DBMS_XMLPARSER
- DBMS_XMLDOM
- DBMS_XSLPROCESSOR
- XMLTABLE : Query XML Data From SQL
- Parse XML Documents in Oracle 8i
Hope this helps. Regards Tim...