Thursday, June 13, 2013

Nested XML From an RDBMS with Talend Open Studio


If you have a simple XML document -- say one that closely mimics a database schema -- the example in the Talend Open Studio Users' Guide is sufficient.  However, if your XML document contains repeating groups of nested elements, you'll need a few extra steps.

The following example is based on a simple relational structure where a Company table has a one-to-many relationship with a Contact table.  A Contact must refer to just one other Company.  This is realized in a Company Id kept on the Contact table.

Company and Contact Tables
The target XML representation is




 
   
     
     
   

 
 
   
     
   
 
 
   
     
   
 
 
   
     
   
 


The companies element is repeated, as is the contacts element.

To process the XML, start with an XML schema (XSD).  This link points to an XSD created using the Venitian Blind technique.  This produces a single top-level element (companies) with reusable types.  All other elements (company, contacts, contact) are nested within type definitions.

Start with a job that joins the source tables and writes the output to a tAdvancedFileOutputXML

Talend RDBMS to XML Job
Base the tAdvancedFileOutputXML on metadata created by importing the XSD.

Metadata from companies.xsd and Added Items
The Linker Target was formed by selecting the companies.xsd file in the previous step.  Several schema items in the left-hand panel were created.  The items CompanyName, FirstName, Suffix, etc. were created by using the Schema Management button.  The items were then mapped to Linker Targets.

Right mouse-click over the contact item and "Set as Loop Element".  On the company and contacts item, select "Set as Group Element".  This defines the repeating groups within the XML document and turns the normalized join into a nested structure.

Next, use a tMap to map the source RDBMS fields to the fields created from the XML metadata.  Note the FirstName field will render a firstName XML attribute.

RDBMS Mapping to Target XML
With a good XSD and a few components, you can use Talend Open Studio to render XML documents.  the XSD can be reused for other mappings, say to produce lighter-weight documents.  Also, other Talend components -- tFilterRow -- can be enlisted to restrict the data for the document.

No comments: