If you have an XML document based on a schema that requires transformation, consider using XPaths in Talend Open Studio to flatten the hierarchical file for loading.
Consider the following XML fragment.
The fragment lacks "patientId" and "firstName" elements.
XPaths can be used to flatten this by mapping the "value" attribute of each "hrediattribute" element to a different column, chosen by the element's "name" attribute.
For patientId, the XPath would be
hrediattributes/hrediattribute[@name='patientId']/@value
And for firstName, the XPath would be
hrediattributes/hrediattribute[@name='firstName']/@value
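Outside of Talend, the same mapping can be tried with a few lines of Python. This is only a rough sketch: the sample XML is a hypothetical reconstruction based on the XPaths above (the "hredielements" wrapper and the use of "hredielement" as the loop element are assumptions), and the helper names are made up for illustration. Python's ElementTree supports only a subset of XPath and cannot select attributes in a path, so the trailing /@value step is replaced by a .get("value") call.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The hrediattribute names and attributes follow
# the XPaths in the text; the surrounding layout is an assumption.
SAMPLE = """
<hredielements>
  <hredielement name="patient">
    <hrediattributes>
      <hrediattribute name="patientId" value="111111"/>
      <hrediattribute name="firstName" value="Carl"/>
    </hrediattributes>
  </hredielement>
  <hredielement name="patient">
    <hrediattributes>
      <hrediattribute name="patientId" value="222222"/>
      <hrediattribute name="firstName" value="Joe"/>
    </hrediattributes>
  </hredielement>
</hredielements>
"""


def attribute_value(loop_element, name):
    """Emulate hrediattributes/hrediattribute[@name='...']/@value."""
    # ElementTree paths cannot end in /@value, so fetch the attribute
    # from the matched element instead.
    node = loop_element.find(f"hrediattributes/hrediattribute[@name='{name}']")
    return node.get("value") if node is not None else None


def flatten(xml_text):
    root = ET.fromstring(xml_text)
    # Each hredielement acts as the loop element: one output row per match.
    return [
        (attribute_value(el, "patientId"), attribute_value(el, "firstName"))
        for el in root.findall("hredielement")
    ]


rows = flatten(SAMPLE)
for patient_id, first_name in rows:
    print(f"{patient_id}|{first_name}")
```

Each tuple corresponds to one flattened row, with the columns in the order patientId, firstName.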
If the hrediattribute elements have their values in the element body (11111
File XML in Talend Open Studio
The following screenshot is from Talend Open Studio's File XML Wizard. The File XML is used as an Input XML and is available here.
To test this, I dragged the File XML onto the canvas as an input and hooked up a tLogRow. Here are the results from the run.
Starting job HREDIXMLFile at 08:47 30/05/2011.
[statistics] connecting to socket on port 3547
[statistics] connected
111111|Carl
222222|Joe
[statistics] disconnected
Job HREDIXMLFile ended at 08:47 30/05/2011. [exit code=0]
If you need to transform your XML document, consider using XPaths in Talend Open Studio rather than a separate XSL. Although you can call the XSL transformation from TOS, that won't take advantage of TOS's browsing and dependency checking.
Specifying an XSD
Although the File XML wizard is labeled "File Settings / XML" (TOS 4.2.1), an XSD can be entered. The XSD must be a local file. However, make sure that any references within the XSD are web resources and not local files. If the XSD imports another XSD namespace, the schemaLocation should point to something accessible on the web and not another local file.
A Second Example
If some of the enclosing parent elements have data that needs to be mapped, additional XPaths are required. Because these XPaths reference elements outside of the loop, the relative XPaths in the first example won't work without backing up (../..) or using absolute paths.
This example also needs transId mapped. An absolute path selecting all transIds is used (//@transId).
111111
Carl
The full XML file is here.
To dig into specific instances of the hredielements, additional attribute selectors are used:
hredielement[@name='patient']/hrediattributes/hrediattribute[@name='patientId']
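To see how a parent-level value such as transId ends up on the same row as the nested patient values, here is a small Python sketch. The XML is hypothetical: only hredielement, hrediattributes, hrediattribute and transId come from the text, while the "transactions", "transaction" and "hredielements" wrappers are invented for illustration, as is the function name. ElementTree cannot evaluate //@transId directly, so the sketch finds every element that carries a transId attribute and reads the attribute from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the wrapper element names are assumptions.
SAMPLE = """
<transactions>
  <transaction transId="1001">
    <hredielements>
      <hredielement name="patient">
        <hrediattributes>
          <hrediattribute name="patientId">111111</hrediattribute>
          <hrediattribute name="firstName">Carl</hrediattribute>
        </hrediattributes>
      </hredielement>
    </hredielements>
  </transaction>
</transactions>
"""

# Shared prefix of the selector from the text, narrowed to the patient element.
PATIENT = "hredielement[@name='patient']/hrediattributes/hrediattribute"


def flatten_with_parent(xml_text):
    root = ET.fromstring(xml_text)
    rows = []
    # .//*[@transId] matches every element that has a transId attribute,
    # standing in for the attribute-selecting path //@transId.
    for owner in root.findall(".//*[@transId]"):
        for elements in owner.findall("hredielements"):
            # Values live in the element body here, so findtext reads them.
            patient_id = elements.findtext(f"{PATIENT}[@name='patientId']")
            first_name = elements.findtext(f"{PATIENT}[@name='firstName']")
            rows.append((owner.get("transId"), patient_id, first_name))
    return rows


rows = flatten_with_parent(SAMPLE)
for row in rows:
    print("|".join(row))
```

The outer loop plays the part of the absolute path, and the inner lookups play the part of the attribute selectors that pick out one specific hredielement.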
This is a screenshot of the mappings entered into File XML metadata.
Loop Element
The position of the loop element will determine the repeating rows. Using a loop element on "contact" on the following XML
produces
HUXLEY INDUSTRIES, INC.|David|King
HUXLEY INDUSTRIES, INC.|Sybil|Bedford
with the companyName repeating. If the loop element is "company" instead -- and companyName, firstName, and lastName are mapped -- then only the first row (with "David King") would be displayed.
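The effect of the loop element can be imitated in Python as a rough sketch. The XML below is hypothetical (the element names come from the text, but the "companies" and "contacts" wrappers and the nesting are assumptions), and the two function names are made up for illustration. Looping on "contact" yields one row per contact with the company name repeated, while looping on "company" yields one row per company and picks up only the first contact.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; the nesting is an assumption.
SAMPLE = """
<companies>
  <company>
    <companyName>HUXLEY INDUSTRIES, INC.</companyName>
    <contacts>
      <contact><firstName>David</firstName><lastName>King</lastName></contact>
      <contact><firstName>Sybil</firstName><lastName>Bedford</lastName></contact>
    </contacts>
  </company>
</companies>
"""


def rows_looping_on_contact(root):
    """One row per contact; the enclosing companyName repeats."""
    rows = []
    for company in root.iter("company"):
        name = company.findtext("companyName")
        for contact in company.iter("contact"):
            rows.append(
                (name, contact.findtext("firstName"), contact.findtext("lastName"))
            )
    return rows


def rows_looping_on_company(root):
    """One row per company; findtext keeps only the first matching contact."""
    return [
        (
            company.findtext("companyName"),
            company.findtext("contacts/contact/firstName"),
            company.findtext("contacts/contact/lastName"),
        )
        for company in root.iter("company")
    ]


root = ET.fromstring(SAMPLE)
by_contact = rows_looping_on_contact(root)
by_company = rows_looping_on_company(root)
for row in by_contact:
    print("|".join(row))
```

Here by_contact holds both contacts, and by_company holds a single row for David King.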