Thursday, June 13, 2013

Column Metadata in Talend Open Studio Components


Getting a list of columns in a Talend Open Studio custom component is a common task.  Use the getMetadataList() method on the Node to retrieve the schemas used for incoming and outgoing connections.

IMetadataTable is the primary interface for working with the schemas associated with a Talend Open Studio component.  A Node -- the Talend component instance dragged to the canvas -- contains a list of IMetadataTables.  These IMetadataTables correspond to the schemas used in the incoming and outgoing connections.

Get IMetadataTable

If a component has a FLOW CONNECTOR listed first, the following lines of JET code will access the relevant IMetadataTable object.


List<IMetadataTable> metadatas = node.getMetadataList();
IMetadataTable metadata = metadatas.get(0);
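
Calling get(0) assumes the component always has at least one schema.  The following is a minimal sketch of a guarded lookup in plain Java, not Talend code: "NodeLike" and "TableLike" are hypothetical stand-ins for the Node and IMetadataTable so the example can run outside of Talend, and only getMetadataList() mirrors a method named in this post.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstSchemaSketch {

    /** Hypothetical stand-in for IMetadataTable; only a name is modeled. */
    public interface TableLike {
        String getName();
    }

    /** Hypothetical stand-in for the Node; only getMetadataList() is modeled. */
    public interface NodeLike {
        List<TableLike> getMetadataList();
    }

    /** Returns the first schema, or null when the component has none. */
    public static TableLike firstTable(NodeLike node) {
        List<TableLike> metadatas = node.getMetadataList();
        if (metadatas == null || metadatas.isEmpty()) {
            return null;
        }
        return metadatas.get(0);
    }

    public static void main(String[] args) {
        List<TableLike> tables = new ArrayList<>();
        tables.add(() -> "row1");

        NodeLike withSchema = () -> tables;
        NodeLike withoutSchema = () -> new ArrayList<>();

        System.out.println(firstTable(withSchema).getName());
        System.out.println(firstTable(withoutSchema));
    }
}
```

In a real JET template the same null-and-empty check would wrap the code that uses "metadata", so that nothing is generated when the schema is missing.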


Node has other methods that retrieve metadata.  "getMetadataFromConnector()" takes a connector string and returns a single IMetadataTable (not a list).  "getMetadataTable()" also takes a string as a key.  (I haven't used either of these yet. -Carl)

getListColumns()

The IMetadataTable object is a container for columns, manipulated through the IMetadataColumn interface. To get the columns associated with an IMetadataTable, use the following JET and Java code.


<%
for( IMetadataColumn col : metadata.getListColumns() ) {
%>
   <%= filterRowName %>.<%= col.getLabel() %> = <%= rowName %>.<%= col.getLabel() %>;
<%
}
%>



The loop iterates over the columns defined in "metadata".  In this case, it's looping over the columns in the incoming schema.  The Java code inside the JET for loop assigns the input row "rowName" to the output row "filterRowName".  It does the assignment column-by-column using "col".

If you're new to JET code, note that the above template is processed in two stages.  The first stage is source code generation, which creates one line of Java code for each column.  Each line of Java code will have generated values for filterRowName, col, and rowName.  For instance, if rowName is "row1", filterRowName is "row2", and there is one column, "field1", the following code will be generated.


row2.field1 = row1.field1;


The second stage is a Java compilation that takes the generated source to a class file.  The job can then be run, and the row1/row2 assignment takes place.
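
To make the first stage concrete, here is a rough simulation in plain Java of what the JET loop does at generation time.  It is only a sketch: the class and method names are made up for illustration, and column labels are passed in as plain strings rather than read from an IMetadataTable.

```java
import java.util.Arrays;
import java.util.List;

public class AssignmentGeneratorSketch {

    /**
     * Stand-in for stage one: emits one Java assignment per column label,
     * mirroring the output of the JET for loop.
     */
    public static String generate(String rowName, String filterRowName,
                                  List<String> columnLabels) {
        StringBuilder source = new StringBuilder();
        for (String label : columnLabels) {
            source.append(filterRowName).append('.').append(label)
                  .append(" = ")
                  .append(rowName).append('.').append(label)
                  .append(";\n");
        }
        return source.toString();
    }

    public static void main(String[] args) {
        // Prints the generated source text; nothing is assigned at this point.
        System.out.print(generate("row1", "row2", Arrays.asList("field1", "field2")));
    }
}
```

The output is only text.  The assignments themselves do not happen until that text has been compiled and the job is run.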

There are many methods for working with IMetadataTable and IMetadataColumn.  There is support for DBMS, read-only, and dynamic values.  However, the few methods presented in this post are the most common.
