Thursday, June 13, 2013

El Cid in Talend Open Studio Components


To prevent name collisions or duplicate variable definitions in custom Talend Open Studio components, append the component's unique name to each variable name.

Talend uses Eclipse's Java Emitter Template (JET) to generate source code for Talend jobs.  This is used for both Java and Perl code.  Using a JET expression element ("<%= %>"), you can form a unique variable name based on the code generator arguments.  This technique appears in all of the standard components.

This block of JET code is found in many components:

<%
CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
INode node = (INode) codeGenArgument.getArgument();
String cid = node.getUniqueName();
%>


CID in Variable Names

'cid' is a unique id established when a component is dragged onto the canvas.  The component retains this id even if it is renamed.  'cid' can be used within JET code and in the Java that the template emits.  From a tMysqlSP JET section:

java.sql.Connection connection_<%=cid%> = (java.sql.Connection)globalMap.get("<%=conn%>"); 


This statement avoids a name collision with another component, provided all components follow the same convention.  If the cid weren't appended, many different components could easily lay claim to a general "connection" variable.

The same pattern is often used to define component-wide variables that hand off a data structure between the begin and main parts.
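
To make the effect concrete, here is a minimal plain-Java sketch that mimics what the connection_<%=cid%> expression produces for two components.  The class CidNaming, the helper scopedName, and the two cid values are hypothetical and only for illustration; they are not part of Talend.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class CidNaming {
    // Mimics the JET fragment connection_<%=cid%>: the base name plus the cid.
    static String scopedName(String base, String cid) {
        return base + "_" + cid;
    }

    public static void main(String[] args) {
        // Hypothetical cids for two tMysqlSP components in the same job.
        String[] cids = {"tMysqlSP_1", "tMysqlSP_2"};
        Set<String> declared = new LinkedHashSet<>();
        for (String cid : cids) {
            String name = scopedName("connection", cid);
            // Set.add returns false for a repeat, which would be a duplicate declaration.
            if (!declared.add(name)) {
                throw new IllegalStateException("duplicate variable: " + name);
            }
            System.out.println("java.sql.Connection " + name + ";");
        }
    }
}
```

Because each cid is distinct, each component ends up with its own variable name, and the duplicate check never fires.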

Additional Expressions for Main

Variables declared in main may need additional name protection.  Blocks of Java code outside of JET may be emitted more than once, for example once per class, per incoming connector, or per iteration of a JET loop.  Even a cid-scoped variable like connection_<%=cid%> can't be declared twice in the same scope.  The following block (outside of JET) is from tFileOutputMSDelimited.

<%=incomingName %>Struct <%=incomingName %>_<%=cid %> = new <%=incomingName %>Struct(); 


Multiple JET expressions are used to form Java identifiers.
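
As a rough illustration, the sketch below shows what that template line could expand to for a component with two incoming connections.  The connection names row1 and row2, the cid tFileOutputMSDelimited_1, and the field line are assumptions made for the example, and the nested classes merely stand in for the row structs a real job would contain.

```java
public class GeneratedMainSketch {
    // Stand-ins for the row classes of each incoming connection (hypothetical).
    static class row1Struct { String line; }
    static class row2Struct { String line; }

    static String describe() {
        // Expansion of <%=incomingName %>Struct <%=incomingName %>_<%=cid %>
        // with incomingName = row1 and then row2 (assumed names).
        row1Struct row1_tFileOutputMSDelimited_1 = new row1Struct();
        row2Struct row2_tFileOutputMSDelimited_1 = new row2Struct();

        row1_tFileOutputMSDelimited_1.line = "a";
        row2_tFileOutputMSDelimited_1.line = "b";
        return row1_tFileOutputMSDelimited_1.line + row2_tFileOutputMSDelimited_1.line;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

With only the cid as a suffix, both declarations would share one name and the generated code would not compile; the connection name keeps them apart.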

Blocks of Code

Another way to protect your variables is to wrap them in Java curly braces.  This can be done without loops or other keywords.  Simply wrap a block of code in braces; the variables declared inside fall out of scope at the closing brace and won't conflict with other declarations.
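
A minimal Java sketch of the brace technique follows.  The class BlockScope and the variable rowCount are made up for the example; the point is only that the same name can be declared in two sibling blocks.

```java
public class BlockScope {
    static int sumInBlocks() {
        int total = 0;
        {
            // This rowCount exists only until the closing brace.
            int rowCount = 3;
            total += rowCount;
        }
        {
            // Same name again; legal because the first rowCount is out of scope.
            int rowCount = 4;
            total += rowCount;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumInBlocks());
    }
}
```

Note that a variable declared this way can't be read after its block ends, so values needed later (like total here) must be declared outside the braces.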

JET is a templating language that generates Java code for use in Talend Open Studio jobs.  Because of the interaction between JET and Java, and between Begin and Main, you may need to add <%=cid%> throughout your variable declarations to prevent name conflicts.  This matters all the more given the many components that come with Talend.

Links to JET

http://www.eclipse.org/articles/Article-JET/jet_tutorial1.html

http://www.eclipse.org/articles/Article-JET2/jet_tutorial2.html
