Tuesday, June 11, 2013

Handling Database Warnings with Talend Open Studio

Handling Database Warnings with Talend Open Studio

To handle nonfatal database warnings with Talend Open Studio, use a Run If connector routed to a tWarn component.  Use a global exception handler pattern -- tLogCatcher -- to save off the error message and email it.

This job loads records into a SQL Server table from a spreadsheet.  The database has a unique constraint on 'businessName'.  If you run the job once, 5 records are inserted.  If you run the job a second time on the same input, an error is generated.

Job Featuring tWarn and a Run If to Report DB Errors
 The generated error is coming from the database.  If the "Die on error" box were checked on the tMSSqlOutput , processing would stop.  In this case, processing continues as shown by the "after loading subjob" messages which comes from the tJava_1.

Error is Logged, but Processing Continues
Run If


The job uses a global exception handler subjob (tLogCatcher, etc) that can be reused among the flows in the job.  The mechanism by which the tMSSqlOutput component communicates with the exception handler is tWarn.  tWarn is invoked by logic in the Run If connector.

Logical in the Run If Connector
Walkthrough

For a detailed walkthrough of the job, watch this video.

http://www.youtube.com/watch?feature=player_embedded&v=Gz88nlQl-HU

Getting More Info

If you're willing to turn off batching, you can set up a rejects flow from the tMSSqlOutput, spooling rejected records off to a file.  The following revision produces the file attachment as part of the data flow (Excel -> MS Sql).  Batching MUST be turned off in order for the Rejects connector to be an option in the job.  Uncheck the "Use Batch Size" checkbox on the Advanced Settings tab.

Spool Rejected Records to a File; Run If Triggers a Warn
If there are no rejected records, the tWarn is not called an no email is sent.  If there are rejected records -- in this case coming straight from a database constraint error -- they accrue in a file.  The _REJECTED variable is consulted on a Run If which will trigger the tLogCatcher (and the tSendMail component).  The tJava print out ("after loading subjob") is still invoked since processing continues.

 The Run If connector provides a way to check for a result.  This example looked at the _INSERTED variable of the tMSSqlOutput component. The logic could be expanded to AND or OR other conditions such correlating a row count with the number of inserts.  tWarn can send the result of the check to a globally-defined exception handler.  That way, multiple tests can be done in the Talend job all using the same exception code.

No comments: