Thursday, June 13, 2013

Mapping the Auto Increment Column in Talend Open Studio


When you're working with an auto-incremented column in Talend Open Studio, you must be careful not to map the column if it's used as an output.
Databases like Microsoft Access use auto-increment columns to provide for unique record identifiers.  This is in contrast to databases like Oracle which maintain a separate facility for sequences (ROWID aside).  When working with Talend, don't map a value to the column, otherwise a database error will the thrown.

(This takes the form of a "General error" referencing the ODBC driver if a tAccessOutput component is used.)

The metadata functions in Talend like the "Retrieve schema" button will include auto-increment columns.  For example

Generated from Retrieve Schema
To use this schema in the map, there are a few techniques.

  1. Abandon it.  In the TOS User Guide, the schema used in the tMysqlOutput component sample is created by hand in a tMap.  Connect the the tMap and press the Reset DB Types on the Edit Schema dialog.
  2. Tweak the Repository.  Drag the Repository's definition on the canvas.  Change to a Built-in and remove the auto-increment column. 
  3. Build the schema in the tMysqlOutput component.  The opposite of the "Abandon it" where the schema is pushed backwards onto a tMap.
Access Output with Built-in Schema
This screenshot is tweaking Repository metadata.  The schema started with an ETL_BUDGET_ID column that was removed.
Schema without Auto-Increment Column
 If you're working with a database output with an auto-increment column, be careful not to map a value for it.  Otherwise, there will be a DB error.

No comments: