Tuesday, June 18, 2013

Understand Context variables with Examples - Part 1 (Talend Open Studio)

Context variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values.

Let us suppose that our job has to connect to a database, the connection details for development and production databases are different. We could create two copies of the job, each one with different connection details configured, but this approach will duplicates code and makes job maintenance more difficult. Better approach would be to create a single job and allow it to run with different connection details depending upon the database you want to target at any given time. 

We can create context variable in three different ways. We can define the context variables inside the Talend Job, in the Repository as metadata and we can also define the context variables in the flat files which can be read and loaded to job at the runtime using tContextLoad component.

In this post, I will demonstrate a Job which will use context variables defined in the Job. In the next article, I will cover other two ways of defining context variables. I will be developing a Talend Job which will pick the input delimited file dynamically from the context variables and also filter records from the context variable. I will create two context variables country_id and amount. 

country_id will be used to fetch the input file dynamically based on the context variable value. 
E.g. If the value of the context variable country_id is IND then the context_example_IND.csv
file will be read and processed as Input file, Similarly, If the value of the context variable country_id is US then the context_example_US.csv file will be read and processed as Input file. 

Context variable, amount will be used to filter the records based on the value provided to the context variable. I will create this variable as Prompt context variable, so that we can provide the value once we execute the talend job.

Lets look at the Input Files. I have created two input files one for US and other for India. Names of the files are context_example_US.csv and  context_example_IND.csv. 
t


To achieve our objective, lets create a new job and perform following steps.

1. Drag 
tFileInputDelimited component onto Job Designer pane from Palette pane.

2. Open the component properties of the component and Click on Edit schema. As per our input file, create metadata like following.
3. Open the Contexts pane and add one context variable with name country_id. We will use this context variable to pick the different file at runtime based on country id.



  
4. In the Contexts pane click on value as table and then click on  button. A popup window will appear. 


5. Click on New and add two variables as India and US. Select the default and click on Remove. This will remove the default and make India variable as default.


6. In the “Value as table” tab in the Contexts pane and provide the value as “IND” for India and “US” for US. 


With this Step, we have created context variable and also provided the values to them.

7. Now open the tFileInputDelimited component to provide the File Name so that it will fetch file dynamically based on the context variable. Provide following value to File Name/Stream text box.

"D:/TalendFiles/Blog Files/context_variables/context_example_"+context.country_id+".csv"

If the value of the context variable country_id is IND then the file path will be generated as 

D:/TalendFiles/Blog Files/context_variables/context_example_IND.csv

Similarly, If the value of the context variable country_id is US then the file path will be generated as 

D:/TalendFiles/Blog Files/context_variables/context_example_US.csv

8. Now Drag tLogRow component from Palette pane. This will help us to display the output in Run console.

9. Right click tFileInputDelimited, select Row > Main and connect it to tLogRow component.





10. Its time to Execute our Job. Go to Run pane and from the context dropdown in the right side select US. and click on Run.


We have run our job with value of context variable country_id as “US”. You can see that it has dynamically fetch context_example_US.csv file. You can compare it with input file for US. You can find the snapshot of US input file below:


11. Run it again and this time select India from the context dropdown in the Run pane.

We have run our job with value of context variable country_id as “IND”. You can see that it has dynamically fetch context_example_IND.csv file. You can compare it with input file for IND.


This is How we can use context variables to fetch the files dynamically. We can also use context variables to filter the records in the Job. Lets modify the Job to demonstrate this as well.

1. In our job, Drag tFilterRows component and place on the link between tFileInputDelimited and tLogRow components.


2. Next step is to define the context variable amount which will help us to filter the records based on the order amounts. Open the Contexts pane and create a new variable as amount.

3. Navigate to “Value as tree” tab in the context pane and click on checkbox for amount for context India and US in the Prompt section.


This will allow us the set the value of the context variable from the prompt at runtime.

4. Now open the component properties of tFilterRows component to provide the context variable to filter the rows. 

5. Click on  button to add a new condition. In the InputColumn select order_amount, operator as Greater than and provide value as context.amount.



6. Its time to Execute our Job. Go to Run pane and from the context dropdown in the right side select US. and click on Run.

This time we are presented with a prompt window where we can provide the value for amount context variable. Lets, provide the value as 15000 and click OK





We have run our job with value of context variable country_id as “US” . You can see that only those records are present in the output where order_amount is greater than 15000. You can compare it with input file for US. You can find the snapshot of US input file below:

US

7. Run it again and this time select India from the context dropdown in the Run pane. 

Again, we are presented with a prompt window where we can provide the value for amount context variable. Lets, provide the value as 50000 and click OK.



We have run our job with value of context variable country_id as “IND” . You can see that only those records are present in the output where order_amount is greater than 50000. You can compare it with input file for IND. You can find the snapshot of US input file below:



This is How we define and use context variables in the Talend Job. In the next article, I will show you, How to define context variable in Repository and in Flat files which can be used across the jobs in a Project.

No comments: