Tuesday, June 18, 2013

How to pass Context parameters to Talend Job through command line

Today I am going to show how to pass values for context parameters (variables) to a Job once it has been exported as an executable (batch or shell script). In an earlier post (mentioned below), I showed how to create context variables and assign values to them within a Job. Click on the respective link to know more:

We can export a Talend Job as a standalone executable in the form of a batch or shell script. Talend allows us to pass values for the context variables defined in the Job from the command line. This feature comes in handy when a Job depends on business rules that keep changing: we can parameterize any value that is likely to change and supply it at run time. So how do we pass the values from the command line?

There are two ways to pass the values of context variables through command line.

1. Using a Context Group - We can create different context groups in the Talend Job and set the values of the context variables for each group. For example, we can create one context group for Dev (development environment) and another for Prod (production environment), and provide different values for the context variables in each. In Talend Studio, we can change the context group from the Run view, as shown below.



Similarly, use the following commands to change the context group of the Talend Job from the command line:

For using context group Prod defined in the Job:

C:\demo\context_demo> context_demo_run.bat --context=Prod

For using context group Dev defined in the Job:

C:\demo\context_demo> context_demo_run.bat --context=Dev

If we do not specify a context group, the Job automatically uses the default context group defined in the Job.

For using default context group defined in the Job:

C:\demo\context_demo> context_demo_run.bat

2. Overriding Context Variables - We can also override the values of individual context variables/parameters from the command line. Pass the argument --context_param followed by the name of a parameter defined in the Job and its value to override that parameter in the context. Use the following command to override the value of a context parameter or variable:

C:\demo\context_demo> context_demo_run.bat --context_param <param-name>=<param-value>

For example, to override the value of the context parameter DB_NAME with localhost:

C:\demo\context_demo> context_demo_run.bat --context_param DB_NAME=localhost
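To make the two argument forms concrete, here is a minimal Java sketch of how a launcher could read them. The class ContextArgs and its fields are hypothetical names invented for this illustration; this is not the code that Talend generates for an exported Job, only a simplified model of the --context=<group> and --context_param <name>=<value> forms shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not Talend's generated code.
final class ContextArgs {
    String contextName; // null means "use the Job's default context group"
    final Map<String, String> overrides = new LinkedHashMap<>();

    static ContextArgs parse(String[] args) {
        ContextArgs parsed = new ContextArgs();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("--context=")) {
                // --context=Prod selects a context group by name
                parsed.contextName = arg.substring("--context=".length());
            } else if (arg.equals("--context_param") && i + 1 < args.length) {
                // --context_param name=value overrides a single variable
                String pair = args[++i];
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    parsed.overrides.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ContextArgs parsed = parse(args);
        System.out.println("context group: "
                + (parsed.contextName == null ? "(default)" : parsed.contextName));
        parsed.overrides.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Running it with the arguments --context=Prod --context_param DB_NAME=localhost would report the Prod group and one override for DB_NAME.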

I have created a simple Talend Job to demonstrate both methods of passing command line parameters:



This Job takes the values of the context parameters and displays them in the Run console.
tFixedFlowInput puts the context parameter values into the Job flow, and tLogRow displays the values in the Run console.

I have created two context parameters param1_origin and param2_origin. I have set the default value as "Inside Job" for both the parameters.


Now, let's export the Job and run it from the command line:

# First run it without any parameters overriding

C:\demo\context_demo> context_demo_run.bat


Notice that the default value is being displayed.

# It's time to override the first context variable, param1_origin

C:\demo\context_demo> context_demo_run.bat --context_param param1_origin="Command Line"



Notice that the value from the command line has overridden the default value.

# Now I will pass the values of both context variables through the command line:

C:\demo\context_demo> context_demo_run.bat --context_param param1_origin="Command Line" --context_param param2_origin="Command Line"


Now, modify the same Talend Job and create context groups in it to demonstrate how to specify which context group to run from the command line.


(Note I have deleted the Default context group and set Dev as default)

Now we have two context groups, Dev and Prod, in the Job. Let's export the Job and run it from the command line.

# Let's first run it without mentioning any context group

C:\demo\context_demo> context_demo_run.bat



Notice that the value from default context group Dev has been populated.

# Now, specify the Prod context group in the Job and pass it through command line

C:\demo\context_demo> context_demo_run.bat --context=Prod



You can also override a value from the context group by adding --context_param to the command.

C:\demo\context_demo> context_demo_run.bat --context=Prod --context_param param1_origin="Command Line"
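The order of precedence in the command above can be sketched in a few lines of Java: the values of the selected context group are loaded first, and any --context_param values are applied on top. The class ContextPrecedence and the group values "Dev group" and "Prod group" are hypothetical placeholders for this illustration, and the sketch treats any group other than Prod as Dev; it is not Talend's generated code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: the group values are hypothetical placeholders,
// and this is not the code Talend generates for an exported Job.
final class ContextPrecedence {
    static final String DEFAULT_GROUP = "Dev"; // Dev was set as the default group

    // Returns the values a context group would supply (placeholder values).
    static Map<String, String> groupValues(String group) {
        Map<String, String> values = new LinkedHashMap<>();
        String label = "Prod".equals(group) ? "Prod group" : "Dev group";
        values.put("param1_origin", label);
        values.put("param2_origin", label);
        return values;
    }

    // Group values first, then command-line overrides on top.
    static Map<String, String> resolve(String group, Map<String, String> overrides) {
        Map<String, String> resolved = groupValues(group == null ? DEFAULT_GROUP : group);
        resolved.putAll(overrides);
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("param1_origin", "Command Line");
        System.out.println(resolve("Prod", overrides));
    }
}
```

In this sketch only param1_origin takes the command-line value; param2_origin keeps the value from the Prod group.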



This is how we pass values to the context variables defined in a Job through the command line. Please let me know which best practices you follow when you define context variables and pass their values through the command line. I look forward to your comments.

Pass parameters and context variables to Child jobs in Talend Open Studio

In this article, I am going to show you how to pass context variables and parameters from a main Job to a child Job (subjob) using Talend Open Studio.

Talend gives us a convenient way for Jobs to communicate through parameters: we can easily pass context parameters from one Job to another. This feature is very handy, as we can create all the parameters in the main Job and pass their values to the child Jobs as required.




First, create a child Job:

1. Define three context variables as below and do not provide any default values.

 

2. Drag a tJava component onto the Job designer and print the values of the context variables defined in Step 1.
 

tJava component properties

 

 3. Execute the child Job.



As we have not provided any values for the context variables in the child Job, it does not display any values in the output.

Now, it's time to create the main Job and pass the values to the child Job.

4. Create a Main job like below. 

Use a tFileInputDelimited component to read the delimited input file, and drag the child Job onto the Job designer.



Input File: 

EMP_ID;EMP_NAME;EMP_SALARY
101;Mark THomas;20,000
102;William Crow;53,000
103;Ramanujan K.;89,000
104;Stacy Wind;24,000

5. Open the settings of the child subjob and pass the values of the context variables defined in child job as below. Take the values of the fields from row2 link and pass it to context variables.



6. The main Job is complete. It's time to execute it.



Now you can see that the child Job runs once for every record in the input file, and the values are displayed in the Run console.
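The per-record behaviour of the main Job and child Job can be modelled with a short Java sketch. The methods runMain and runChild are hypothetical stand-ins for the two Jobs, and the labels printed by the child are assumed names (the real context variable names are only visible in the screenshots); the input lines are the ones from the input file above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain Java stand-ins for the main Job and the child Job.
final class ParentChildSketch {
    // Stand-in for the child Job: receives three context values and reports them.
    static String runChild(String empId, String empName, String empSalary) {
        return "EMP_ID=" + empId + " | EMP_NAME=" + empName + " | EMP_SALARY=" + empSalary;
    }

    // Stand-in for the main Job: skips the header, then calls the child once per record.
    static List<String> runMain(List<String> lines) {
        List<String> childOutputs = new ArrayList<>();
        int firstRecord = Math.min(1, lines.size());
        for (String line : lines.subList(firstRecord, lines.size())) {
            String[] fields = line.split(";");
            childOutputs.add(runChild(fields[0], fields[1], fields[2]));
        }
        return childOutputs;
    }

    public static void main(String[] args) {
        List<String> inputFile = Arrays.asList(
                "EMP_ID;EMP_NAME;EMP_SALARY",
                "101;Mark THomas;20,000",
                "102;William Crow;53,000",
                "103;Ramanujan K.;89,000",
                "104;Stacy Wind;24,000");
        runMain(inputFile).forEach(System.out::println);
    }
}
```

Each record produces one call to the child, which is why the child Job's output appears four times for the four-record input file.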

Understand Context variables with Examples - Part 3 (Talend Open Studio)

In the earlier posts we have provided the value of the context variables either in Talend Job itself or defined the values in Repository context variables. Today I am going to create a Talend job which will filter the records based on the values of the context variables placed in the file. We will use Talend’s tContextLoad component.

I am going to use the same input file that I used in my previous post on repository context variables. Let's look at the input file.



I have formatted the input file as below to make it easy to read.


I will create two context variables; context_dept_id and context_salary.

Let's get started: create a new Talend Job and follow the steps below:

1. Create metadata for our Input delimited employee file. You can take reference of the steps to create metadata for delimited from here. Your metadata should look like following:

2. Drag the metadata created in step 1 to the Job designer pane and select the tFileInputDelimited component from the popup.

3. Drag the tFilterRow and tLogRow components from the Palette pane. tFilterRow will be used to filter the records based on the values of the context variables, and tLogRow will be used to display the output in the console.

4. Right-click tFilterRow, select Row > Main, and connect it to tLogRow.

As of now, we have not defined any context variables. Let's run the Job and observe the output.

Output without context variable: 


As you can see, all of the input file records are displayed in the output, because we have not yet defined and used any context variables in the tFilterRow component.

Now let's define the context variables. Open the Contexts pane.


5. Add two context variables as we did in the last post; context_dept_id and context_salary.

Also, do not provide any default values to the context variables.

6. Now that we have created these context variables, it's time to use them to filter the input records. Open the component properties of tFilterRow and add two conditions as shown below. In the value section, provide the names of the context variables.


Remember: We have not provided values to the context variables. As of now we have only defined these variables.

7. Try to execute the Job and observe what happens.




It will throw exception “Exception in component tFilterRow_1 : java.lang.NullPointerException”.

This is because we have not provided any values for the context variables. In the earlier posts, we either set the values of the context variables in the Talend Job itself or defined them as Repository context variables. Today we are going to provide the values of the context variables in a file, which lets us supply the values dynamically.
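Why a missing value ends in a NullPointerException can be shown with a small Java sketch. It assumes the context variable is an Integer that stays null when no value is assigned; the method matches is a hypothetical, simplified stand-in for an "equals" filter condition and is not the code Talend generates for the filter component.

```java
// Illustration only: a simplified comparison, not Talend's generated filter code.
final class NullContextSketch {
    // Compares a row's department with the context value.
    // Integer.compareTo throws NullPointerException when its argument is null.
    static boolean matches(Integer rowDeptId, Integer contextDeptId) {
        return rowDeptId.compareTo(contextDeptId) == 0;
    }

    public static void main(String[] args) {
        Integer contextDeptId = null; // defined in the Contexts pane, but no value assigned
        try {
            matches(12, contextDeptId);
        } catch (NullPointerException e) {
            System.out.println("Filter failed: context_dept_id has no value");
        }
        // Once the variable has a value, the comparison works as expected.
        System.out.println(matches(12, 12));
    }
}
```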

The context variable names and their values should be present in the file as key-value pairs. For our example, I have provided the values of the context variables as shown below. Note: the name of each context variable in the file must match the name of the context variable defined in the Job. Below is the screenshot of our context variable file.
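As a rough model of what loading such a file involves, the Java sketch below reads key-value lines into a map. The ";" separator and the class ContextFileSketch are assumptions made for this illustration (the actual separator is whatever the file's metadata defines), and the sample values 12 and 10000 are taken from the result described later in this post. It is not the implementation of tContextLoad.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a plain-Java model of loading key-value context lines.
final class ContextFileSketch {
    // Each line is expected to look like "name;value" (separator is an assumption).
    static Map<String, String> load(List<String> lines) {
        Map<String, String> context = new LinkedHashMap<>();
        for (String line : lines) {
            int sep = line.indexOf(';');
            if (sep <= 0) {
                continue; // skip blank or malformed lines
            }
            context.put(line.substring(0, sep).trim(), line.substring(sep + 1).trim());
        }
        return context;
    }

    public static void main(String[] args) {
        List<String> fileLines = Arrays.asList(
                "context_dept_id;12",
                "context_salary;10000");
        load(fileLines).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The keys must match the context variable names defined in the Job, which is why the sketch uses context_dept_id and context_salary.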



8. In the current Job, drag a tFileInputDelimited component from the Palette pane. This component will be used to read the context variable file. Create the metadata of this component as shown below and provide the path of the context variable file in File Path.


9. Now drag tContextLoad component from Palette pane and place it onto job designer. This component will set the value of the context variables defined in the job from the file.

10. Right-click this tFileInputDelimited component, select Row > Main, and connect it to the tContextLoad component. This will create a new subjob.

11. Right-click the tFileInputDelimited component that reads the context variable file, select Trigger > OnSubjobOk, and connect it to the tFileInputDelimited component that reads our actual input file. This makes the Job load the context variable values from the file first and then run the actual flow.

12. Now try to execute the job and check the output.



You can see that only those records are present in the output where department is 12 and salary is greater than 10000. You can compare it with input file (screenshot below).

13. Now change the values in the context variables in the file as below.



You can see that only those records are present in the output where department is 10 and salary is greater than 10000. You can compare it with input file. You can find the snapshot of US input file below:


This ends the three-article tutorial on how to define and use context variables. Links to the other two articles (Part 1 and Part 2) are given below:


Understand Context variables with Examples - Part 1 (Talend Open Studio)
Understand Context variables with Examples - Part 2 (Talend Open Studio)

Understand Context variables with Examples - Part 2 (Talend Open Studio)

In the previous post here, I showed you how to define and use context variables in a Talend Job. Today, I am going to define context variables in the Repository as metadata. You can also call these Repository context variables. We define context variables in the Repository when they need to be available to multiple Jobs in the project; their values can then be accessed from any of those Jobs.

Today I am going to create a Talend Job which will filter the records based on the values of the Repository context variables. Let's look at the input file.



I have formatted the input file as below to make it easy to read.


I will create two context variables; context_dept_id and context_salary

Let's get started: create a new Talend Job and follow the steps below:

1. Create metadata for our Input delimited employee file. You can take reference of the steps to create metadata for delimited from here. Your metadata should look like following: 


2. Drag the metadata created in step 1 to the Job designer pane and select the tFileInputDelimited component from the popup.

3. Drag the tFilterRow and tLogRow components from the Palette pane. tFilterRow will be used to filter the records based on the values of the context variables, and tLogRow will be used to display the output in the console.

4. Right-click tFilterRow, select Row > Main, and connect it to tLogRow.

As of now, we have not defined any context variables. Let's run the Job and observe the output.


Output without context variable: 




As you can see, all of the input file records are displayed in the output, because we have not yet defined and used any context variables in the tFilterRow component.

Now let's define the context variables.

5. Right click Contexts tab in Repository pane and click on Create context group.


6. In the popup window, enter a name for the context group. Click Next.

7. In step 2 of the wizard, click the (+) button to add two context variables, context_dept_id and context_salary.


8. Navigate to the “Value as tree” tab and select the prompt checkbox for both variables. This will allow us to set the values of the context variables from a prompt at runtime.


9. We have now successfully defined our context variables in the Repository. However, these variables are not yet present in our Job. You can check this by opening the Contexts pane of the Job.

10. To make these context variables available to our Job, drag the Repository context group onto the Job. Now open the Contexts pane again, and you can see the variables available to our Job.



11. Since these context variables are now available to our Job, it's time to use them to filter the input records. Open the component properties of tFilterRow and add two conditions as shown below. In the value section, provide the names of the context variables.

This completes our Job. It's time to run it and check the output.


12. Run the Job. You will be presented with a prompt window where you can provide the values for both context variables.


Enter “10” for context_dept_id and “50,000” for context_salary.

Click OK.

Output



You can see that only those records are present in the output where department is 10 and salary is greater than 50000. You can compare it with input file. You can find the snapshot of US input file below:

13. Run this job again and Enter “12” for context_dept_id and “10,000” for context_salary.



You can see that only those records are present in the output where department is 12 and salary is greater than 10000. You can compare it with input file (screenshot below).

 

This is how we define and use context variables in the Repository as metadata. We can use these context variables in multiple Talend Jobs; you only have to drag the Repository context variables into a Job to make them available. In the next article, I will show you how to define context variables in flat files which can be loaded into the Job at runtime using the tContextLoad component.
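The two filter conditions used in this Job (department equals context_dept_id, salary greater than context_salary) can be sketched in plain Java as follows. The class FilterSketch and the sample rows are hypothetical, since the real input file is only shown in the screenshots; each row here is simply a pair of department id and salary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical rows, each holding {deptId, salary}.
final class FilterSketch {
    // Keeps rows whose department equals contextDeptId and whose salary exceeds contextSalary.
    static List<int[]> filter(List<int[]> rows, int contextDeptId, int contextSalary) {
        List<int[]> kept = new ArrayList<>();
        for (int[] row : rows) {
            if (row[0] == contextDeptId && row[1] > contextSalary) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
                new int[] {10, 60000},
                new int[] {10, 40000},
                new int[] {12, 70000});
        // Same values as the first run: department 10, salary above 50000.
        for (int[] row : filter(rows, 10, 50000)) {
            System.out.println("dept=" + row[0] + ", salary=" + row[1]);
        }
    }
}
```

Changing the two arguments to 12 and 10000 corresponds to the second run of the Job described above.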