Tuesday, June 11, 2013

Presenting an Operations Schedule with Talend and Jaspersoft: Part 1

Presenting an Operations Schedule with Talend and Jaspersoft: Part 1

[The first in a two part series on loading operations schedule data using Talend Open Studio and writing a report using Jaspersoft's iReport Designer]

A reader recently asked about manipulating schedule data for more efficient presentation.  I've worked with schedulesas part of the Open Travel Alliance which is an XML standard for sharing travel industry information.  In both cases, the data structures involved in the intake of the data weren't suited for the display.  Talend Open Studio and Jaspersoft can resolve the discrepancy.

Operations Schedule

The hours in which a business operates is important to customers.  In most cases, a business with a storefront will not operate 24x7, with weekend hours differing from weekday hours.  There may be different times within the week (closed Mondays).




To input information like this is a structured way -- that is, not echoing back text strings -- means that the presentation can be standardized across an applications, validation is easier, and back-end computations and analytics can be used.  A UI that might capture this information would look like this:
An Operations Schedule UI Helps Maintain Structured Data
This spreadsheet contains a dataset that could have been created by a UI such as this.  Each checkbox provides a boolean value and open and close time are strings entered from a combobox.  This data can be stored or transmitted.

Operations Schedule Data
The hours of a restaurant associated with a hotel are presented with each row representing a single day.

Target Presentation

Even though the operations schedule data is structured, we'd still like to display the information in a concise format for the user.  Instead of
  • Monday 09:00 - 17:00
  • Tuesday 09:00 - 17:00
  • Wednesday 09:00 - 17:00
  • Thursday 09:00 - 17:00
  • Friday 09:00 -17:00
  • Saturday 10:00 - 17:00
  • Sunday 10:00-17:00

We'd like
  • Monday - Friday 09:00 - 17:00
  • Saturday and Sunday 10:00-17:00
Using an RDBMS

Consolidating data in this fashion is easy if the data is in a relational database.  The data can be grouped by key (hotel, restaurant), open time, and close time.  This reduces the number of lines in the previous paragraph from 7 down to 2.

An additional manipulation is required in order to support the group by operation.  This is to convert the boolean values into a numeric which makes the values available to the SQL aggregate operation "MAX".  This is a workaround because there is no SQL aggregate operation for logic like "OR".

Schema Replacing Boolean Values with Numerics for MAX() Function


Talend Open Studio

If the input data is not provided directly by an application, say it's provided through an external interface, you should load your data into an RDBMS.  To transform this input data from a spreadsheet into something ready to be grouped is easy with Talend Open Studio.  A simple 3-component job will work.

Talend Job to Load Spreadsheet into Access
The tMap maps the fields of the spreadsheet into an Access table "HoursOfOperation".  The tMap reconciles the two different sets of field names and applies a conversion of boolean values to a numerics using the ternary (?:) operator.

tMap Converting Boolean Values to Numerics
When working with Access, I use an auto-increment column.  This requires that I prevent Talend from writing this column in one of two ways: removing the column from the schema or using the Field Options panel.  I vastly prefer using the Field Options panel because it allows me to continue to base my output component on a repository schema rather than a hacked-up version.

Notice that Updatable and Insertable are unchecked.

Preventing an Insert on an Auto Increment Column

The result of running the Talend Open Studio job is the following.  All of the Talend Open Studio and Access types are String or Text, but Access's display seems to coerce the openTime and closeTime columns into a date.

Data Loaded and Transformed into an RDBMS
If you are working directly with the application data and not receiving it as input, you won't need a Talend Open Studio job.  It may be worthwhile to provide a SQL view of the data as shown in the prior screenshot in this case.  Part 2 will show a Jasper Report producing a readable display of this inforrmation and having the boolean/numeric conversion readily available will make the report's query easier to write.

No comments: