Our Blog

Pentaho Reporting Export Options

Jagdeesh SS

This is part of a series of blog posts offering quick tips and tricks for getting more out of Pentaho. We start with this piece on improving your Pentaho Reporting export options. Have a look at the screenshot below.

Pentaho Reporting

This is a snapshot of a PRPTI report exported to Excel. As you can see, the date and timestamp take up an entire row before the column headings begin. Some consumers of the report may not want this, both to save screen space and to keep attention focused on the data. Now have a look at the screenshot below.

Pentaho Reporting

As you can see, the page count also consumes an entire row, which is again an inefficient use of space. What if you do not want the date, timestamp, and page count displayed?

Pentaho Reporting


Clustering Pentaho BA Server 5.0.x version

Gangadhara Boranna

Clustering means that two or more instances of Pentaho share a common repository. Pentaho 5.0.X now uses Apache Jackrabbit, a Java Content Repository (JCR) implementation, for the BA Repository. Pentaho stores content about the reports that you create, the examples we provide, report scheduling data, and audit data in the BA Repository. The BA Repository resides on the database that you installed. It consists of three repositories: Jackrabbit, Quartz, and Hibernate.
– Jackrabbit contains the solution repository, examples, security data, and content data from reports that you use Pentaho software to create.

– Quartz holds data that is related to scheduling reports and jobs.

– Hibernate holds data that is related to audit logging.
You can host the BA Repository on PostgreSQL, MySQL, or Oracle (by default, Pentaho software is configured to use PostgreSQL). Because every node in the cluster must share the same repository, follow the instructions below to initialize and configure your solution repository:

Initializing: http://infocenter.pentaho.com/help/topic/install_pdi/task_prepare_rdbms_repository.html

Configuring: http://infocenter.pentaho.com/help/topic/install_pdi/task_configure_rdbms_repository.html
You will need to add a section of code to the repository.xml file found in the \biserver-ee\pentaho-solutions\system\jackrabbit directory so that the nodes share a journal. Note that each node must have a unique ID. This is explained in detail below.

Configuring Each Node to Have a Shared Journal:
Before configuring the shared journal, delete the files in the following directories:
– Delete the contents of the tomcat\work and tomcat\temp directories.

– Navigate to biserver-ee\pentaho-solutions\system\jackrabbit\repository directory and remove all files and folders from the final repository folder.

– Navigate to biserver-ee\pentaho-solutions\system\jackrabbit\repository directory and remove all files and folders from the workspaces folder.
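The cleanup steps above can be sketched as a small Python helper. This is only an illustration, not part of Pentaho: the function names `clear_directory` and `clean_node` are hypothetical, and the Tomcat and Jackrabbit locations are passed in as parameters because install paths differ between environments. It empties each directory but leaves the directory itself in place.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict


def clear_directory(directory: Path) -> int:
    """Delete everything inside `directory` but keep the directory itself.

    Returns the number of top-level entries removed (0 if it does not exist).
    """
    removed = 0
    if not directory.is_dir():
        return removed
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a sub-folder and all its contents
        else:
            entry.unlink()         # remove a plain file or a symlink
        removed += 1
    return removed


def clean_node(tomcat_home: Path, jackrabbit_repo: Path) -> Dict[str, int]:
    """Clear the four locations listed in the steps above for one node.

    `jackrabbit_repo` is assumed to point at
    biserver-ee/pentaho-solutions/system/jackrabbit/repository.
    """
    targets = [
        tomcat_home / "work",
        tomcat_home / "temp",
        jackrabbit_repo / "repository",
        jackrabbit_repo / "workspaces",
    ]
    return {str(target): clear_directory(target) for target in targets}
```

Because the helper deletes data, it is worth trying it against a copy of the directories first.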
Now, to configure the nodes for a shared journal, edit the repository.xml file found in the \biserver-ee\pentaho-solutions\system\jackrabbit directory. Add the section below at the end of the file, inside the root Repository element (that is, just before the closing </Repository> tag).


<!-- Run with a cluster journal -->
<Cluster id="Unique_ID">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log"/>
    <param name="url" value="jdbc:postgresql://HOSTNAME:PORT/jackrabbit"/>
    <param name="driver" value="org.postgresql.Driver"/>
    <param name="user" value="jcr_user"/>
    <param name="password" value="password"/>
    <param name="databaseType" value="postgresql"/>
    <param name="janitorEnabled" value="true"/>
    <param name="janitorSleep" value="86400"/>
    <param name="janitorFirstRunHourOfDay" value="3"/>
  </Journal>
</Cluster>


Replace the JDBC connection settings (URL, user name, password, database type, and so on) with the values for your own database. Jackrabbit journaling is now configured. Quartz must also be configured for clustering so that duplicate schedules are not created on each node.
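Since every node needs its own Cluster id, a quick check across the nodes' repository.xml files can catch copy-and-paste mistakes. The following is a minimal sketch using only the Python standard library; the function names and the node labels are hypothetical, and it takes the XML text as strings so that how the files are collected from each node is left open.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Dict, List, Optional


def cluster_id(repository_xml: str) -> Optional[str]:
    """Return the id attribute of the <Cluster> element, or None if absent."""
    root = ET.fromstring(repository_xml)
    cluster = root if root.tag == "Cluster" else root.find(".//Cluster")
    if cluster is None:
        return None
    # Strip stray whitespace such as id="Unique_ID " and treat empty as unset.
    return cluster.get("id", "").strip() or None


def find_id_problems(configs: Dict[str, str]) -> List[str]:
    """Report nodes with no Cluster id and ids shared by more than one node.

    `configs` maps a node label (any name you choose) to its repository.xml text.
    """
    ids = {node: cluster_id(xml) for node, xml in configs.items()}
    problems = [f"{node}: no Cluster id set"
                for node, cid in ids.items() if cid is None]
    counts = Counter(cid for cid in ids.values() if cid is not None)
    for cid, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"id {cid!r} is used by {count} nodes")
    return problems


if __name__ == "__main__":
    # Tiny stand-in documents; real repository.xml files contain much more.
    template = '<Repository><Cluster id="{0}"><Journal/></Cluster></Repository>'
    sample = {"nodeA": template.format("node1"),
              "nodeB": template.format("node1")}
    for problem in find_id_problems(sample):
        print(problem)
```

An empty list from `find_id_problems` means every node declares a distinct id.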
Configuring Quartz for Cluster :
Navigate to \biserver-ee\pentaho-solutions\system\quartz and open the quartz.properties file in a text editor. Make the following changes to configure Quartz for clustering:

1. org.quartz.scheduler.instanceId = AUTO

Set this to AUTO so that each node in the cluster gets its own generated instance ID. The default value is 1.

2. org.quartz.jobStore.isClustered = true

The default value is false.

3. org.quartz.jobStore.clusterCheckinInterval = 20000

This property is not present by default, so add it to the quartz.properties file explicitly. The value is in milliseconds.
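The three settings above are easy to mistype (property names are case-sensitive), so a short check can help. The sketch below is an illustration only: `parse_properties` is a deliberately simplified reader that handles plain `key = value` lines and comments, not the full Java properties syntax, and `cluster_problems` is a hypothetical helper name.

```python
from typing import Dict, List

# Values taken from the three steps above.
REQUIRED = {
    "org.quartz.scheduler.instanceId": "AUTO",
    "org.quartz.jobStore.isClustered": "true",
}
INTERVAL_KEY = "org.quartz.jobStore.clusterCheckinInterval"


def parse_properties(text: str) -> Dict[str, str]:
    """Read simple `key = value` lines, skipping blanks and comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def cluster_problems(text: str) -> List[str]:
    """List the cluster settings that are missing or have another value."""
    props = parse_properties(text)
    problems = []
    for key, expected in REQUIRED.items():
        actual = props.get(key)
        if actual != expected:
            problems.append(f"{key} should be {expected}, found {actual}")
    interval = props.get(INTERVAL_KEY)
    if interval is None or not interval.isdigit():
        problems.append(f"{INTERVAL_KEY} is missing or not a number")
    return problems


if __name__ == "__main__":
    defaults = """
    org.quartz.scheduler.instanceId = 1
    org.quartz.jobStore.isClustered = false
    """
    for problem in cluster_problems(defaults):
        print(problem)
```

Run against the default values described above, the check reports all three settings. Once the file is edited as described, it returns an empty list.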


Crosstabs De-mystified in Pentaho Reporting

Gangadhara Boranna

Crosstabs have been on the horizon for several years now. They lived a happy, undisturbed life along with the unicorns and gnomes guarding the pot of gold at the end of the rainbow. Crosstabs are a relatively easy way to visualize tabular data along two or more axes. They are an indispensable part of a modern reporting engine such as Pentaho Reporting.

Crosstabs are still an experimental feature and, as such, are not yet on par with their counterparts in BO reports and the like. They will not make it into 5.0.X as a stable feature, but even in their experimental state they are already powerful enough to create some really useful reports.

How to create a Crosstab Report?

Step 1: In Report Designer, go to Edit -> Preferences and select the “Enable (unsupported) experimental features” option.

Pentaho Reporting

Step 2: After you click ‘Apply’, the “crossTab” option appears among the elements on the left-hand side.

Pentaho Reporting Designer

Step 3: Drag and drop the crosstab element onto the report canvas. As with a sub-report, you can choose either the “Inline” or the “Banded” option.

Pentaho Reporting Services

Step 4: Now create the data source required for the crosstab report.

Pentaho Reporting

Step 5: Once that is done, assign the required fields to the appropriate places.

Pentaho Reporting Process

Step 6: The fields are populated on the report canvas. You can now apply formatting to the report fields to improve the look of the report.

Pentaho Reporting Designer

Sample report:

Pentaho Reporting Solution
Pentaho Reporting
Benefits of Cross-Tabs in Pentaho Reporting

Cross-tabs deliver data in a familiar spreadsheet format. They also summarize both vertically and horizontally, have a grid format, and can change size depending on the data.

Several of the most compelling reasons for using cross-tabs are:

  • Making better use of space
  • Leveraging experience with the spreadsheet format
  • Horizontal expansion
  • Custom formatting

Because cross-tabs are grouped and summarized both vertically and horizontally, they are incredibly efficient at saving space as compared to a typical grouping report. They are very good at showing key information if the information required has at least two levels of grouping.
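To make the idea of summarizing both vertically and horizontally concrete, here is a minimal sketch in plain Python. It is not how the Pentaho Reporting engine builds crosstabs. It only shows the resulting shape, and the field names (`region`, `year`, `amount`) and figures are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def crosstab(rows: List[Dict], row_key: str, col_key: str,
             value_key: str) -> List[List]:
    """Pivot flat records into a grid with row and column totals."""
    cells = defaultdict(int)          # (row label, column label) -> sum
    row_labels: List = []
    col_labels: List = []
    for record in rows:
        r, c = record[row_key], record[col_key]
        if r not in row_labels:
            row_labels.append(r)      # keep first-seen order
        if c not in col_labels:
            col_labels.append(c)
        cells[(r, c)] += record[value_key]

    table = [[row_key] + col_labels + ["Total"]]
    for r in row_labels:
        values = [cells.get((r, c), 0) for c in col_labels]
        table.append([r] + values + [sum(values)])      # horizontal total
    col_totals = [sum(cells.get((r, c), 0) for r in row_labels)
                  for c in col_labels]                  # vertical totals
    table.append(["Total"] + col_totals + [sum(col_totals)])
    return table


if __name__ == "__main__":
    sales = [
        {"region": "East", "year": 2012, "amount": 10},
        {"region": "East", "year": 2013, "amount": 20},
        {"region": "West", "year": 2012, "amount": 5},
    ]
    for line in crosstab(sales, "region", "year", "amount"):
        print(line)
```

The grid grows one column for every new `year` value in the data, which is the horizontal expansion mentioned in the list above.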


Pentaho Launches Community Edition 5.0

Suresh Narayanappa

Pentaho Corporation recently announced the immediate availability of its open source Pentaho Community Edition 5.0 (Pentaho CE). It is the latest version of the company's open source business analytics and data integration platform. The launch also included the Pentaho Marketplace, where members of the community can download and explore the available plugins developed by the Pentaho community. The Marketplace extends the capabilities of the open source platform and lets community members work together, share feedback, and create or submit new plugins to broaden Pentaho functionality.

According to the company, the new edition offers an economical entry point for people or companies having their first brush with business analytics, when they want to visualize and act upon data. It is also an excellent toolset for experienced developers, users, and consultants who prefer an open code base for pushing the boundaries of Pentaho and of business analytics.
New features in Pentaho Community Edition 5.0

Pentaho, in collaboration with its community of developers, has built a powerful suite of tools that gives data analysts and developers an open source option for meeting their goals. The latest edition includes:

  • Business Analytics Platform: Pentaho's modern, interactive, and simplified approach helps business users discover, access, and blend data of all sizes and types. Users can take advantage of a wide range of advanced analytics, from simple reports to predictive modeling, and can analyze and visualize data across multiple dimensions with minimal dependence on IT.


  • Data Integration: Pentaho Data Integration, better known as Kettle, delivers powerful extraction, transformation, and loading (ETL) capabilities. It is a stand-alone application used to design jobs visually and to make reporting and analysis easier.


  • Report Designer: Pentaho Report Designer is a graphical design tool that can generate reports from data streamed through the Pentaho Data Integration engine, with no need for intermediate staging tables. Reports can be output as PDF, HTML, XML, CSV, Excel, and rich-text files.


  • Auxiliary Tools: Users can download various auxiliary tools. Pentaho Aggregation Designer provides a simple interface for creating and deploying aggregate tables that improve the performance of Pentaho OLAP cubes. Mondrian Schema Workbench is the open source designer for visually creating and testing Mondrian OLAP cube schemas. The Pentaho Metadata Editor is a simplified tool for building Pentaho Metadata domains and relational data models that can be used to create reports.
