Handle Huge Data in ZK Pivottable

From Documentation
Revision as of 08:17, 19 December 2012 by Benbai (talk | contribs) (→‎Demo)
DocumentationSmall Talks2012DecemberHandle Huge Data in ZK Pivottable
Handle Huge Data in ZK Pivottable

Author
Ben Bai, Engineer, Potix Corporation
Date
December 14, 2012
Version
ZK 5 & later, Java 1.5, ZK Pivottable 2.0.0


Opening

ZK Pivottable is an Ajax data summarization component which takes raw data from a flat table and organize them by desired categories arranged on columns and rows. This article presents a way to handle huge data in ZK Pivottable by doing data paging manually. With this approach it can reduce memory consumption and speed up the processing time when the data is huge.

Scenario

In this small talk, we assume there are 100 sales persons selling 50 kinds of products to 300 customers who belongs to 15 different companies, and what we are interested in is the sum of the amount/price of sold products.

Table Schema

Pvt huge 20121105 table schema.png

Test Environment

For the ease of running, this sample project uses embedded hsqldb to store the test data.

Load All Data

The test data contains 600,000 records in the data base, if we load all data and put them into the pivot model at once, the memory consumption will reach over 1G and take about 8-16 seconds to render pivottable.

Loading data partially

So, in order to reduce memory consumption and processing time, we divide the data based on the value of the first level row node and control the paging by ourselves as below:

  • Perform a 'DISTINCT' query of the column in table with respect to the first level row node.
  • Get a sub list based on current page count and the page size.
  • Perform a 'SELECT' query to get data from DB with the condition 'the values of first level row node is IN the sub list'

In this case, the memory consumption will be much lower and the processing time will be slightly shorter.

Project Structure

Pvt huge 20121206 project structure.png

  • DataAccessObject.java: This class contains various methods to execute SQL query to load data from DB as needed.
  • PartialRenderComposer.java: This is the major part of this sample, control the custom paging and prepare raw data for pivottable.
  • PartialRenderComposerExt.java: All components that will display information in zul page are in this class, it is used to make PartialRenderComposer more clear. The fields' name (table column name) are also declared here.
  • DBHandleListener.java: Start database and prepare data as needed, it is a listener defined in web.xml so will be called automatically when container startup.
  • HSQLDataBaseConnector.java: Get a DB Connection.
  • WebContent/WEB-INF/dbfiles: All database files are in this folder, remove this folder to re-generate test data as needed.
  • index.zul: The test page.

Start DB, Create Table and Insert Data

As mentioned above, this sample uses the embedded hsqldb, the DB will be started in DBHandleListener#contextInitialized method:

public void contextInitialized(ServletContextEvent sce) {
	if (hsqlServer == null) {
		try {
			startDataBase();
			createDataTable();
			// un-comment startDBManager(); below to open hsqldb swing DB Manager as need
			// startDBManager();
		} catch (Exception e) {
			throw new UiException(e);
		}
	}
}
  • Line 4: Start the database, the files will be stored under WebContent/WEB-INF/dbfiles.
  • Line 5: Create a table and insert test data if no data exists.

If you want to change the test data, please remove the old data under WebContent/WEB-INF/dbfiles then modify the two methods createDataTable and insertDatas as needed and restart this project.

Load data

Following is the usage of the PartialRenderComposer#getData function

/**
 * Get the raw data from DB for pivot model,
 * based on the fields and data range
 * 
 * @return
 * @throws Exception
 */
private List<List<Object>> getData() throws Exception {
	List<List<Object>> rawData = new ArrayList<List<Object>>();

	String firstLevelRow = _extCtrl.getRowFieldList().get(0);
	if (_pageNodesLimit > 0) { // use custom paging
		// data range start index
		int first = _currentPage*_pageNodesLimit;
		String[] rows = getDistinctRowsInPage(firstLevelRow, first,
					first + _pageNodesLimit/* data range end index */);
		rawData = getDataInRange(firstLevelRow, rows);
	} else { // let pivot table handle paging itself
		rawData = getAllData();
	}
	return rawData;
}

When using custom paging to load data partially:

  • Line 11: Get the field name (the column name in database) of first-level row field.
  • Line 15: Get the possible values of first-level row field within the specified page.
  • Line 17: Get all data values received at Line 15.

When loading all data and let pivottable handle paging by itself:

  • Line 19: Simply get all data from database.

Demo

As you can see, in the first part of the demo we loaded all 600,000 records, more than 1GB to the model of pivottable and it took about 10 seconds to render the pivottable. We then applied the custom paging according to the method we introduced in this article, as a result, both the memory consumption and rendering speed are greatly improved.

Download

Sample war:

https://github.com/benbai123/ZK_Practice/blob/master/Components/projects/Addon_Practice/PivottableTest/PivottableWithHugeData.war


Comments



Copyright © Potix Corporation. This article is licensed under GNU Free Documentation License.