
Summary: Gridgain is middleware that combines compute grid and data grid capabilities into a framework for processing large data sets. It implements MapReduce, a well-known parallel design pattern popularized by Google. Its zero-deployment feature makes it suitable for building high-performance cloud applications. In honor of the presidential election, a fictitious vote-counting problem is used to showcase the distributed computing features of Gridgain. This article is the first in a series that will explore various characteristics of Gridgain.
Prerequisites: The complete sample for this article is available from our GitHub repository. All samples are Maven-based Java projects. You will also need to have Gridgain installed.
The following steps are required to use Maven to execute the sample Gridgain project:
1. Set the GRIDGAIN_HOME environment variable to the path of your Gridgain installation.
On Windows: set GRIDGAIN_HOME=C:\netmilleRoot\tools\gridgain-4.3.1e-computegrid
2. Manually install gridgain-4.3.1e.jar into your local Maven repository. The jar is located in the root folder of the Gridgain installation.
mvn install:install-file -Dfile=gridgain-4.3.1e.jar -DgroupId=org.gridgain -DartifactId=gridgain -Dversion=4.3.1e -Dpackaging=jar
Let's Get Started: The GridTask and GridJob interfaces are the two major abstractions in Gridgain. A GridTask represents a major unit of work, while a GridJob represents a subtask. A GridTask is responsible for dividing the unit of work into GridJobs, mapping the GridJobs onto available compute nodes, and aggregating the results from the GridJobs.
The following steps further describe the MapReduce algorithm in Gridgain:
- A task (GridTask) is split into subtasks called GridJobs.
- Next, the GridJobs are mapped and shipped to various nodes (compute resources) for parallel processing.
- Upon completion, the results of the GridJobs are returned.
- All results from the GridJobs are aggregated by the GridTask into a final result.
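The four steps above can be sketched in plain Java, without any Gridgain classes. This is only an illustration of the split, map, and reduce flow: a local thread pool stands in for the grid nodes, and the class and method names (SplitMapReduceSketch, split, sum) are hypothetical and not part of the Gridgain API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java illustration of split -> map -> reduce; not the Gridgain API.
public class SplitMapReduceSketch {

    // Step 1: split the task input into chunks of at most chunkSize elements.
    static List<List<Integer>> split(List<Integer> input, int chunkSize) {
        List<List<Integer>> chunks = new ArrayList<>();
        for (int from = 0; from < input.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, input.size());
            chunks.add(new ArrayList<>(input.subList(from, to)));
        }
        return chunks;
    }

    // Steps 2-4: hand each chunk to a worker, collect partial sums, aggregate.
    static int sum(List<Integer> input, int chunkSize, int workers) {
        // Assumption: a thread pool stands in for the remote compute nodes.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> partials = new ArrayList<>();
            for (final List<Integer> chunk : split(input, chunkSize)) {
                partials.add(pool.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        int partial = 0;
                        for (int value : chunk) {
                            partial += value;
                        }
                        return partial; // result of one "job"
                    }
                }));
            }
            int total = 0;
            for (Future<Integer> partial : partials) {
                total += partial.get(); // the "reduce" step
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("A job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            numbers.add(i);
        }
        System.out.println("total=" + sum(numbers, 10, 4)); // prints total=5050
    }
}
```

In Gridgain, the mapping of jobs to nodes and the collection of results are handled by the middleware, so a task only needs to supply the split and reduce logic.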
Counting Votes with Gridgain: Given a random population of 10,000 votes (Vote), we use Gridgain to determine the winning party. Each Vote is cast for either Republican (Party.REPUBLICAN) or Democrat (Party.DEMOCRAT).
To implement this use case, we split the population of 10,000 votes into a list of sub lists. We then assign each sub list to a VoteCounterGridJob, which counts its votes on an available compute resource. The Gridgain middleware uses its load balancing features to ship VoteCounterGridJobs to available nodes for processing. Once each VoteCounterGridJob has counted its votes, the results are returned to the VoteCounterGridTask, which aggregates them into a final result.
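The supporting types are not listed in this article, so here is a minimal sketch of what they might look like. It is an assumption based only on how the listings use them (new Vote(party), getResults(party), setResults(party, count)); the getParty() accessor, the countLocally() helper, and the wrapper class VoteModelSketch are hypothetical names introduced for illustration. The counting method shows the kind of work a single job would perform on its sub list.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the sample's Party, Vote, and VoteResult types.
public class VoteModelSketch {

    enum Party { DEMOCRAT, REPUBLICAN }

    // Serializable because votes would travel to remote nodes.
    static final class Vote implements Serializable {
        private final Party party;
        Vote(Party party) { this.party = party; }
        Party getParty() { return party; } // assumed accessor
    }

    static final class VoteResult implements Serializable {
        private final Map<Party, Integer> counts = new EnumMap<>(Party.class);
        void setResults(Party party, int count) { counts.put(party, count); }
        int getResults(Party party) {
            Integer count = counts.get(party);
            return count == null ? 0 : count; // parties with no votes count as 0
        }
    }

    // The work one job would do: tally the votes in its own sub list.
    static VoteResult countLocally(List<Vote> votes) {
        VoteResult result = new VoteResult();
        for (Vote vote : votes) {
            Party party = vote.getParty();
            result.setResults(party, result.getResults(party) + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Vote> subList = new ArrayList<>();
        subList.add(new Vote(Party.DEMOCRAT));
        subList.add(new Vote(Party.REPUBLICAN));
        subList.add(new Vote(Party.DEMOCRAT));
        VoteResult local = countLocally(subList);
        System.out.println("Local vote results:");
        for (Party party : Party.values()) {
            System.out.println(party + "=" + local.getResults(party) + " votes");
        }
    }
}
```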
NOTE: Gridgain relies on the Spring Framework's IoC architecture to enable customization of nearly every aspect of its functional behavior. We will see examples of this feature in future articles.
Please refer to Listing 1: VoteCounterGridTask for the following explanation:
In the class declaration, our VoteCounterGridTask extends GridTaskSplitAdapter. With this type of GridTask, we only split the work into jobs; the Gridgain infrastructure maps the VoteCounterGridJobs onto available compute resources.
Gridgain first invokes our split() method. In this routine, we divide the population of Vote objects into a list of sub lists of 50 votes each, and assign each sub list to a VoteCounterGridJob. Once this method returns, Gridgain ships our VoteCounterGridJobs to available nodes.
After the jobs complete, Gridgain invokes our reduce() method. In this routine, we aggregate the results returned from our VoteCounterGridJobs into a final result (VoteResult).
Listing 1: VoteCounterGridTask.java
package techbysample.gridgain4.sample1;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import org.gridgain.grid.GridException;
import org.gridgain.grid.GridJob;
import org.gridgain.grid.GridJobResult;
import org.gridgain.grid.GridTaskSplitAdapter;
/**
*
* @author TechBySample.com
*
*/
public class VoteCounterGridTask extends GridTaskSplitAdapter<List<Vote>, VoteResult> {
    // Splits the votes into sub lists of 50 and wraps each sub list in a job.
    protected Collection<? extends GridJob> split(int gridSize, List<Vote> votes) throws GridException {
        List<List<Vote>> dividedVotes = divide(votes, 50);
        List<GridJob> jobs = new ArrayList<GridJob>(dividedVotes.size());
        for (List<Vote> _votes : dividedVotes) {
            jobs.add(new VoteCounterGridJob(_votes));
        }
        return jobs;
    }
    // Aggregates the per-job counts into a single final result.
    public VoteResult reduce(List<GridJobResult> results) throws GridException {
        int democrat = 0;
        int republican = 0;
        for (GridJobResult result : results) {
            VoteResult voteResult = result.getData();
            democrat = democrat + voteResult.getResults(Party.DEMOCRAT);
            republican = republican + voteResult.getResults(Party.REPUBLICAN);
        }
        VoteResult _voteResult = new VoteResult();
        _voteResult.setResults(Party.DEMOCRAT, democrat);
        _voteResult.setResults(Party.REPUBLICAN, republican);
        return _voteResult;
    }
    // Divides a list into consecutive sub lists of at most 'size' elements.
    public static <T> List<List<T>> divide(List<T> list, int size)
            throws NullPointerException, IllegalArgumentException {
        if (list == null) {
            throw new NullPointerException("The list parameter is null.");
        }
        if (size <= 0) {
            throw new IllegalArgumentException(
                    "The list size parameter must be more than 0.");
        }
        int num = list.size() / size;
        int mod = list.size() % size;
        List<List<T>> ret = new ArrayList<List<T>>(mod > 0 ? num + 1 : num);
        for (int i = 0; i < num; i++) {
            ret.add(list.subList(i * size, (i + 1) * size));
        }
        // The last sub list holds the remainder, if any.
        if (mod > 0) {
            ret.add(list.subList(num * size, list.size()));
        }
        return ret;
    }
}
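To make the arithmetic of divide() concrete, here is a small standalone sketch. It restates the same partitioning with a single loop, which should behave the same way as the num/mod version in the listing; the class name DivideSketch and the sample inputs are chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Standalone restatement of the divide() helper, for experimentation.
public class DivideSketch {

    // Returns consecutive sub list views of at most 'size' elements each.
    static <T> List<List<T>> divide(List<T> list, int size) {
        if (list == null) {
            throw new NullPointerException("The list parameter is null.");
        }
        if (size <= 0) {
            throw new IllegalArgumentException("The list size parameter must be more than 0.");
        }
        List<List<T>> ret = new ArrayList<>();
        for (int from = 0; from < list.size(); from += size) {
            // The final sub list may be shorter when size does not divide evenly.
            ret.add(list.subList(from, Math.min(from + size, list.size())));
        }
        return ret;
    }

    public static void main(String[] args) {
        // Seven elements in groups of three: two full groups and a remainder.
        System.out.println(divide(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 3));
        // prints [[1, 2, 3], [4, 5, 6], [7]]

        // 10,000 placeholder votes in groups of 50 yields 200 sub lists.
        System.out.println(divide(Collections.nCopies(10000, "vote"), 50).size());
        // prints 200
    }
}
```

With 10,000 votes and a sub list size of 50, split() therefore creates 200 jobs, each of which counts 50 votes.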
Unit Testing: A JUnit test case (VoteCounterGridTest) is used to demonstrate Gridgain's distributed computing behavior.
Please refer to Listing 2: VoteCounterGridTest.java for the following explanation:
The initialize() method starts the Gridgain runtime.
The testCountVotes() method generates a random population of votes.
Gridgain provides a Grid object, which executes our VoteCounterGridTask with votesTobeCounted as the input parameter.
A GridTaskFuture object is used to retrieve the final result (VoteResult).
Listing 2: VoteCounterGridTest.java
package techbysample.gridgain4.sample1;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.gridgain.grid.Grid;
import org.gridgain.grid.GridTaskFuture;
import org.gridgain.grid.typedef.G;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
/**
*
* @author TechBySample.com
*
*/
public class VoteCounterGridTest {
    private Grid grid = null;
    @Before
    public void initialize() {
        try {
            // Start a local Gridgain node and obtain the default grid.
            G.start();
            grid = G.grid();
        }
        catch (Exception e) {
            System.out.println(e);
        }
    }
    @Test
    public void testCountVotes() {
        Party[] parties = {Party.DEMOCRAT, Party.REPUBLICAN};
        List<Vote> votesTobeCounted = new ArrayList<Vote>();
        Random randomGenerator = new Random();
        // Generate the random population of 10,000 votes.
        for (int i = 0; i < 10000; i++) {
            int randomInt = randomGenerator.nextInt(2);
            votesTobeCounted.add(new Vote(parties[randomInt]));
        }
        try {
            // Execute task.
            GridTaskFuture<VoteResult> future = grid.execute(VoteCounterGridTask.class, votesTobeCounted);
            // Wait for task completion.
            VoteResult result = future.get();
            System.out.println("Democrat vote count=" + result.getResults(Party.DEMOCRAT));
            System.out.println("Republican vote count=" + result.getResults(Party.REPUBLICAN));
            if (result.getResults(Party.DEMOCRAT) == result.getResults(Party.REPUBLICAN)) {
                System.out.println("We have a tie!");
            }
            else if (result.getResults(Party.DEMOCRAT) > result.getResults(Party.REPUBLICAN)) {
                System.out.println("We have a Democratic president!");
            }
            else {
                System.out.println("We have a Republican president!");
            }
        }
        catch (Exception e) {
            System.out.println(e);
        }
    }
    @After
    public void tearDown() {
        grid = null;
    }
}
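The population generation and the winner announcement in the test can also be expressed as two small functions that run without a grid. The sketch below is only an illustration: the class name WinnerSketch, the method names, and the fixed seed are assumptions that are not part of the sample project. A fixed seed makes a run repeatable, which the original test, with its unseeded Random, is not.

```java
import java.util.Random;

// Grid-free illustration of the test's vote generation and winner logic.
public class WinnerSketch {

    // Returns {democratCount, republicanCount} for a seeded random population.
    static int[] tally(long seed, int population) {
        Random randomGenerator = new Random(seed); // assumption: fixed seed for repeatability
        int[] counts = new int[2];
        for (int i = 0; i < population; i++) {
            counts[randomGenerator.nextInt(2)]++; // 0 = Democrat, 1 = Republican
        }
        return counts;
    }

    // Chooses exactly one message, so a tie never also announces a winner.
    static String announce(int democrat, int republican) {
        if (democrat == republican) {
            return "We have a tie!";
        }
        return democrat > republican
                ? "We have a Democratic president!"
                : "We have a Republican president!";
    }

    public static void main(String[] args) {
        int[] counts = tally(42L, 10000);
        System.out.println("Democrat vote count=" + counts[0]);
        System.out.println("Republican vote count=" + counts[1]);
        System.out.println(announce(counts[0], counts[1]));
    }
}
```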
Running VoteCounterGridTest:
Prior to running the JUnit test, we will start two standalone compute nodes to be available for processing our GridJobs.
NOTE: It's worth mentioning that these nodes are ‘barebone’ nodes with only the Gridgain runtime. Our classes are NOT pre-installed on either node's JVM. This is NOT necessary, as Gridgain takes care of ‘magically’ shipping the required classes to remote nodes for processing.
Follow these steps:
1. Navigate to your <Gridgain installation>/bin folder and run the startup script corresponding to your OS:
ggstart.bat or ggstart.sh
2. Repeat step 1.
3. You should see a display similar to the following:
Node 1:
GridGain Command Line Loader, ver. 4.3.1e.10112012
2012 Copyright (C) GridGain Systems
[07:55:15] _____ _ _______ _
[07:55:15] / ___/____(_)___/ / ___/___ _(_)___
[07:55:15] / (_ // __/ // _ / (_ // _ `/ // _ \
[07:55:15] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/
[07:55:15]
[07:55:15] ---==++ IN-MEMORY BIG DATA ++==---
[07:55:15] ver. 4.3.1e-10112012
[07:55:15] 2012 Copyright (C) GridGain Systems
[07:55:15] Quiet mode.
[07:55:15] ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[07:55:15] << Enterprise Edition >>
[07:55:15] Config URL: file:/C:/netmilleRoot/tools/gridgain-4.3.1e-computegrid/config/default-spring.xml
[07:55:15] Daemon mode: off
[07:55:15] Language runtime: Java Platform API Specification ver. 1.6
[07:55:15] JVM name: Java HotSpot(TM) Client VM
[07:55:15] Remote Management [restart: on, REST: on, JMX (remote: on, port: 49123, auth: off, ssl: off)]
[07:55:15] GRIDGAIN_HOME=C:\netmilleRoot\tools\gridgain-4.3.1e-computegrid
. . . .
[07:55:22] OS: Windows Vista 6.0 x86, netmille
[07:55:22] VM name: 65076@netmille-PC
[07:55:22] Local ports used [TCP:47100 UDP:47200 TCP:47300]
[07:55:22] GridGain started OK
Node 2:
GridGain Command Line Loader, ver. 4.3.1e.10112012
2012 Copyright (C) GridGain Systems
[07:58:17] _____ _ _______ _
[07:58:17] / ___/____(_)___/ / ___/___ _(_)___
[07:58:17] / (_ // __/ // _ / (_ // _ `/ // _ \
[07:58:17] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/
[07:58:17]
[07:58:17] ---==++ IN-MEMORY BIG DATA ++==---
[07:58:17] ver. 4.3.1e-10112012
[07:58:17] 2012 Copyright (C) GridGain Systems
[07:58:17]
[07:58:17] Quiet mode.
[07:58:17] ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|ba
[07:58:17] << Enterprise Edition >>
[07:58:17] Config URL: file:/C:/netmilleRoot/tools/gridgain-4.3.1e-computegrid/c
[07:58:17] Daemon mode: off
[07:58:17] Language runtime: Java Platform API Specification ver. 1.6
[07:58:17] JVM name: Java HotSpot(TM) Client VM
[07:58:17] Remote Management [restart: on, REST: on, JMX (remote: on, port: 4912
[07:58:17] GRIDGAIN_HOME=C:\netmilleRoot\tools\gridgain-4.3.1e-computegrid
[07:58:17] (wrn) SMTP is not configured – email notifications are off.
[07:58:17] (wrn) Cache is not configured – data grid is off.
[07:58:19] (wrn) Swap space is disabled (to enable use GridLevelDbSwapSpaceSpi).
[07:58:19] Security status [authentication=on, secure-session=on]
[07:58:20] Topology snapshot [nodes=1, CPUs=1, hash=0x517F43E8]
[07:58:20] Node JOINED [nodeId8=d489993f, addr=[192.168.2.7], order=135557971789
[07:58:23] Topology snapshot [nodes=2, CPUs=1, hash=0xE92CA1A6]
. . . .
[07:58:24] GridGain started OK
4. From the 'gridgain4-sample1' project directory, type:
mvn -Dtest=VoteCounterGridTest test
On Node 1, Node 2, and the JUnit node (where the test case was executed), you should see several intermediate results from various VoteCounterGridJobs, similar to the following:
Node 1/ Node 2/JUnit Node:
Local vote results:
DEMOCRAT=28 votes
REPUBLICAN=22 votes
Local vote results:
DEMOCRAT=18 votes
REPUBLICAN=32 votes
5. Finally, when VoteCounterGridTest completes, you should see the final result on the JUnit node:
JUnit Node:
Local vote results:
DEMOCRAT=28 votes
REPUBLICAN=22 votes
Local vote results:
DEMOCRAT=24 votes
REPUBLICAN=26 votes
Democrat vote count=4952
Republican vote count=5048
We have a Republican president!
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 42.821 sec
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[18:02:40] GridGain stopped OK [uptime=00:00:29:350]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
Resources:
Multicore Cloud Applications with Gridgain and Amazon Web Services







