Friday, July 18, 2008

Different Computing Grid Architectures and Political Systems

Today I'd like to describe several grid architectures (or topologies), point out the differences between them, and give each type a "political" name.

Pure Master-Worker (Authoritarianism)

This type of grid has two distinct groups of nodes: master nodes and worker nodes.
Benefits:
  • Optimized communication traffic between nodes.
  • A simple and natural concept.
I would point to the JPPF framework as an example, because its diagram shows that nodes have no direct connection to each other and the JPPF server acts as a master node (http://www.jppf.org/presentation.php?current=5). But I think the JPPF people might argue with that ;)

You can have as many master nodes as you need, but masters and workers differ in two ways:
  • A master node CAN publish tasks/jobs for execution. A worker node, by contrast, is intended only to process the tasks/jobs assigned to it, so it CAN NOT publish tasks/jobs and has no extra logic for handling task/job results.
  • A master node usually sees all the workers it needs to assign tasks/jobs to. A worker node, unlike a master, usually does not see anyone else, so it cannot reassign tasks/jobs to other workers or even communicate with them.

Of course, some grids give you extra features such as communication between worker nodes, but there are always masters and workers.
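To make the split of responsibilities concrete, here is a minimal sketch that runs inside one JVM, with a thread pool standing in for the worker nodes. The class and method names (MasterWorkerSketch, sumOfSquares) are made up for illustration and are not taken from JPPF or any other framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-JVM stand-in for a master-worker grid: pool threads play the workers.
class MasterWorkerSketch {

    // The master is the only party that publishes tasks and handles their results.
    static int sumOfSquares(int upTo, int workers) {
        ExecutorService workerPool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 1; i <= upTo; i++) {
                final int n = i;
                // A worker only computes what it is handed; it never publishes tasks
                // and never talks to other workers.
                Callable<Integer> task = () -> n * n;
                pending.add(workerPool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : pending) {
                total += result.get(); // result handling lives on the master side
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("a worker failed", e);
        } finally {
            workerPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10, 4)); // prints 385
    }
}
```

Note that all of the coordination logic sits in sumOfSquares, the "master". The tasks themselves know nothing about each other.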

Publisher-Subscriber (Anarchism)

This grid uses a centralized cache (it can be local or distributed; it does not matter). Usually some data is published to the cache with well-known marks that flag it for processing. Nodes that have the corresponding processing modules/code installed pick up unprocessed data from the cache and handle it. They then put the processing results back into the cache and mark them as "processed" somehow.

A good example is the GigaSpaces product, which is built on top of a distributed cache.

Benefits:
  • Transparent load balancing: a node picks up new data as soon as it finishes processing the previous batch.
  • The cache works like a queue, and since it is (usually) distributed, it works fine over a LAN as well as a WAN.
I would note the following:
  • Nodes usually don't see each other. They can use the cache to communicate but do not have a direct connection.
  • Each node CAN publish new tasks/jobs, and each node can subscribe to the data.
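The pick-up-and-mark cycle described above can be sketched inside a single JVM, assuming a ConcurrentHashMap as the "cache" and threads as the nodes. The marks (UNPROCESSED, IN_PROGRESS, PROCESSED) and all class names are invented for this illustration; this is not the GigaSpaces API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// In-JVM stand-in: a ConcurrentHashMap plays the shared cache, threads play the nodes.
class SharedCacheSketch {

    // Hypothetical marks; a real product would use its own conventions.
    enum Mark { UNPROCESSED, IN_PROGRESS, PROCESSED }

    static final class Entry {
        final Mark mark;
        final String data;
        Entry(Mark mark, String data) { this.mark = mark; this.data = data; }
    }

    static Map<Integer, Entry> run(int items, int nodes) {
        ConcurrentHashMap<Integer, Entry> cache = new ConcurrentHashMap<>();
        // Publish the data with a well-known mark.
        for (int i = 0; i < items; i++) {
            cache.put(i, new Entry(Mark.UNPROCESSED, "item-" + i));
        }
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        for (int n = 0; n < nodes; n++) {
            pool.execute(() -> {
                for (Integer key : cache.keySet()) {
                    Entry seen = cache.get(key);
                    if (seen.mark != Mark.UNPROCESSED) {
                        continue; // another node already took this one
                    }
                    // Atomic claim: succeeds for exactly one node per entry.
                    Entry claimed = new Entry(Mark.IN_PROGRESS, seen.data);
                    if (cache.replace(key, seen, claimed)) {
                        // "Process" the data and put the result back, marked as processed.
                        cache.put(key, new Entry(Mark.PROCESSED, "done:" + seen.data));
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("nodes did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache;
    }

    public static void main(String[] args) {
        Map<Integer, Entry> cache = run(20, 3);
        System.out.println(cache.get(0).mark + " " + cache.get(0).data);
    }
}
```

The nodes never address each other. The only thing they share is the cache, and load balancing falls out of the fact that whichever node gets to an entry first claims it.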

Peer Grids (Democracy)

This kind of grid consists of equal nodes. By "equal" I don't mean the same hardware or software (a homogeneous environment); I mean that the nodes have equal "rights". This kind of grid is like a democracy: everyone can do anything but should not break the law.

Some benefits:
  • Since all nodes can publish tasks/jobs and see each other, they can easily reassign work from one node to another, even at runtime (so-called late load balancing or work stealing).
  • Jobs can have dependencies, communicate with each other if needed, or even wait for each other.
And again, the differences:
  • Nodes are equal and CAN post tasks/jobs.
  • Nodes see each other and can communicate (usually directly).
Unlike the grid types above, peer grids have some natural limits on their size because of direct communication, but these can be overcome with the help of "hubs/routers", which work the same way as hardware network routers.
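Work stealing is easiest to see in miniature. The JDK's ForkJoinPool uses work stealing between its threads, so the sketch below treats each pool thread as a "peer": any peer can publish subtasks, and idle peers take queued work from busy ones. This is only an in-JVM analogy under that assumption, not a networked peer grid, and the names WorkStealingSketch and RangeSum are made up.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// In-JVM analogy: every ForkJoinPool thread is a "peer" that can both publish and run work.
class WorkStealingSketch {

    // Sums the integers in [from, to) by splitting the range into subtasks.
    static final class RangeSum extends RecursiveTask<Long> {
        private static final long THRESHOLD = 1_000;
        private final long from;
        private final long to;

        RangeSum(long from, long to) { this.from = from; this.to = to; }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (long i = from; i < to; i++) {
                    sum += i;
                }
                return sum;
            }
            long mid = from + (to - from) / 2;
            RangeSum left = new RangeSum(from, mid);
            RangeSum right = new RangeSum(mid, to);
            // fork() queues the subtask on this peer's own deque, where idle peers may steal it.
            left.fork();
            // This job depends on both halves and waits for the forked one via join().
            return right.compute() + left.join();
        }
    }

    static long sum(long n) {
        ForkJoinPool peers = new ForkJoinPool(4);
        try {
            return peers.invoke(new RangeSum(0, n));
        } finally {
            peers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000L)); // prints 499999500000
    }
}
```

No code here decides up front which peer runs which range. The balancing happens late, at runtime, which is the property the benefits list refers to.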

I don't want to say which one is better. Every case that calls for a grid should be investigated and an appropriate solution selected.

Sunday, July 6, 2008

Write once scale everywhere

I think you've seen this slogan pretty often. Every grid/clustering framework promises to scale your application once you write it on top of their classes, which makes it depend on their code.

I have never liked this approach and have thought a lot about such dependencies. It's just a way to take money from you. If you decide to change the framework or application server, you will probably spend a lot of time reconfiguring or even rewriting your code to fit the requirements of the new framework.

Let me show you another way, the way I like the most: scaling your Java + Spring application without any dependencies on the GridGain framework.

Recently we introduced an "executor service" feature that lets you run your code remotely. It is what I'm going to use to scale the application.

Here is a typical example of Java code that runs a Callable on an executor service. The only difference here is that we use Spring to obtain the executor service (which, to my understanding, is quite flexible).

Typical executor service code

AbstractApplicationContext ctx =
    new ClassPathXmlApplicationContext(
        "org/gridgain/examples/execservice/spring.xml");

// Register a Spring shutdown hook so that Spring
// destroys the beans automatically.
ctx.registerShutdownHook();

// Get the executor service from Spring.
ExecutorService exec = (ExecutorService)ctx.getBean(
    "myExecutorService");

Future<String> future = exec.submit(new MyCallable());

...

String res = future.get();

Typical Callable

private static final class MyCallable implements
    Callable<String>, Serializable {
    public String call() throws Exception {
        ...
    }
}

Note that the MyCallable class implements Serializable, which is obviously a requirement if you are going to send it to another computer over the network.
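To see what the Serializable requirement buys you, here is a small sketch that pushes a Callable through standard Java serialization and runs the copy that comes out the other end. A byte array stands in for the network, and the Greeting class and helper names are hypothetical. How a grid framework actually ships tasks between nodes is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

class SerializableCallableSketch {

    // Hypothetical task; its state (the name) travels with it.
    static final class Greeting implements Callable<String>, Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;

        Greeting(String name) { this.name = name; }

        @Override
        public String call() { return "Hello, " + name; }
    }

    // Turns the task into bytes, as would happen before sending it over the wire.
    static byte[] toBytes(Object task) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(task);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the task from bytes, as the receiving side would.
    @SuppressWarnings("unchecked")
    static Callable<String> fromBytes(byte[] bytes) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Callable<String>) in.readObject();
        }
    }

    // Serializes a task, deserializes it, and runs the copy.
    static String roundTrip(String name) {
        try {
            return fromBytes(toBytes(new Greeting(name))).call();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("grid")); // prints Hello, grid
    }
}
```

If Greeting did not implement Serializable, writeObject would throw a NotSerializableException, which is why the requirement shows up as soon as a task has to leave the local JVM.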

That's it. There are no references to any grid framework or application server. Java + Spring, as I promised.
And obviously it is scalable within a single Java VM. Now let's do the trick and make it scalable on the grid by configuring another executor service in spring.xml. Starting with 2.1.0 (with the new Grid Spring bean feature), which should be pushed into production in 2-3 weeks, this XML will look as follows:

Spring.xml with grid bean and executor service

<bean id="mySpringBean" class="org.gridgain.grid.GridSpringBean"
      scope="singleton">
    <property name="configuration">
        <bean id="grid.cfg"
              class="org.gridgain.grid.GridConfigurationAdapter"
              scope="singleton"/>
    </property>
</bean>

<bean id="myExecutorService" factory-bean="mySpringBean"
      factory-method="newGridExecutorService" scope="singleton"/>

And now, ladies and gentlemen, it works on the grid. Simplicity and elegance. That's what we'd like to give our customers, and that's what we always think about when we write new features. This is customer-driven development.
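The reason this works is that the calling code only ever sees the java.util.concurrent.ExecutorService interface. As a rough illustration using nothing but the JDK (no Spring and no GridGain, with a made-up class name ExecutorSwapSketch), the same method runs unchanged on two different executor implementations:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExecutorSwapSketch {

    // Application code: depends on the ExecutorService interface and nothing else.
    static String runOn(ExecutorService exec) {
        try {
            Callable<String> task = () -> "done";
            return exec.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) {
        // Two different implementations, same calling code.
        System.out.println(runOn(Executors.newSingleThreadExecutor()));
        System.out.println(runOn(Executors.newFixedThreadPool(4)));
    }
}
```

In the Spring setup above, choosing the implementation moves out of main and into spring.xml, so switching to the grid-backed executor becomes a configuration change rather than a code change.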