Tuesday, October 7, 2008

Hadoop and GridGain. The major difference.

I'm not going to compare the products themselves; moreover, I'm not going to argue about which one is better. Let the guys from Hadoop and GridGain spend (waste) their time on that.

What I am definitely going to do is point out the major, principal differences between the two products, to give you a chance to choose the appropriate one.

1) Unlike GridGain, Hadoop was designed from the start as a data processing product. According to the Map/Reduce description (taken from this page http://hadoop.apache.org/core/docs/current/mapred_tutorial.html), it splits the input data set into small pieces (usually files on HDFS). Thus, its Map/Reduce approach is data oriented.
On the other side is GridGain. This framework is intended to split computational tasks. So you need to find a way to split your complicated calculation (task) into simple pieces named jobs and execute them on the grid.

2) Both of them can process large data sets, but Hadoop has the underlying HDFS (a distributed file system), which works with huge files and is able to hold tens of millions of files. GridGain relies instead on distributed caches and, unlike Hadoop, has pluggable SPIs to work with different cache implementations.

3) Hadoop has a Map/Reduce implementation which is pretty close to Google's. GridGain's, on the other hand, is to my understanding more flexible, but differs from Google's in some respects.

4) Hadoop's jobs/tasks are executed as external Java processes with their own configuration. GridGain's jobs/tasks are executed in the grid node space (within the node VM).

5) GridGain supports Windows/Linux/Mac OS. Hadoop, as they say on their site, has been tested on Linux.
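
To make point 1 a bit more concrete, here is a rough sketch of the two splitting styles. It uses only the plain JDK, not the Hadoop or GridGain APIs, and the class and method names (SplitStyles, wordCount, sumOfSquares) are made up for illustration. The first method splits the data and applies the same logic to every piece; the second splits one calculation into independent jobs and sums their partial results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: neither method uses Hadoop or GridGain.
class SplitStyles {

    // Data-oriented style: the input is already cut into chunks, the same
    // "map" logic runs on every chunk, and results are "reduced" by key.
    public static Map<String, Integer> wordCount(List<String> chunks) {
        Map<String, Integer> totals = new HashMap<>();
        for (String chunk : chunks) {
            for (String word : chunk.split("\\s+")) {
                if (!word.isEmpty()) {
                    totals.merge(word, 1, Integer::sum);
                }
            }
        }
        return totals;
    }

    // Compute-oriented style: one calculation (the sum of squares 1..n) is
    // cut into independent jobs, each covering its own range of numbers.
    public static long sumOfSquares(int n, int jobs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            int step = (n + jobs - 1) / jobs; // range size per job, rounded up
            for (int j = 0; j < jobs; j++) {
                final int from = j * step + 1;
                final int to = Math.min(n, (j + 1) * step);
                Callable<Long> job = () -> {
                    long sum = 0;
                    for (int i = from; i <= to; i++) {
                        sum += (long) i * i;
                    }
                    return sum;
                };
                partials.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // wait for each job and combine
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(wordCount(Arrays.asList("a b a", "b c")));
        System.out.println(sumOfSquares(10, 3));
    }
}
```

In a real grid the chunks would live on different machines and the jobs would run on different nodes; the thread pool here only stands in for that.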

That's all I wanted to say in this post. I won't draw any conclusions; they are all up to you.
In my next post I will compare GridGain with GigaSpaces (I'll try to learn the latest GigaSpaces features to be more objective).

2 comments:

Ashish said...

Great comparisons!

dkharlamov said...

Thanks, I tried to be neutral, but as you can see, the guys from GridGain and Hadoop gave me +13 and -13 votes respectively :)