Wednesday, June 4, 2008

Result? No cache.

An OutOfMemoryError is a critical problem when you are processing large amounts of data on the grid.

What if you split all your jobs among the other boxes in the network and every box produces a result of 100K? That's OK if we are talking about a hundred jobs and if enough memory is installed on the box that collects all execution results and processes them.

But what would happen if there are 10000 nodes in the grid and every node sends back 1M of data? I know that in most cases this is a rather hypothetical issue, but as usual there are people who always ask you "what if...".

I know that normally when you build the grid you have to take this into account and avoid sending such data back. I'm sure that in 90 cases out of 100 you will never send back more than 100K, and even if you know that a result is 1M at most, you should give the "master" node as much memory as it can receive back. So in the case described above that is 10K (number of nodes) * 1M (maximum result size) = 10G. Not so much (taking into account 10K nodes :)).
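The arithmetic above can be written down as a tiny sizing check. This is only a sketch: the class and method names are made up for illustration, and it assumes decimal units (1M = 1,000,000 bytes) and ignores any per-result overhead on the master.

```java
public class MasterMemoryEstimate {

    // Worst-case number of bytes the master holds if every result
    // is kept in memory until all of them have arrived.
    static long worstCaseBytes(long nodes, long maxResultBytes) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(nodes, maxResultBytes);
    }

    public static void main(String[] args) {
        long nodes = 10_000L;         // 10K nodes, as in the example above
        long maxResult = 1_000_000L;  // 1M per result (decimal units assumed)

        long total = worstCaseBytes(nodes, maxResult);
        System.out.println(total / 1_000_000_000L + "G"); // prints "10G"
    }
}
```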

But anyway, let's be lazy: instead of thinking about our grid, we would rather waste our money and time and rely on a grid product that will probably solve it somehow ;)

Different products give you different ways to handle this case. We at GridGain solved it as follows: instead of processing results in parallel, we process them sequentially without caching the received data, and delete each result as soon as it has been processed. Of course this works only if your final result does not need all interim results from the remote nodes to be available at the same time (that's why I said "waste time and money" above: 8G of additional memory doesn't cost much).
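To illustrate the idea without any grid product involved, here is a plain-Java sketch of a sequential, no-cache reduction. It is not GridGain code: the class, the `fakeResults` helper and the "sum of lengths" aggregate are all hypothetical stand-ins. The point is only that each result is folded into a running total and then dropped, so memory use stays at roughly one result at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class StreamingReduce {

    // Folds results one at a time into a running aggregate.
    // No result is stored after it has been consumed.
    static long sumLengths(Iterator<byte[]> incoming) {
        long total = 0;
        while (incoming.hasNext()) {
            byte[] result = incoming.next(); // only the current result is referenced
            total += result.length;          // fold it into the aggregate
            // 'result' is unreachable after this iteration and can be collected
        }
        return total;
    }

    // Hypothetical stand-in for results arriving from remote nodes:
    // produces 'count' payloads of 'size' bytes, one at a time.
    static Iterator<byte[]> fakeResults(final int count, final int size) {
        return new Iterator<byte[]>() {
            private int remaining = count;

            @Override
            public boolean hasNext() {
                return remaining > 0;
            }

            @Override
            public byte[] next() {
                if (remaining <= 0) {
                    throw new NoSuchElementException();
                }
                remaining--;
                return new byte[size];
            }
        };
    }

    public static void main(String[] args) {
        // 100 "nodes", 1M each: total is computed without holding all payloads at once
        System.out.println(sumLengths(fakeResults(100, 1_000_000))); // prints 100000000
    }
}
```

This only works for aggregates that can be updated incrementally (sums, counts, maxima, writing to a file). Anything that needs to compare results against each other still needs them all in memory.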

GridGain uses annotations throughout the code, and this issue was solved with the @GridTaskNoResultCache annotation, like below:


GridResultNoCacheTask.java

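import java.util.List;

// Assumes the GridGain 2.x package layout.
import org.gridgain.grid.*;

@GridTaskNoResultCache
public class GridResultNoCacheTask extends GridTaskSplitAdapter<String, Object> {
    // split() and reduce() are omitted here for brevity.

    @Override
    public GridJobResultPolicy result(GridJobResult result,
        List<GridJobResult> received) throws GridException {
        assert result.getData() != null;
        assert received.contains(result);

        // Do something with received result. The rest
        // (in "received" list) are null;

        // Keep waiting for the remaining job results.
        return GridJobResultPolicy.WAIT;
    }
}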
