There are two different approaches to job scheduling on the grid.
The first approach is choosing the most suitable node for job execution, in other words load balancing. Before sending a job to a remote node, one has to decide which node has the resources to execute it. Strategies range from a simple round-robin implementation to affinity load balancing, which selects the node where the data to be processed is located. But one should take into account that everything can change by the time the job arrives at the node.
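To make the simplest of these strategies concrete, here is a minimal round-robin sketch in plain Java. It does not use any GridGain classes; `RoundRobinBalancer` and `pickNode` are hypothetical names made up for this illustration, and nodes are represented by plain string IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of GridGain): hands out nodes in a fixed cycle.
final class RoundRobinBalancer {
    private final List<String> nodeIds;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> nodeIds) {
        if (nodeIds.isEmpty()) {
            throw new IllegalArgumentException("At least one node is required");
        }
        // Defensive copy so later changes to the caller's list do not affect us.
        this.nodeIds = new ArrayList<>(nodeIds);
    }

    // Returns the next node in the cycle; floorMod keeps the index valid
    // even after the counter overflows to a negative value.
    String pickNode() {
        int idx = Math.floorMod(next.getAndIncrement(), nodeIds.size());
        return nodeIds.get(idx);
    }
}
```

Round-robin ignores how busy each node actually is, which is exactly the gap that load-aware strategies try to close.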
GridGain ships some implementations out of the box and at the same time gives you a simple way to implement your own load balancer by creating "probes".
A probe implements the GridAdaptiveLoadProbe interface and is in charge of returning the load for a given node. Here is an example of a CPU probe that returns the node load based on its current CPU load.
GridAdaptiveCpuLoadProbe.java
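import org.gridgain.grid.GridNode;
import org.gridgain.grid.GridNodeMetrics;
import org.gridgain.grid.spi.loadbalancing.adaptive.GridAdaptiveLoadProbe;

public class GridAdaptiveCpuLoadProbe implements GridAdaptiveLoadProbe {
    /**
     * {@inheritDoc}
     */
    public double getLoad(GridNode node, int jobsSentSinceLastUpdate) {
        GridNodeMetrics metrics = node.getMetrics();

        // Normalize the current CPU load by the number of processors on the node.
        double k = metrics.getAvailableProcessors();

        return metrics.getCurrentCpuLoad() / k;
    }
}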
"Simply clever" as Skoda says. And configuration file excerpt:
config.xml
...
<property name="loadBalancingSpi">
    <bean class="org.gridgain.grid.spi.loadbalancing.adaptive.GridAdaptiveLoadBalancingSpi">
        <property name="loadProbe">
            <bean class="GridAdaptiveCpuLoadProbe">
                <constructor-arg value="true"/>
            </bean>
        </property>
    </bean>
</property>
...
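To show what a balancer might do with the numbers a probe returns, here is a small self-contained sketch that picks the node with the lowest load. Everything in it is an assumption made for illustration: `LeastLoadedSketch`, `NodeInfo`, `cpuProbe` and `pick` are hypothetical names, and the real adaptive SPI may weigh loads differently rather than simply taking the minimum.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch (not GridGain code): choose the node whose probe reports the lowest load.
final class LeastLoadedSketch {
    // Stand-in for node metrics: current CPU load and number of processors.
    static final class NodeInfo {
        final String id;
        final double cpuLoad;
        final int processors;

        NodeInfo(String id, double cpuLoad, int processors) {
            this.id = id;
            this.cpuLoad = cpuLoad;
            this.processors = processors;
        }
    }

    // Same idea as the CPU probe above: CPU load divided by processor count.
    static double cpuProbe(NodeInfo node) {
        return node.cpuLoad / node.processors;
    }

    // Returns the node with the smallest probe value, or null for an empty list.
    static NodeInfo pick(List<NodeInfo> nodes, ToDoubleFunction<NodeInfo> probe) {
        NodeInfo best = null;
        double bestLoad = Double.POSITIVE_INFINITY;
        for (NodeInfo node : nodes) {
            double load = probe.applyAsDouble(node);
            if (load < bestLoad) {
                bestLoad = load;
                best = node;
            }
        }
        return best;
    }
}
```

Passing the probe in as a function mirrors the point of the probe interface: the selection logic stays the same while the definition of "load" can be swapped.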
The other approach is runtime job scheduling or, as we call it in GridGain, collision resolution. Every new job collides with the others when it arrives at the target node. By "collide" we don't mean that jobs beat each other up somehow :). A collision in this case just means that the node should probably take some action about it.
GridGain has different collision resolution strategies. One that I have already written some posts about is "priority collision resolution", where all outstanding jobs are ordered according to their priority.
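The idea of ordering outstanding jobs by priority can be sketched in a few lines of plain Java. This is only an illustration under my own assumptions, not the GridGain SPI: `PriorityCollisionSketch`, `WaitingJob`, `resolve` and the `freeSlots` parameter are hypothetical names, and a higher number is assumed to mean a higher priority.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch (not the GridGain SPI): activate waiting jobs in priority order.
final class PriorityCollisionSketch {
    static final class WaitingJob {
        final String name;
        final int priority; // assumption: larger value = more urgent

        WaitingJob(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Returns the jobs that may start now: highest priority first, at most freeSlots of them.
    static List<WaitingJob> resolve(List<WaitingJob> waiting, int freeSlots) {
        PriorityQueue<WaitingJob> queue = new PriorityQueue<>(
            Comparator.comparingInt((WaitingJob j) -> j.priority).reversed());
        queue.addAll(waiting);

        List<WaitingJob> activated = new ArrayList<>();
        while (activated.size() < freeSlots && !queue.isEmpty()) {
            activated.add(queue.poll());
        }
        return activated;
    }
}
```

Jobs that are not returned simply stay in the waiting list until the next collision is resolved.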
Another one is so-called "job stealing". Job stealing is a brand-new feature significantly influenced by the Java Fork/Join Framework, authored by Doug Lea and planned for Java 7. The GridGain implementation took similar concepts and applied them to the grid (as opposed to the within-VM support planned for Java 7). Job stealing allows an underloaded node to take some jobs from an overloaded node and thus balance the load across grid nodes automatically at runtime. The developer does not even need to know about job stealing or do anything special about it.
You need to turn it on to get it working, and you can find the description and parameters here:
config.xml
...
<property name="collisionSpi">
    <bean class="org.gridgain.grid.spi.collision.jobstealing.GridJobStealingCollisionSpi">
        <property name="activeJobsThreshold" value="100"/>
        <property name="waitJobsThreshold" value="0"/>
        <property name="maximumStealingAttempts" value="10"/>
        <property name="stealingEnabled" value="true"/>
        <property name="messageExpireTime" value="1000"/>
    </bean>
</property>
...
<property name="failoverSpi">
    <bean class="org.gridgain.grid.spi.failover.jobstealing.GridJobStealingFailoverSpi">
        <property name="maximumFailoverAttempts" value="5"/>
    </bean>
</property>
...
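For readers who want to see the stealing idea itself, here is a toy in-memory sketch. It is not how the SPI above is implemented (a real grid does this with messages between nodes and the thresholds from the configuration); `JobStealingSketch`, `steal` and `maxSteal` are hypothetical names, and each node is modeled as nothing more than a queue of waiting job names.

```java
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory sketch of job stealing: an underloaded node ("thief")
// takes waiting jobs from the tail of the currently longest queue.
final class JobStealingSketch {
    // Moves up to maxSteal jobs to the thief and returns how many were moved.
    static int steal(Deque<String> thief, List<Deque<String>> allQueues, int maxSteal) {
        int stolen = 0;
        while (stolen < maxSteal) {
            // Find the most loaded queue other than the thief's own.
            Deque<String> victim = null;
            for (Deque<String> q : allQueues) {
                if (q != thief && (victim == null || q.size() > victim.size())) {
                    victim = q;
                }
            }
            // Stop when moving one more job would no longer reduce the imbalance.
            if (victim == null || victim.size() - thief.size() < 2) {
                break;
            }
            thief.addLast(victim.pollLast());
            stolen++;
        }
        return stolen;
    }
}
```

Taking jobs from the tail of the victim's queue follows the usual work-stealing convention of leaving the jobs closest to execution with their owner.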
GridGain, as an enterprise-level grid, supports both load balancing and collision resolution, which makes it very flexible and at the same time easy to use.


1 comment:
We originally had a job that loads a list of saved transaction requests and resubmits them to the backend once the backend is back from maintenance. As you can expect, the job needs a long time to complete for a large number of records, given that most of the time is wasted waiting for replies from the backend. With GridGain, we can scale the job to run in parallel and send messages in parallel. But now we have another problem: we are sending to the backend too fast, and request messages accumulate in queues. In other words, the request sending rate is higher than the host reply rate.
Is there any way to implement load balancing control with GridGain that queries the current length of the request queue?