Most people "know", or at least "feel", how to parallelize their code, but I still see a lot of questions about it. In general they look like "How can I execute my code in parallel?" or "What should I do with my code?".
So I think it would be a good idea to give some hints to make it simpler for everyone.
First of all, I'd like to note that I'm not going to give you comprehensive or deep knowledge of code parallelization. If you need that, you'd better read some books and articles or study it at university :) What I'd like to give you in this post is just a few general approaches that you can apply to your code to make it more robust and efficient.
Two simple and basic ways to execute your code in parallel are:
- Postponed data
- Loops
1. Postponed data.
Let's take a look at some abstract code:
SomeData data = calculateSomeData();
doAnythingElse(doItLong);
handleSomeData(data);
The variable data is calculated at the very beginning but used only later, after some other calculations, so it can be calculated in parallel with the doAnythingElse() method. The simplest way to do this in Java is an executor service; other languages have their own equivalents.
Of course, you can object and swap the first two lines of code like this:
doAnythingElse(doItLong);
SomeData data = calculateSomeData();
handleSomeData(data);
And of course you will be right, but what if the doAnythingElse() method waits for something (some external data) or uses the network or any other "slow" device? In that case the CPU load will be really low and you will waste it while waiting. That's why it's much better to execute the code in parallel.
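Here is a minimal sketch of the "postponed data" idea with a Java executor service. The method bodies are hypothetical stand-ins I made up for illustration (a small sum for calculateSomeData(), a sleep for the "slow" doAnythingElse()), and the class name and run() helper are not from any real code base.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PostponedDataSketch {

    // Hypothetical stand-in for calculateSomeData(): a CPU-bound calculation.
    static long calculateSomeData() {
        long sum = 0;
        for (int i = 1; i <= 1_000; i++) {
            sum += i;
        }
        return sum;
    }

    // Hypothetical stand-in for doAnythingElse(): mostly waits on a "slow" device.
    static void doAnythingElse() throws InterruptedException {
        Thread.sleep(100);
    }

    // Hypothetical stand-in for handleSomeData().
    static String handleSomeData(long data) {
        return "data=" + data;
    }

    static String run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Start the calculation in the background ...
            Callable<Long> task = PostponedDataSketch::calculateSomeData;
            Future<Long> data = executor.submit(task);
            // ... and do the slow work on the current thread in the meantime.
            doAnythingElse();
            // get() blocks only if the calculation has not finished yet.
            return handleSomeData(data.get());
        } catch (InterruptedException | ExecutionException e) {
            // Kept simple for the sketch; real code should handle these properly.
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point is that the data is requested early and collected late: the call to get() sits right before the first line that really needs the value.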
2. Loops.
Everyone has used "for" or "while" loops, and hardly anyone has thought about parallelizing them. But executing these loops in parallel is a really good way to speed up your code. Keep in mind that it only makes sense if the loop body takes long enough (not just 5 microseconds ;)).
A typical for-loop is:
for (condition)
for-loop-body
Usually we know the number of iterations in advance. So we can execute the for-loop-body in parallel that many times and then "merge" the results of the individual executions into the final loop result.
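As a rough sketch of this "run the bodies, then merge" idea in Java: each iteration becomes a task, and the merge step is a plain sum. The loopBody() method and the class name are hypothetical; a real loop body would need to be much heavier than squaring a number to be worth its own task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLoopSketch {

    // Hypothetical loop body; imagine something that takes long enough.
    static long loopBody(int i) {
        return (long) i * i;
    }

    static long sumOfSquares(int iterations) {
        ExecutorService executor = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            // One task per iteration; the iterations do not depend on each other.
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < iterations; i++) {
                final int index = i;
                tasks.add(() -> loopBody(index));
            }
            // invokeAll() returns once every task has completed.
            long total = 0;
            for (Future<Long> partial : executor.invokeAll(tasks)) {
                total += partial.get(); // "merge" step: here just a sum
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            // Kept simple for the sketch; real code should handle these properly.
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10));
    }
}
```

For simple cases like this one, Java's parallel streams can do the same job with less code. The explicit version just makes the split and merge steps visible.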
There are some issues you may run into, and one of them is that loop bodies may depend on each other. This happens pretty often. If the next iteration requires the results of the previous one at its very beginning, then the loop cannot be executed in parallel. But if the next iteration needs the previous results only later (say, in the middle of its execution), then these are "connected jobs" and they can still be executed in parallel, at least partly.
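One possible way to express such "connected jobs" in Java is CompletableFuture. This is only a sketch under my own assumptions: I split the loop body into a hypothetical prepare() half that does not need the previous result and a hypothetical finish() half that does. Neither name comes from a real API.

```java
import java.util.concurrent.CompletableFuture;

class ConnectedJobsSketch {

    // Hypothetical first half of the loop body: independent of earlier iterations.
    static long prepare(int i) {
        return i * 10L;
    }

    // Hypothetical second half: needs the result of the previous iteration.
    static long finish(long previous, long prepared) {
        return previous + prepared;
    }

    static long run(int iterations) {
        // Result "before the first iteration".
        CompletableFuture<Long> result = CompletableFuture.completedFuture(0L);
        for (int i = 1; i <= iterations; i++) {
            final int index = i;
            // All prepare() halves are started right away and may run in parallel.
            CompletableFuture<Long> prepared =
                    CompletableFuture.supplyAsync(() -> prepare(index));
            // Each finish() half runs once the previous result and its own
            // prepared value are both available, so these stay in order.
            result = result.thenCombine(prepared,
                    (previous, value) -> finish(previous, value));
        }
        return result.join();
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```

Only the finish() halves form a chain; the speed-up comes from the prepare() halves overlapping, so it pays off only when prepare() is the expensive part.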
To sum it up:
Execute your loops in parallel and calculate your data in parallel wherever possible, but always keep in mind that parallelization has some overhead, so do not parallelize "short" calculations.

