Concurrent Ada Programming
Ada’s model for doing concurrent programming is absolutely marvelous, and today I’m going to give you a small taste of it. More or less all programming languages provide tools for concurrent programming, but few do it as elegant as Ada, where the whole concept has been build into the language from it’s inception in 1983.
In Ada concurrent programming is a first class citizen, not an afterthought, and in this day and age where even the cheapest of processors have multiple cores, being able to properly harness the power of those many cores is important. A lot of programs could benefit from a bit of concurrent magic, but since concurrent programming brings with it a whole new set of problems, many programmers shy away from it. Not so with Ada programmers. We have tasks, protected objects, scheduling, guards and entries right there in front of us. In this article I’m going to show you how tasks and protected objects can be used to add concurrency to a very simple program.
The first thing we’re going to do is setup a small program that solves the plain problem of counting 20 times to 1_000_000_000.
One possible Ada solution to this problem is this:
with Ada.Task_Identification; with Ada.Text_IO; procedure Sequential is use Ada.Task_Identification; use Ada.Text_IO; Id : constant String := Image (Current_Task); Jobs : Positive := 20; begin for Job in reverse 1 .. Jobs loop for K in 1 .. 1_000_000_000 loop null; -- This is really hard! end loop; Put_Line ("Job" & Positive'Image (Job) & " done by " & Id); end loop; end Sequential; |
Before we get into actually compiling and running Sequential, allow me to first explain what’s going on in the program.
First we with the two packages Ada.Task_Identification and Ada.Text_IO. What this does is make the subprograms and types from those packages visible and available to our program, somewhat analogous to the #include directive used in C. After that we setup our “main” procedure and name it Sequential. In Ada you can call your main procedure whatever you like. The use clauses we encounter next lets us access the tools of the Ada.Text_IO and Ada.Task_Identification packages without having to prefix every type and subprogram with the package name. This is equivalent to the using namespace statement in C++.
The Ada.Task_Identification package makes it possible for us to figure out the id of a running task and Ada.Text_IO makes the Put_Line procedure available to Sequential.
We declare and initialize two objects in the declarative part of the program (between is and begin): Id and Jobs. The Id constant contains a String representation of the environment task identifier. As our program only have one running task (or thread if you will), which is the main environment task, the output done in the Put_Line line will show the same Id for every job completed.
The Jobs variable is the queue of jobs, in our case 20. Note that the Positive type is actually a subtype of Integer with the range 1 .. Integer'Last. If you try to assign a value below 1 or above Integer'Last then the program will raise a Constraint_Error exception.
After the begin keyword we’re in the actual body of the program, and the first thing we do is setup a for loop that’ll count down from 20 to 1, as indicated by the reverse keyword and the given range of 1 .. Jobs.
Inside this loop we do the actual work: Counting to 1_000_000_000 for each job and then outputting a line of text indicating whenever a job has been completed and by which task id.
And that’s it.
In order to compile the program do:
gnatmake sequential.adb |
And then to execute and time:
time sequential |
My laptop yields the following when executed and timed:
thomas@t420:~/Ada/Simple_Tasking time sequential Job 20 done by main_task_000000000064A010 Job 19 done by main_task_000000000064A010 Job 18 done by main_task_000000000064A010 Job 17 done by main_task_000000000064A010 Job 16 done by main_task_000000000064A010 Job 15 done by main_task_000000000064A010 Job 14 done by main_task_000000000064A010 Job 13 done by main_task_000000000064A010 Job 12 done by main_task_000000000064A010 Job 11 done by main_task_000000000064A010 Job 10 done by main_task_000000000064A010 Job 9 done by main_task_000000000064A010 Job 8 done by main_task_000000000064A010 Job 7 done by main_task_000000000064A010 Job 6 done by main_task_000000000064A010 Job 5 done by main_task_000000000064A010 Job 4 done by main_task_000000000064A010 Job 3 done by main_task_000000000064A010 Job 2 done by main_task_000000000064A010 Job 1 done by main_task_000000000064A010 real 0m52.737s user 0m52.636s sys 0m0.001s
As you can see it took a good 52 seconds to finish all 20 jobs. At no point during the execution of the program did the system utilize more than one CPU core. Each job was completed in an absolute sequential manner, starting with job 20 and ending with 1.
Surely this is sad, considering my laptop is equipped with a CPU sporting a whooping four cores. Lets add some tasking magic to the program and see how much faster we can make it go.
with Ada.Task_Identification; With Ada.Text_IO; procedure Concurrent is use Ada.Task_Identification; use Ada.Text_IO; protected Jobs is procedure Get (Job : out Natural); private J : Natural := 20; end Jobs; ------------ -- Jobs -- ------------ protected body Jobs is ----------- -- Get -- ----------- procedure Get (Job : out Natural) is begin Job := J; if J > 0 then J := J - 1; end if; end Get; end Jobs; task type Worker; -- The Worker gets a job, and does the hard work. -- When there are no more jobs, the Worker exits. -------------- -- Worker -- -------------- task body Worker is A_Job : Natural; Id : constant String := Image (Current_Task); begin loop Jobs.Get (A_Job); exit when A_Job < 1; for K in 1 .. 1_000_000_000 loop null; -- This is really hard! end loop; Put_Line ("Job" & Natural'Image (A_Job) & " done by " & Id); end loop; end Worker; Workers : array (1 .. 4) of Worker; begin null; -- We're not using the main environment task for anything. end Concurrent; |
Let’s dive in!
We with and use the same packages as before, but after that there are quite a few new things, first of which is a protected object. In Ada these are objects that export procedures, functions and entries that enables us to interact with the encapsulated data structure. In our case we’ve setup a protected object called Jobs, but we could just as well have created a protected type and then declared one of more objects to be of that type.
We can only operate on the data structure of a protected object via the exported subprograms and this is done under automatic mutual exclusion. A protected object will allow many concurrent readers (via exported functions), but only one writer (via exported procedures and entries).
A protected object is just what we need to protect our job queue from race conditions, now that we have several tasks querying and updating it.
protected Jobs is procedure Get (Job : out Natural); private J : Natural := 20; end Jobs; ------------ -- Jobs -- ------------ protected body Jobs is ----------- -- Get -- ----------- procedure Get (Job : out Natural) is begin Job := J; if J > 0 then J := J - 1; end if; end Get; end Jobs; |
First we declare our Jobs protected object. We export the sole Get (Job : out Natural) procedure, which is the only way for us to interact with the very simple data structure, consisting only of the variable J.
In the body of Jobs we find the actual implementation of the Get procedure. All it does is put the current value of J into the Job parameter and then decrease the value of J by one. With this simple system in place we’re ensured that access to J is only ever granted to one single task at the same time. Only when that task is done is the next one allowed access.
With a secure job queue in place we move on to creating the tasks that will be doing the actual work:
task type Worker; -- The Worker gets a job, and does the hard work. -- When there are no more jobs, the Worker exits. -------------- -- Worker -- -------------- task body Worker is A_Job : Natural; Id : constant String := Image (Current_Task); begin loop Jobs.Get (A_Job); exit when A_Job < 1; for K in 1 .. 1_000_000_000 loop null; -- This is really hard! end loop; Put_Line ("Job" & Natural'Image (A_Job) & " done by " & Id); end loop; end Worker; |
First we declare a task type called Worker. We could’ve setup a task object, just as we did with the Jobs protected object, but since we want more than one worker a type is the way to go. The body of the task looks a lot like the body from the Sequential program, with a few minor differences. We declare the A_Job variable to be of the type Natural, which is a subtype of Integer with the range 0 .. Integer'Last. The Id constant is exactly like the one from the first program.
In the body of the Worker task we’ve got a plain loop that starts out with putting a job into the A_Job variable. We then check if the job is valid in the exit when... line and if not we exit the loop and the task is then completed.
The for loop and the Put_Line code is more or less the same as before.
Last we have this:
Workers : array (1 .. 4) of Worker; begin null; -- We're not using the main environment task for anything. end Concurrent; |
Here we declare an array of 4 Worker tasks. As soon as the program reach this spot, four workers are created and when we pass begin they spring to life. Since we already have 4 workers counting to 1_000_000_000 like there’s no tomorrow, we don’t really need to add any code to the main environment task, so we simply add a null statement. The main environment task is the master of the 4 workers, so it wont complete until all 4 workers have completed.
Lets see how the program performs now:
thomas@t420:~/Ada/Simple_Tasking$ time concurrent Job 20 done by workers(4)_000000000065FCC0 Job 19 done by workers(2)_0000000000659240 Job 18 done by workers(3)_000000000065C780 Job 17 done by workers(1)_0000000000655D00 Job 16 done by workers(4)_000000000065FCC0 Job 15 done by workers(2)_0000000000659240 Job 13 done by workers(1)_0000000000655D00 Job 14 done by workers(3)_000000000065C780 Job 12 done by workers(4)_000000000065FCC0 Job 10 done by workers(1)_0000000000655D00 Job 11 done by workers(2)_0000000000659240 Job 9 done by workers(3)_000000000065C780 Job 6 done by workers(2)_0000000000659240 Job 8 done by workers(4)_000000000065FCC0 Job 7 done by workers(1)_0000000000655D00 Job 5 done by workers(3)_000000000065C780 Job 4 done by workers(2)_0000000000659240 Job 3 done by workers(4)_000000000065FCC0 Job 1 done by workers(3)_000000000065C780 Job 2 done by workers(1)_0000000000655D00 real 0m18.727s user 1m11.454s sys 0m0.025s
YAY! It is much faster now, and all four cores on my CPU is running at 100% while the program executes. Also note how the jobs are no longer completed sequentially. You can experiment with the amount of workers by changing the size of the Workers array. On my box the sweet spot is firing up 20 workers at the same time. With twenty workers I’m very close to a clean 18 seconds. The second fastes result is obtained with 4 workers. Every other amount of workers I’ve tried is slower.
There’s of course a whole lot more to Ada and concurrent programming than shown here. This little example has barely scratched the surface. A good place to start is the Ada Programming Wikibook.
You can grab the code for these examples at github.com/ThomasLocke/Simple_Tasking.
Adding concurrency to an Ada program is very simple, and the model for doing it is both easy to understand and to work with. When you’re using Ada there’s simply no excuse for not adding concurrency to your program where it will benefit from it.
Comments
Hello,
there is something I don’t understand with your second example:
it seems that in the end, J gets lower than 0, so a Constraint_Error is raised in procedure Get, but no error message is displayed.
So if I understand well, the line
exit when A_Job < 1;
is useless because anyway the task exit because of Constraint_Error.
Why?
Hey Apoptose,
Try using this as the body of Get:
begin
Put_Line (“before: ” & J’Img);
Job := J;
J := J – 1;
Put_Line (“after: ” & J’Img);
end Get
You’ll see that J never goes below zero.
But you’re right, it’s not really pretty. I’ll fix it to make it more clear. Thanks for pointing it out.
edit: Blarg.. Can’t even read my own output any longer. It does go below 0 – sheesh. Fix on its way!
There we go! Thanks to Apoptose for catching that nasty little bug.
Prior to the fix I was completely abusing the fact that a task can die silently if exceptions aren’t handled. The worst part about that is that it is described here:
http://www.adaic.org/resources/add_content/standards/05rat/html/Rat-1-3-4.html
And I read that a short while ago when I started mucking around with the Ada.Task_* packages. My memory is failing me!
No problemo, I’m glad I understand better how the code works now.
I know you have done some Haskell, do you plan to write some kind of comparison between concurrent programming in Ada and in Haskell?
Bye.
I know there’s a book on its way about concurrent programming in Haskell:
https://plus.google.com/u/0/107890464054636586545/posts/QTzpXhc8r8f
http://www.haskell.org/pipermail/haskell/2012-May/023328.html
I do plan on buying that book, but whether or not I’ll ever reach a level of Haskell competence to be able to compare its features to Ada I honestly don’t know.
What I do know is that both Ada and Haskell are awesomely nice languages.
I just made a similar implementation of dispatching jobs in parallel.
See code at http://shootout.alioth.debian.org/u64q/program.php?test=regexdna&lang=gnat&id=3
in procedure Parallel_Count.
There is no need of the package Ada.Task_Identification;
After completion of job, the returning worker gives its job_nbr to the dispatcher.
Improvement in speed on a 4-core is limited to a factor of 3 wrt to single core.
9 jobs => 2 rounds with 4 cores + 1 round with 1 core. = 3
Hey Francois,
That’s some pretty awesome code you got going there! Thanks for linking it.
I used the Ada.Task_Identification package because it clearly shows that different tasks have different internal id’s, and it does so in a very beginner friendly way.
Also it’s nice to learn about the existence of the Ada.Task_* packages. They are very handy.
Hi Thomas,
Neat! I find the scheme clea, the example instructive,
and well explained.
Can I make some comments about the choice of identifiers?
They are jobs’ numbers, though, as the
That’s from the perspective of a reader, totally subjective, of course.
It reflects what had me a little confused at first, looking at the source
(not the explanation). For example, there are:
Jobs (Positive), Job (Natural), Jobs (protected), A_Job (Natural),
Job as an immutable loop variable. And none of these is, well,
if I may, a job (task).
output indicates.
This might all appear nitpicky, I guess. I always found good names helpful
for understanding source from source, which is why…
The following is just what came to my mind after reading from top to bottom,
and back. It’s not placed here to suggest replacements or anything,
but to back up my nagging with something constructive. Not necessarily
good, but hopefully constructive.
subtype Job_Number is Natural;
What does the protected object do? Looking at the source, it has
“Jobs.Get”, so I’d have thought that the caller gets a job. It does
get a job’s number, just like a runner gets one, or like someone
waiting in some public office gets one. Hence, maybe,
protected Dispenser …;
protected Bowl …; — or something less pictorial or …
Then, de-Generalize the name “Get” so that the name of this
procedure conveys some of what to expect when calling it:
Dispenser.Next_Number (My_Number);
Bowl.Next (My_Current_Number);
Jobs.Assign (My_Id);
where the out actual parameters’ names will designate the
job number variables declared in the task bodies.
Hope you don’t mind a somewhat missionary rant.
Georg
Georg,
Awesome. Let me repeat that: Awesome. Thank you very much for your constructive advice. I’m so utterly bad at coming up with names for identifiers, so having someone actually giving me some solid suggestions is a real boon.
I’ll look into updating the code/examples to reflect the thoughts from your “missionary rant”.
Someone should write an article on good naming schemes for identifiers. hint hint.
[...] in may I wrote an article called Concurrent Ada Programming. It was meant as an introduction to the wonderful world of Ada tasking and protected objects, and [...]
What happened to the distinction between parallelism and concurrency?
I’m not sure I understand your question Janus. Did something happen with the distinction?
To me concurrency is >= 2 threads are making progres (not necessarily in parallel), whereas parallelism is when >= 2 threads are making progress at the _same time_.
One could say that Ada have awesome tools for concurrency, and these tools might result in a program where the threads execute in parallel.
Or am I missing something?