The following is a demonstration of the threaded.d script,
Here we run a test program called "cputhread" that creates 4 busy threads
that run at the same time. Here we run it on a server with only 1 CPU,
# threaded.d
2005 Jul 26 02:56:37,
PID: 8516 CMD: cputhread
value ------------- Distribution ------------- count
1 | 0
2 |@@@@@@@ 17
3 |@@@@@@@@@@@ 28
4 |@@@@@@@@@@@ 27
5 |@@@@@@@@@@@ 28
6 | 0
[...]
In the above output, we can see that cputhread has four busy threads with
thread IDs 2, 3, 4 and 5. We are sampling at 100 Hertz, and have caught
each of these threads on the CPU between 17 and 28 times.
Since the above counts add to 100, this is either a sign of a single CPU
server (which it is), or a sign that a multithreaded application may be
running "serialised" - only 1 thread at a time. Compare the above output
to a multi CPU server,
Here we run the same test program on a server with 4 CPUs,
# threaded.d
2005 Jul 26 02:48:44,
PID: 5218 CMD: cputhread
value ------------- Distribution ------------- count
1 | 0
2 |@@@@@@@@@@@ 80
3 |@@@@@@@@@@ 72
4 |@@@@@@@@@ 64
5 |@@@@@@@@@@@ 78
6 | 0
[...]
This time the counts add to equal 294, so this program is definitely
running on multiple CPUs at the same time, otherwise this total would
not be beyond our sample rate of 100. The distribution of threads on CPU
is fairly even, and the above serves as an example of a multithreaded
application performing well.
Now we run a test program called "cpuserial", which also create 4 busy
threads, however due to a coding problem (poor use of mutex locks) they
only run one at a time,
# threaded.d
2005 Jul 26 03:07:21,
PID: 5238 CMD: cpuserial
value ------------- Distribution ------------- count
2 | 0
3 |@@@@@@@@@@@@ 30
4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 70
5 | 0
[...]
The above looks like we are back on a single CPU server with 100 samples
in total, however we are still on our 4 CPU server. Only two threads have
run, and the above distribution is a good indication that they have
run serialised.
Now more of a fringe case. This version of cpuserial again creates 4 threads
that are all busy and hungry for the CPU, and again we run it on a 4 CPU
server,
# threaded.d
2005 Jul 26 03:25:45,
PID: 5280 CMD: cpuserial
value ------------- Distribution ------------- count
1 | 0
2 |@@@@@@@@@@@@@@@ 42
3 |@@@@@@@@@@@@@@@@@@ 50
4 |@@@@@@ 15
5 |@ 2
6 | 0
[...]
So all threads are running, good. And with a total of 109, at some point
more than one thread was running at the same time (otherwise this would
not have exceeded 100, bearing in mind a sample rate of 100 Hertz). However,
what is not so good is that with 4 CPUs we have only scored 109 samples -
since all threads are CPU hungry we'd hope that more often they could
run across the CPUs simultaneously; however this wasn't the case. Again,
this fault was created by poor use of mutex locks.