DTrace shows that readable Perl code is fastest.

I always wondered why some people write unreadable Perl. The most common reason given seems to be ‘It’s faster that way’.

So, using DTrace and the extra probes I added (see the subversion repository with a patched Perl 5.8.8), I thought I’d take a look.

# dtrace -l | grep perl
85614   perl1226        libperl.so                      Perl_sv_free del_sv
85615   perl1226        libperl.so                   Perl_sv_replace del_sv
85616   perl1226        libperl.so                          perl_run main_enter
85617   perl1226        libperl.so                        perl_parse main_enter
85618   perl1226        libperl.so                     perl_destruct main_enter
85619   perl1226        libperl.so                    perl_construct main_enter
85620   perl1226        libperl.so                        perl_alloc main_enter
85621   perl1226        libperl.so                          perl_run main_exit
85622   perl1226        libperl.so                        perl_parse main_exit
85623   perl1226        libperl.so                     perl_destruct main_exit
85624   perl1226        libperl.so                    perl_construct main_exit
85625   perl1226        libperl.so                        perl_alloc main_exit
85626   perl1226        libperl.so                       Perl_sv_dup new_sv
85627   perl1226        libperl.so                      Perl_newSVrv new_sv
85628   perl1226        libperl.so                      Perl_newSVsv new_sv
85629   perl1226        libperl.so                  Perl_newRV_noinc new_sv
85630   perl1226        libperl.so                      Perl_newSVuv new_sv
85631   perl1226        libperl.so                      Perl_newSViv new_sv
85632   perl1226        libperl.so                      Perl_newSVnv new_sv
85633   perl1226        libperl.so                    Perl_vnewSVpvf new_sv
85634   perl1226        libperl.so               Perl_newSVpvn_share new_sv
85635   perl1226        libperl.so                     Perl_newSVhek new_sv
85636   perl1226        libperl.so                     Perl_newSVpvn new_sv
85637   perl1226        libperl.so                      Perl_newSVpv new_sv
85638   perl1226        libperl.so                 Perl_sv_newmortal new_sv
85639   perl1226        libperl.so                Perl_sv_mortalcopy new_sv
85640   perl1226        libperl.so                        Perl_newSV new_sv
85641   perl1226        libperl.so                      Perl_pp_sort sub-entry
85642   perl1226        libperl.so                   Perl_pp_dbstate sub-entry
85643   perl1226        libperl.so                  Perl_pp_entersub sub-entry
85644   perl1226        libperl.so                      Perl_pp_last sub-return
85645   perl1226        libperl.so                    Perl_pp_return sub-return
85646   perl1226        libperl.so                     Perl_dounwind sub-return
85647   perl1226        libperl.so                Perl_pp_leavesublv sub-return
85648   perl1226        libperl.so                  Perl_pp_leavesub sub-return

Using these probes, we can write some ‘D’ that tells us what Perl is doing at each of its phases – startup, parsing, execution, and cleanup.

First off, accessing function call parameters:

Given 3 essentially identical programs (call1.pl, call2.pl and call3.pl, in that order):

#!/usr/local/bin/perl -Tw

use strict;

my $initial = "there once was a fish. Its feet were small";
my $post = func($initial);
print "$post\n";

sub func {
    $_[0] =~ s/there/There/;
    return $_[0];
}

#!/usr/local/bin/perl -Tw

use strict;

my $initial = "there once was a fish. Its feet were small";
my $post = func($initial);
print "$post\n";

sub func {
    my ($val) = @_;
    $val =~ s/there/There/;
    return $val;
}

#!/usr/local/bin/perl -Tw

use strict;

my $initial = "there once was a fish. Its feet were small";
my $post = func($initial);
print "$post\n";

sub func {
    my $val = shift;
    $val =~ s/there/There/;
    return $val;
}
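
One difference is worth keeping in mind when comparing the three: the elements of @_ are aliases for the caller's arguments, so the $_[0] version also changes $initial in the caller, while the other two work on a copy. A minimal sketch of that behaviour (the sub and variable names here are made up for illustration and are not part of the test scripts):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
# Works directly on the caller's variable through the @_ alias.
sub in_place { $_[0] =~ s/there/There/; return $_[0]; }

# Copies the argument first, so the caller's variable is untouched.
sub via_copy { my $val = shift; $val =~ s/there/There/; return $val; }

my $aliased = "there once was a fish";
my $copied  = "there once was a fish";

my $result_aliased = in_place($aliased);
my $result_copied  = via_copy($copied);

print "in_place: caller now has '$aliased'\n";    # There once was a fish
print "via_copy: caller still has '$copied'\n";   # there once was a fish
```

Both subs return the same string, so the printed result of the test scripts is identical; only the side effect on the caller differs.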

There is a myth that using $_[0] is faster, as it doesn’t create a temporary variable…
DTrace (using the general Perl stats-gathering DTrace script) shows this to be untrue:

== call1.pl ==========================================================
  perl*::perl_alloc:main_enter
  perl*::perl_alloc:main_exit,  (0/0) (53119 nS)
  perl*::perl_construct:main_enter
  perl*::perl_construct:main_exit,  (12/0) (564370 nS)
  perl*::perl_parse:main_enter
   --> BEGIN, ./call1.pl
    --> bits, /usr/local/lib/perl5/5.8.8/strict.pm
    <-- bits, /usr/local/lib/perl5/5.8.8/strict.pm (3/2) (48060 nS)
    --> import, /usr/local/lib/perl5/5.8.8/strict.pm
    <-- import, /usr/local/lib/perl5/5.8.8/strict.pm (1/0) (15398 nS)
   <-- BEGIN, ./call1.pl (160/80) (1025874 nS)
  perl*::perl_parse:main_exit,  (299/42) (2856399 nS)
  perl*::perl_run:main_enter
   --> func, ./call1.pl
   <-- func, ./call1.pl (1/0) (47723 nS)
  perl*::perl_run:main_exit,  (0/1) (265677 nS)
  perl*::perl_destruct:main_enter
  perl*::perl_destruct:main_exit,  (0/2) (20763 nS)
total, total (0/0) (3789064 nS)
== call2.pl ==========================================================
  perl*::perl_alloc:main_enter
  perl*::perl_alloc:main_exit,  (0/0) (53251 nS)
  perl*::perl_construct:main_enter
  perl*::perl_construct:main_exit,  (12/0) (509684 nS)
  perl*::perl_parse:main_enter
   --> BEGIN, ./call2.pl
    --> bits, /usr/local/lib/perl5/5.8.8/strict.pm
    <-- bits, /usr/local/lib/perl5/5.8.8/strict.pm (3/2) (36748 nS)
    --> import, /usr/local/lib/perl5/5.8.8/strict.pm
    <-- import, /usr/local/lib/perl5/5.8.8/strict.pm (1/0) (9797 nS)
   <-- BEGIN, ./call2.pl (160/80) (924250 nS)
  perl*::perl_parse:main_exit,  (299/38) (2545953 nS)
  perl*::perl_run:main_enter
   --> func, ./call2.pl
   <-- func, ./call2.pl (1/0) (42165 nS)
  perl*::perl_run:main_exit,  (0/1) (142393 nS)
  perl*::perl_destruct:main_enter
  perl*::perl_destruct:main_exit,  (0/2) (20851 nS)
total, total (0/0) (3301007 nS)
== call3.pl ==========================================================
  perl*::perl_alloc:main_enter
  perl*::perl_alloc:main_exit,  (0/0) (52927 nS)
  perl*::perl_construct:main_enter
  perl*::perl_construct:main_exit,  (12/0) (607783 nS)
  perl*::perl_parse:main_enter
   --> BEGIN, ./call3.pl
    --> bits, /usr/local/lib/perl5/5.8.8/strict.pm
    <-- bits, /usr/local/lib/perl5/5.8.8/strict.pm (3/2) (37066 nS)
    --> import, /usr/local/lib/perl5/5.8.8/strict.pm
    <-- import, /usr/local/lib/perl5/5.8.8/strict.pm (1/0) (10171 nS)
   <-- BEGIN, ./call3.pl (160/80) (924824 nS)
  perl*::perl_parse:main_exit,  (297/37) (2543981 nS)
  perl*::perl_run:main_enter
   --> func, ./call3.pl
   <-- func, ./call3.pl (1/0) (41833 nS)
  perl*::perl_run:main_exit,  (0/1) (140527 nS)
  perl*::perl_destruct:main_enter
  perl*::perl_destruct:main_exit,  (0/2) (20273 nS)
total, total (0/0) (3395310 nS)

allocations / deallocations:
     474 /      122 call3.pl
     476 /      123 call2.pl
     476 /      127 call1.pl
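
These totals appear to be the sums of the (allocations/deallocations) pairs over every line of a script's trace; for call1.pl they add up to 476 / 127. A rough sketch of that tally in Perl, assuming the trace lines have been saved as text (the tally sub is a hypothetical helper, not part of the DTrace script):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sum every "(allocs/deallocs)" pair in trace text.
# The "(NNN nS)" timing field has no slash, so it is never matched.
sub tally {
    my ($text) = @_;
    my ($allocs, $deallocs) = (0, 0);
    while ($text =~ m{\((\d+)/(\d+)\)}g) {
        $allocs   += $1;
        $deallocs += $2;
    }
    return ($allocs, $deallocs);
}

# The lines of the call1.pl trace above that carry a count pair.
my $trace = <<'END';
  perl*::perl_alloc:main_exit,  (0/0) (53119 nS)
  perl*::perl_construct:main_exit,  (12/0) (564370 nS)
    <-- bits, /usr/local/lib/perl5/5.8.8/strict.pm (3/2) (48060 nS)
    <-- import, /usr/local/lib/perl5/5.8.8/strict.pm (1/0) (15398 nS)
   <-- BEGIN, ./call1.pl (160/80) (1025874 nS)
  perl*::perl_parse:main_exit,  (299/42) (2856399 nS)
   <-- func, ./call1.pl (1/0) (47723 nS)
  perl*::perl_run:main_exit,  (0/1) (265677 nS)
  perl*::perl_destruct:main_exit,  (0/2) (20763 nS)
total, total (0/0) (3789064 nS)
END

my ($total_allocs, $total_deallocs) = tally($trace);
printf "%d / %d call1.pl\n", $total_allocs, $total_deallocs;   # 476 / 127 call1.pl
```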

Counting up the allocations and deallocations from the (allocations/deallocations) pairs in the output,
the “<-- func, ./call2.pl (1/0)” line is always the same… one allocation.

After all the test runs, I also print out the total allocations for the script,
and it seems that the “my $val = shift” version is the most efficient –
using two fewer allocations (apparently during the parse phase).

The deallocation count is interesting too – with “$_[0]” using 5 more deallocations during
the parse phase and “my ($val) = @_;” using one more than the “my $val = shift” option.

An attempt to reduce the allocations by using $_ doesn’t seem to help – the following code results in 474 allocations,
the same as the shift case, but with 3 extra deallocations, again in the parsing phase. Increasing the number of times that func
is called only increases the benefits of using shift.

#!/usr/local/bin/perl -Tw

use strict;

my $initial = "there once was a fish. Its feet were small";
$_ = $initial;
my $post = func();
print "$post\n";

sub func {
    s/there/There/;
    return $_;
}

Interestingly, “my $val = shift” spends the least time in func and in the run phase (although call2.pl has the slightly lower overall total in this run), and none of the conventions tested differ in allocations at run time: each call costs a single allocation, and all the differences arise during the parse phase. I guess I’ll have to construct a more complex case, using references / hashes – next time 🙂
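
The timings above come from a single traced run each, and the differences are only a few microseconds. One way to cross-check them without DTrace is the core Benchmark module. This is only a sketch: the sub names are invented for illustration, and each variant is handed a fresh copy of the string so that the in-place $_[0] version always has something to substitute.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $initial = "there once was a fish. Its feet were small";

# The three calling conventions from call1.pl, call2.pl and call3.pl.
sub by_index { $_[0] =~ s/there/There/; return $_[0]; }
sub by_list  { my ($val) = @_; $val =~ s/there/There/; return $val; }
sub by_shift { my $val = shift; $val =~ s/there/There/; return $val; }

# A negative count tells Benchmark to run each case for at least
# that many CPU seconds rather than a fixed number of iterations.
cmpthese(-1, {
    'index' => sub { my $copy = $initial; by_index($copy) },
    'list'  => sub { my $copy = $initial; by_list($copy) },
    'shift' => sub { my $copy = $initial; by_shift($copy) },
});
```

The copy made inside each closure is the same cost for all three cases, so it should not change their ranking, but it does shrink the relative gap between them.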

Windows installer of TWiki 4.2 rc2 that uses Strawberry perl 5.10 beta 2

For the extremely adventurous – I have built an installer using Strawberry Perl 5.10 beta2 – TWiki-4.2.0-rc2.1-strawberry.exe

Warning: Search does not work, and needs someone to debug it (I’m away over xmas)

A new beginning for Perl and DTrace?

I’ve just created a subversion repository with Perl 5.8.8 and the accumulated DTrace patches – including the use of is-enabled probes to reduce the performance impact of the probes when disabled. Bryan and I (and anyone else who would like to help) will be working slowly towards making Perl a first-class DTrace citizen over the coming months. Next stop – Perl Guts Illustrated

Of course, we’ll also port it all to Perl 5.10 – the 20th anniversary release 🙂

DTrace, Perl and TWiki – on Solaris

I’ve been promising myself some time to try out DTrace on TWiki’s codebase for over a year. By following Bryan Allen’s
instructions using Richard Dawe’s adaption of Alan Burlison’s work… I now have a Perl 5.8.8 with DTrace probes.

Sounds great, except for one thing… I now have to learn enough about DTrace to use it 🙂 The patch that Alan and Richard have (or at least their DTrace scripts) seems to require a priori knowledge of the Perl process’s pid… not something that’s going to work out for what I want to do.

For a quick test, dtrace -c ./view -s /export/home/sven/src/dtrace/subs-tree.d does show the program flow.

The following was captured while two Perl scripts were running – the numbers in the provider names (17669 and 17760) are their pids.

# dtrace -l | grep -i perl
17803  perl17669        libperl.so                      Perl_pp_sort sub-entry
17804  perl17669        libperl.so                   Perl_pp_dbstate sub-entry
17805  perl17669        libperl.so                  Perl_pp_entersub sub-entry
17806  perl17669        libperl.so                      Perl_pp_last sub-return
17807  perl17669        libperl.so                    Perl_pp_return sub-return
17808  perl17669        libperl.so                     Perl_dounwind sub-return
17809  perl17669        libperl.so                Perl_pp_leavesublv sub-return
17810  perl17669        libperl.so                  Perl_pp_leavesub sub-return
88501  perl17760        libperl.so                      Perl_pp_sort sub-entry
88502  perl17760        libperl.so                   Perl_pp_dbstate sub-entry
88503  perl17760        libperl.so                  Perl_pp_entersub sub-entry
88504  perl17760        libperl.so                      Perl_pp_last sub-return
88505  perl17760        libperl.so                    Perl_pp_return sub-return
88506  perl17760        libperl.so                     Perl_dounwind sub-return
88507  perl17760        libperl.so                Perl_pp_leavesublv sub-return
88508  perl17760        libperl.so                  Perl_pp_leavesub sub-return

So… first ignorant modification: subs-tree.d wants to trace perl$target:::sub-entry. Change that to perl*:::sub-entry and, of course, it works exactly as I want – it attaches to all subsequent Perl processes (running my dtrace-perl build) and tells me what’s going on. The only caveat is that the DTrace script will only start if there is a Perl process running – the provider is obviously not persistent.

Brilliant!

Should be a fun Christmas holiday adventure – 410 pages of DTrace book, and a myriad of web pages to consume and digest.

TWiki (4.2 rc2) Microsoft Windows, OSX and rpm (Centos & Fedora Core i386) installers

Release Candidate 2 is pretty close to what will be released within the next month.

These Windows, OSX, Centos and Fedora Core installers are fully integrated native installers that will update your computer with Perl, Apache, RCS and the other tools needed to run TWiki on that platform.

TWiki 4.2.0 contains many new improvements to TWiki, including a much improved Wysiwyg editor, a structured query engine, a more generic authentication system and at the same time, the Core engine is faster than the previous twiki4 releases.

The TWiki installers include native installs of the following (each installed only if not already present):

  1. Apache 2.2 (Windows & rpm)
  2. Perl (ActiveState – Windows & native for rpm)
  3. GNU Grep (Windows only)
  4. GNU RCS (All platforms)
  5. TWiki 4.2.0 Release Candidate 2.

Please download it, try it out and report your impressions, ideas, bugs and successes here, on TWiki.org, or in the TWiki Bugs system.

Another TWiki innovation brought to you by distributedINFORMATION & WikiRing.com

How to demo software – the advanced version.

Joel has written a great article on how to demo software. So good, in fact, that it reminded me of my most successful demos – all of which took the advice one step further.

Imagine:

You walk into the room, and before you’ve even gotten to the lectern / desk / stage prop, you tell the audience that you decided your pre-prepared demo was too boring to present again, so you ask them, “What problem would you like me to solve for you today?” After the moment’s shock dies down, you can (assuming your audience is big enough to contain a good cross section of existing customers) be pretty sure that there will be at least 2 difficult problems that are not only fascinating to most of the audience, but were hard to do in the last release.

Then, you proceed to solve these problems, using techniques that seem familiar to them, but also show off the new system. It’s sure to draw them in.

This approach relies on several incredibly important things:

  1. You must know your product incredibly well (both the older version, and the new release) – in my case, I had worked as a trainer and support engineer, and had done some development of the system.
  2. You must know your audience, and have a good feel for the problems they have been experiencing, and their expectations of the new system. Again, working as a support engineer, and supporting systems integrators, gives you the opportunity to observe.
  3. You must be creative, and be able to think, talk, and type at the same time – so having several years of training experience helps immeasurably.
  4. You must also trust the development team – because there’s a good chance that you’ll be needing to solve the problem in a way you’ve not done before.

If you’re not quite willing to risk it, this can also be a great way to spice up a training course – you can teach, and solve problems that are relevant.

Every time I’ve done a demo like this (and thinking back, that’s quite a few, for quite a few different products) it’s been the most fun I’ve had all day, and the audience loves it, because they get to see us sweat, rather than being the cool calm font of knowledge.