2018/09/27

Devel::MAT investigation into C - part 3

In the previous part we investigated a memory leak in a Perl program, and found the line of C code in a module responsible for creating the SVs that are leaking. This of course is only half of the problem - the other half is that these SVs aren't getting reclaimed again when they're no longer needed. To answer that question we'll have to investigate a little more: look around the code at where this SV pointer is stored, find the other parts of the code that ought to reclaim it later on, and see why they don't.

Taking another look at the source surrounding the line that creates these leaking SVs, we see an interesting call to a function called cb_data_advanced_put(), which appears to take ownership of each new SV.

cb_data_advanced_put(ctx, "tlsext_status_cb!!func", newSVsv(callback));
cb_data_advanced_put(ctx, "tlsext_status_cb!!data", newSVsv(data));
SSL_CTX_set_tlsext_status_cb(ctx, tlsext_status_cb_invoke);

This cb_data_advanced_put() is defined by the XS code itself, storing the data under the named key in a hash associated with the SSL context stored in ctx. The code implementing that looks like this:

int cb_data_advanced_put(void *ptr, const char* data_name, SV* data)
{
    HV * L2HV;
    SV ** svtmp;
    int len;
    char key_name[500];
    dMY_CXT;

    len = my_snprintf(key_name, sizeof(key_name), "ptr_%p", ptr);
    if (len == sizeof(key_name)) return 0; /* error  - key_name too short*/

    /* get or create level-2 hash */
    svtmp = hv_fetch(MY_CXT.global_cb_data, key_name, strlen(key_name), 0);
    if (svtmp == NULL) {
        L2HV = newHV();
        hv_store(MY_CXT.global_cb_data, key_name, strlen(key_name), newRV_noinc((SV*)L2HV), 0);
    }
    else {
        if (!SvOK(*svtmp) || !SvROK(*svtmp)) return 0;
        L2HV = (HV*)MUTABLE_PTR(SvRV(*svtmp));
    }

    /* first delete already stored value */
    hv_delete(L2HV, data_name, strlen(data_name), G_DISCARD);
    if (data!=NULL)
        if (SvOK(data))
            hv_store(L2HV, data_name, strlen(data_name), data, 0);

    return 1;
}

This function takes ownership of the SV pointer given in data by storing it in a hash - an HV. That HV is a second-level hash, itself stored, keyed by the given context pointer ptr, in a hash maintained by the top-level context in MY_CXT.global_cb_data. To understand the intended code path to release this, we'll have to find where it gets reclaimed. Inspecting the code that manages this top-level hash, and the surrounding functions in the file, soon turns up another function which appears to do just that:

int cb_data_advanced_drop(void *ptr)
{
    int len;
    char key_name[500];
    dMY_CXT;

    len = my_snprintf(key_name, sizeof(key_name), "ptr_%p", ptr);
    if (len == sizeof(key_name)) return 0; /* error  - key_name too short*/

    hv_delete(MY_CXT.global_cb_data, key_name, strlen(key_name), G_DISCARD);
    return 1;
}

The important line here is the call to hv_delete(), which deletes a key from the shared HV stored in MY_CXT.global_cb_data. When the second-level HV stored there is deleted, Perl should recurse into it and reclaim all of the values stored in it, including the data we put there earlier. And yet it appears not to - or else we wouldn't observe our memory leak in the first place - so we must dive a little deeper to understand why.
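Before we do, it's worth being clear on what should happen here. The intended lifecycle - values owned by a second-level hash, all released when the first-level entry is dropped - is the same pattern you'd build in any refcounting runtime. Here is a miniature of it in Python (a sketch for illustration only; the function names are mine, not from the XS code), using a weakref to prove the stored value really is reclaimed once the outer entry goes away:

```python
import weakref

class Value:
    """Stand-in for an SV owned by the registry."""

global_cb_data = {}

def cb_data_put(ctx_ptr, name, value):
    # get or create the level-2 dict, keyed by the context pointer
    l2 = global_cb_data.setdefault("ptr_%x" % ctx_ptr, {})
    l2.pop(name, None)            # first delete any already-stored value
    if value is not None:
        l2[name] = value          # the registry now owns the value

def cb_data_drop(ctx_ptr):
    # dropping the level-1 entry releases everything stored under it
    global_cb_data.pop("ptr_%x" % ctx_ptr, None)

v = Value()
monitor = weakref.ref(v)
cb_data_put(0x1234, "tlsext_status_cb!!func", v)
del v                             # registry holds the only strong reference
assert monitor() is not None      # ...so the value is still alive
cb_data_drop(0x1234)
assert monitor() is None          # outer entry gone, value reclaimed
```

In the XS code the reclaim step is that hv_delete() with G_DISCARD, which likewise recurses into the second-level hash and drops a reference on every value it held.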

Let's start by dumping the contents of the hash that hv_delete() is discarding, to see if it contains the keys we expect - remember, from the first block of code we found, we're expecting an entry under the key named tlsext_status_cb!!data. We'll do that by adjusting the code, adding a call to the handy debugging function sv_dump() so we can see what's inside it:

SV *tmp = hv_delete(MY_CXT.global_cb_data, key_name, strlen(key_name), 0);
sv_dump(tmp);

When we run this we get a nice detailed snapshot of what was inside the HV at the time it was cleaned up. The full output is quite long, but here is a cut-down version of the most relevant detail:

SV = IV(0x55c769db3358) at 0x55c769db3368
  FLAGS = (TEMP,ROK)
  RV = 0x55c769db2f78
    SV = PVHV(0x55c769dd9280) at 0x55c769db2f78
      KEYS = 1
        Elt "tlsext_status_cb!!func" HASH = 0x567bfc0c

We have a reference to a hash containing just one key (KEYS = 1) - the tlsext_status_cb!!func entry - but the related !!data key, the leaking one, is missing. This seems unusual - the code path we encountered earlier would always unconditionally insert them both. Either the data key went missing at some point, or it never got inserted in the first place.

Perhaps it would help to add some further logging to the cb_data_advanced_put function, having it log what the HV looks like after it's added the key, so we can see if it was never added or goes missing later on. We'll add this to the end of the function:

fprintf(stderr, "After cb_data_advanced_put key=<%s>:\n", data_name);
sv_dump(sv_2mortal(newRV_inc((SV *)L2HV)));

(The added complication of calling sv_2mortal on newRV_inc is simply to get around an odd quirk of sv_dump(): it will dump the contents of a hash given by an indirect RV reference, but not by an HV pointer directly. We have to make and dump an RV referencing it, just so we can see those keys.)

Running this, we find some more logs from its behaviour:

After cb_data_advanced_put key=<tlsext_status_cb!!func>:
SV = IV(0x55932b203ec0) at 0x55932b203ed0
  FLAGS = (TEMP,ROK)
  RV = 0x55932b1db110
    SV = PVHV(0x55932b20c4d0) at 0x55932b1db110
      KEYS = 2
        Elt "ssleay_verify_callback!!func" HASH = 0x4101f92c
        Elt "tlsext_status_cb!!func" HASH = 0x462de35f

After cb_data_advanced_put key=<tlsext_status_cb!!data>:
SV = IV(0x55932b204130) at 0x55932b204140
  FLAGS = (TEMP,ROK)
  RV = 0x55932b1db110
    SV = PVHV(0x55932b20c4d0) at 0x55932b1db110
      KEYS = 2
        Elt "ssleay_verify_callback!!func" HASH = 0x4101f92c
        Elt "tlsext_status_cb!!func" HASH = 0x462de35f

Well this is odd. After adding the func key, it was indeed there (along with another relating to the verify callback, which we can ignore for our purposes), leaving it with KEYS = 2. But after attempting to put the data key in, the hash still has only those two and no further key was added. It looks like the function never attempted to add the key to the HV.

To understand why, a further look at a small section of the code in cb_data_advanced_put() might help here:

if (data!=NULL)
    if (SvOK(data))
        hv_store(L2HV, data_name, strlen(data_name), data, 0);

The hv_store() function is only called if data points to a real SV and that SV contains some defined value (i.e. SvOK() is true). But recalling the leaking SV candidates we found in part 1 - they were all UNDEF. It seems perhaps we've found the reason. If the data for the tlsext_status_cb is an undefined scalar value, a new SV is allocated for it (by calling newSVsv()) which is never stored in the HV, and thus is never reclaimed when the HV itself is released. Instead, the SV just sits around in memory, unreferenced by anything else, with nothing to clean it up afterwards.

If this is the case, then we can easily fix it by just throwing away the SV immediately if it is undefined. We can change the code to:

if (data!=NULL) {
    if (SvOK(data))
        hv_store(L2HV, data_name, strlen(data_name), data, 0);
    else
        SvREFCNT_dec(data);
}
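The rule underlying both the bug and the fix is perl's ownership convention: hv_store() steals the caller's reference to the SV, so a caller that decides not to store it must drop that reference itself. A toy refcount model (pure Python, purely illustrative - not the real perl API) makes the difference visible:

```python
class SV:
    def __init__(self):
        self.refcnt = 1           # newSVsv() hands back exactly one reference

def put_leaky(store, name, sv, defined):
    if defined:
        store[name] = sv          # hv_store() steals the reference
    # else: the reference is silently dropped on the floor - a leak

def put_fixed(store, name, sv, defined):
    if defined:
        store[name] = sv
    else:
        sv.refcnt -= 1            # SvREFCNT_dec(): release the unwanted SV

store = {}
leaked = SV()
put_leaky(store, "tlsext_status_cb!!data", leaked, defined=False)
freed = SV()
put_fixed(store, "tlsext_status_cb!!data", freed, defined=False)

assert leaked.refcnt == 1         # nothing will ever free this SV
assert freed.refcnt == 0          # reclaimed immediately
```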

Applying this to the code, rebuilding, and testing again shows this does seem to fix this bug, as now we no longer get a constantly-increasing SV count:

SVs 49656 (+49656)
SVs 56515 (+6859)
SVs 56509 (-6)
SVs 56509 (+0)
SVs 56509 (+0)
SVs 56508 (-1)
SVs 56508 (+0)

All that remains now is to send it upstream in the form of a bug report - which I have done in RT#127131.

2018/09/24

Devel::MAT investigation into C - part 2

In the previous part we discovered and started investigating a memory leak bug in a program, which got as far as needing a custom-built perl binary to look further into where the leak comes from. We built that with perlbrew and some specific compiler flags.

This new memory-logging feature is controlled by setting an environment variable (PERL_MEM_LOG) to specify the file descriptor to write the log to, and some flags to say what kinds of events will cause logs to be written. This log won't be immediately helpful to us on its own, but when combined with the analysis from Devel::MAT will let us see the history of each individual SV, and specifically the ones that pmat-leakreport tells us are our leak candidates.

To see the logging in action, first let's check that it is working on a test run of perl -V. We'll ask it to output information about SV allocations and deletions to file descriptor 3 (so as not to disturb the process's normal output on the standard output or error streams), and send that to a file:

$ PERL_MEM_LOG=3s debugperl5.26.2 -V 3>perlmem.log
Summary of my perl5 (revision 5 version 26 subversion 2) configuration:
   
...

$ wc -l perlmem.log
8170 perlmem.log

$ head -5 perlmem.log 
new_SV: sv.c:9377:Perl_newSVpvn: 564a933f1f78
new_SV: sv.c:9678:Perl_newSV_type: 564a933f1f90
new_SV: sv.c:9678:Perl_newSV_type: 564a933f1fa8
new_SV: sv.c:9678:Perl_newSV_type: 564a933f1fc0
new_SV: sv.c:9377:Perl_newSVpvn: 564a933f1fd8

$ tail -5 perlmem.log 
del_SV: sv.c:7075:Perl_sv_free2: 564a93407e90
del_SV: sv.c:7075:Perl_sv_free2: 564a93407d88
del_SV: sv.c:7075:Perl_sv_free2: 564a93407de8
del_SV: sv.c:6805:Perl_sv_clear: 564a934077e8
del_SV: sv.c:7075:Perl_sv_free2: 564a933f2038

Here we've got a nice long file, with one line per SV creation or deletion. We can now run our leaky process again, this time with the memory log added as well. This time, when pmat-leakreport gives us some candidates we'll be able to see in more detail where they came from, at the C level:

$ PERL_MEM_LOG=3s debugperl5.26.2 ... 3>perlmem.log
...

$ pmat-leakreport -1 perl-{4,5,6,7,8}.pmat
LEAK[1] UNDEF() at 0x564e10091f80
LEAK[1] UNDEF() at 0x564e107ae740
LEAK[1] UNDEF() at 0x564e10794008

$ grep 564e10091f80 perlmem.log
...
new_SV: sv.c:9248:Perl_sv_newmortal: 564e10091f80
del_SV: sv.c:7075:Perl_sv_free2: 564e10091f80
new_SV: sv.c:9678:Perl_newSV_type: 564e10091f80
del_SV: sv.c:6805:Perl_sv_clear: 564e10091f80
new_SV: sv.c:9753:Perl_newSVsv: 564e10091f80

Well that's partly useful - we can see that the final memory operation at that address was indeed an allocation, with some previous history there which we can ignore. The program did allocate an SV at that address that wasn't further touched until the end of the program. The problem is, this report itself isn't all that helpful - all we learn from it is that the function that allocated this SV was Perl_newSVsv. That's not saying much - about half of the SVs ever allocated by a program will be through that function. If we want to dig deeper than this, we'll have to find a way to get the C-level stack trace at the time that allocation was made. To do that we have no alternative but to go editing the perl core source code and rebuilding it once more.
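Incidentally, the pairing-up we just did by eye - checking that an address's final memory-log event is an allocation - is easy to automate. A short script (a sketch; it assumes only the new_SV/del_SV line format shown above) reports the allocation site of every address that was never subsequently freed:

```python
import re

LOG = """\
new_SV: sv.c:9248:Perl_sv_newmortal: 564e10091f80
del_SV: sv.c:7075:Perl_sv_free2: 564e10091f80
new_SV: sv.c:9678:Perl_newSV_type: 564e10091f80
del_SV: sv.c:6805:Perl_sv_clear: 564e10091f80
new_SV: sv.c:9753:Perl_newSVsv: 564e10091f80
new_SV: sv.c:9377:Perl_newSVpvn: 564e10091f90
del_SV: sv.c:7075:Perl_sv_free2: 564e10091f90
"""

def leak_candidates(log):
    # remember only the most recent event per address...
    last = {}
    for line in log.splitlines():
        m = re.match(r'(new_SV|del_SV): (\S+): ([0-9a-f]+)', line)
        if m:
            last[m.group(3)] = (m.group(1), m.group(2))
    # ...an address whose final event is an allocation was never freed
    return {addr: site for addr, (ev, site) in last.items() if ev == 'new_SV'}

print(leak_candidates(LOG))
# → {'564e10091f80': 'sv.c:9753:Perl_newSVsv'}
```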

We'll abuse the build directory left behind by perlbrew, as it provides a convenient place where the source is already unpacked and configured ready to build. Let's go find it, and edit the source code to add a little extra tracing.

We start by configuring the source tree to include the C backtrace feature, and then apply a patch to the util.c file to have it call the backtrace function (the source of the patch is attached at the end of this blog post):

$ rm config.sh
$ ./Configure -DDEBUGGING -Accflags=-DPERL_MEM_LOG -Dusecbacktrace
$ patch -p1 < print-c-backtrace-on-memlog.patch
$ make

Now when we run this and pass the new c flag as part of PERL_MEM_LOG, we'll get 3 layers of C stack trace along with each line in the memory log. This will tell us what C code called Perl_newSVsv to create our leaking scalars:

$ PERL_MEM_LOG=3sc debugperl5.26.2 ... 3>perlmem.log
...

$ pmat-leakreport -1 perl-{4,5,6,7,8}.pmat
LEAK[1] UNDEF() at 0x55f72bd4b4f0
LEAK[1] UNDEF() at 0x55f72c015088
LEAK[1] UNDEF() at 0x55f72bd57238

$ grep -A3 55f72bd4b4f0 perlmem.log
...
new_SV: sv.c:9753:Perl_newSVsv: 55f72bd4b4f0
  frame[0]: 0x55f72a04d4c3 Perl_newSVsv at /home/leo/perl5/perlbrew/build/perl-5.26.2/perl-5.26.2/perl +0x16d4c3
  frame[1]: 0x7f4644e5af73 - at /home/leo/perl5/perlbrew/perls/perl-5.26.2/lib/site_perl/5.26.2/x86_64-linux/auto/Net/SSLeay/SSLeay.so +0x78f73
  frame[2]: 0x55f72a019082 Perl_pp_entersub at /home/leo/perl5/perlbrew/build/perl-5.26.2/perl-5.26.2/perl +0x139082

Now we are definitely getting somewhere. We know the call to Perl_newSVsv was made from Net::SSLeay, and we have the path to the object file, and the offset address within it. The helpful addr2line utility can convert that into something more meaningful:

$ addr2line -fe /home/leo/perl5/perlbrew/perls/perl-5.26.2/lib/site_perl/5.26.2/x86_64-linux/auto/Net/SSLeay/SSLeay.so 0x78f73
XS_Net__SSLeay_CTX_set_tlsext_status_cb
/home/leo/.cpan/build/Net-SSLeay-1.85-1/SSLeay.xs:6453

The leaked scalars were created within the XS function behind the Net::SSLeay::CTX::set_tlsext_status_cb function, and we even get a source file and line number. We can go and inspect the source code:

  6450              SSL_CTX_set_tlsext_status_cb(ctx, NULL);
  6451          } else if (SvROK(callback) && (SvTYPE(SvRV(callback)) == SVt_PVCV)) {
  6452              cb_data_advanced_put(ctx, "tlsext_status_cb!!func", newSVsv(callback));
  6453              cb_data_advanced_put(ctx, "tlsext_status_cb!!data", newSVsv(data));
  6454              SSL_CTX_set_tlsext_status_cb(ctx, tlsext_status_cb_invoke);
  6455          } else {
  6456              croak("argument must be code reference");

Sure enough, there is indeed our call to newSVsv().

So now we have a definite location in the code where we can see we are creating new SVs that don't get cleaned up again. This is a start, but is only the first step of fixing the bug. We also need to find out why it isn't being released again afterwards - for that we'll continue in part 3...


The patch to util.c to create this logging is:

2018/09/20

Devel::MAT investigation into C - part 1

I have previously written about using Devel::MAT to find the cause of memory leaks in Perl programs. While it can find the reasons for memory growth in a great number of cases, there are times when it cannot alone tell the whole story, and we must combine other tools to finish the investigation. Here we'll look at one particular case that - like so many others - begins with the observation that a long-running Perl process slowly consumes ever-more memory from the host operating system, the longer it runs. We'd like to find out why, and fix it.

The process under investigation this time is a long-running script that regularly polls a number of sites over HTTP (well, HTTPS in fact) to collect and aggregate some data from them. The rate of memory leak is observed to correlate with how often it performs this polling, which suggests that the leak is caused by something done on every iteration of the program, losing the same amount each time. The regular iteration of the program provides us a convenient point to attach some memory debugging to begin our investigation.

A slow increase of memory could be caused by a Perl-visible accumulation of SVs, or could be something only visible at the C layer. Before we start investigating with the heavy guns of Devel::MAT, we might do well to try to classify what's going on with something more lightweight. In this case, Devel::Gladiator provides us a simple way to count the total number of SVs on the Perl heap, to see if it's increasing. We keep a count of the previous total, so we can print the difference per iteration as well:

use Devel::Gladiator qw( arena_ref_counts );
use List::Util qw( sum );

my $lasttotal = 0;

...

    # per iteration:
    my $total = sum values %{ arena_ref_counts() };
    printf "SVs %d (%+d)\n", $total, $total - $lasttotal;
    $lasttotal = $total;

We can now run this and get an initial idea of the growth of the Perl heap:

SVs 51940 (+51940)
SVs 60171 (+8231)
SVs 59655 (-516)
SVs 59652 (-3)
SVs 59655 (+3)
SVs 59657 (+2)
SVs 59660 (+3)
SVs 59663 (+3)
SVs 59665 (+2)
SVs 59668 (+3)

After a few iterations of initial startup and settling down, this program is clearly leaking actual Perl-visible SVs at around 2 or 3 per iteration, so it makes sense to get Devel::MAT involved. We can easily insert that into the iteration loop:

use Devel::MAT::Dumper;

my $iter = 0;
...

    # per iteration:
    $iter++;
    Devel::MAT::Dumper::dump( "perl-$iter.pmat" );

Now when we run this we get a sequence of numbered .pmat files, one per iteration, that we can throw into pmat-leakreport to look for leaked scalars. Because the first few files will contain a lot of noise from program startup, we'll skip those and take from, in this case, the 4th onwards:

$ pmat-leakreport -1 perl-{4,5,6,7,8}.pmat
LEAK[1] UNDEF() at 0x55efee189120
LEAK[1] UNDEF() at 0x55efee1ae1f8
LEAK[1] UNDEF() at 0x55efeda8c5e8
LEAK[1] UNDEF() at 0x55efee1acd30

This gives us four exciting candidates we can take a closer look at using the pmat shell:

$ pmat perl-8.pmat 
Perl memory dumpfile from perl 5.26.2
Heap contains 55890 objects
pmat> identify 0x55efee189120
UNDEF() at 0x55efee189120 is:
└─not found

pmat> identify 0x55efee1ae1f8
UNDEF() at 0x55efee1ae1f8 is:
└─not found

pmat> identify 0x55efeda8c5e8
UNDEF() at 0x55efeda8c5e8 is:
└─not found

pmat> identify 0x55efee1acd30
UNDEF() at 0x55efee1acd30 is:
└─not found

Well that's not very helpful. We have four scalars that contain no defined values, which aren't referenced from anywhere that Devel::MAT can see. In order to continue investigating this, we need to dig deeper with more tools. Perl itself provides an optional ability to write a more detailed log of memory allocation operations, controlled by a variable called PERL_MEM_LOG. According to perlhacktips(1):

PERL_MEM_LOG

If compiled with -DPERL_MEM_LOG (-Accflags=-DPERL_MEM_LOG), both memory and SV allocations go through logging functions, which is handy for breakpoint setting.

Now, we can't just set that environment variable on any normal perl binary, because the logging functions are sufficiently expensive at runtime that they are usually disabled on normal builds. We have to request them specifically with the use of some compile-time flags. To achieve this, it's easiest to use perlbrew to compile a custom binary for our purposes:

$ perlbrew install perl-5.26.2 --debug -Accflags=-DPERL_MEM_LOG

That'll take a while to build, but when it's ready we'll continue our investigation in part 2...


For another memory-leak investigation involving pmat-leakreport, see also the Binary.com tech blog post Differential Memory Leak Analysis with Devel::MAT.

2018/09/03

Unicode 9 emoji support in pangoterm/libvterm

A small update to libvterm now means that it understands the width of all the characters in Unicode 9. Combined with a suitable font (such as, in this case, the fonts-noto-color-emoji debian package), this now means all the Unicode emoji characters can be nicely displayed by pangoterm.
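(The heart of the problem is deciding how many terminal cells a character occupies. As a rough illustration - not libvterm's actual tables - Python's unicodedata exposes the East Asian Width property that such decisions are based on; it was the Unicode 9 data that made the emoji come out as Wide:)

```python
import unicodedata

def cell_width(ch):
    # Wide and Fullwidth characters occupy two terminal cells
    return 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1

assert cell_width('A') == 1        # ordinary Latin letter
assert cell_width('\u56fd') == 2   # CJK ideograph, East Asian Wide
# on a Python built with Unicode 9+ data, emoji are Wide too:
#   cell_width('\U0001F600') == 2
```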

2018/07/17

Devel::MAT::Dumper independent distribution

I've just released a new version of Devel-MAT, now with the heap dumping part split out into its own distribution, Devel-MAT-Dumper.

There are two reasons for this. Firstly, by being split into its own distribution, the heap-dumper part is smaller and lighter to install on places like production servers as it needs no non-core dependencies. Since the heapdump files still need analysis somewhere, you'll still have to install the full Devel-MAT distribution on a machine - such as a workstation or laptop - but you no longer need the analysis tools on every target machine.

Secondly, since the two parts are now in independent distributions, they can have different Perl version requirements. I want to keep the ability to analyse Perl programs running on versions from 5.10 onwards (and maybe sometime see if I can push that back to 5.8), but now the distributions are split I have begun to make use of 5.14 features in some of the analysis tools. This means that you can still analyse programs running on 5.10, but you will need an installation of 5.14 or later to run the tooling.

Looking a little further afield, I plan to continue writing more tooling, perhaps published as independent distributions too. It's still the case that almost every time I run into some situation that needs memory analysis like this, I end up writing more tools to help, so the flow of additional tooling will continue for a while yet.


I'm also keen to hear success stories from others (or even failure ones - what couldn't you work out, and what might I do to help?) to help guide the creation of more tooling. I'm also available to consult to help you with your own problems - send me a note to

leonerd [at] leonerd [dot] org [dot] uk

and we can discuss your requirements.

2018/01/20

Async/await in Perl - control flow for asynchrony

I've decided that my Future::AsyncAwait CPAN module is sufficiently non-alpha that I've started migrating a few of my less critical modules to using it. I thought I'd pick a few of the Device::Chip drivers for various kinds of chip, because they're unlikely to be particularly involved in anyone's real deployment code - really I only wrote those to test out some ideas on the chips before writing microcontroller code in C for them. These seemed like good candidates to begin with.

Here's an example of a function in the previous version, using Futures directly. The code has lots of syntactical noise: some ->then chaining, and a Future::Utils::repeat loop that doesn't look like a regular foreach loop. You can just about read what's going on, but it's clear there's a lot of machinery noise getting in the way of really understanding the code.

By rewriting all the logic using await expressions inside an async sub, we arrive at a version that much more closely resembles the sort of thing you'd write in straight-line synchronous code. When reading it you can simply skim over the awaits and read it like synchronous code.
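(For a flavour of what this reads like in another language, here is the equivalent shape in Python's asyncio - a made-up example, not the Device::Chip code - where each await suspends until the result is ready but the code still reads top-to-bottom:)

```python
import asyncio

async def read_register(addr):
    await asyncio.sleep(0)        # pretend this is an asynchronous bus transfer
    return addr * 2

async def read_all(addrs):
    results = []
    for addr in addrs:            # a regular foreach-style loop...
        results.append(await read_register(addr))
    return results

print(asyncio.run(read_all([1, 2, 3])))
# → [2, 4, 6]
```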

A question you might begin to ask at this point is why I'd choose to implement this particular set of syntax and semantics, of the various possibilities for how to manage asynchronous control flow. Aside from its general neatness and applicability to Futures (which I've already worked with at length), there's one key reason: the async/await syntax here is the exact same thing as implemented in Python 3, ES6, C# 5 and Dart; even Rust is currently considering adopting it. Yes, it's nice to have a good concurrency model built into the language, but it's considerably stronger if it's the same as the consensus among a variety of other languages too.

Some language references for them:

Python Tasks and coroutines
JavaScript async function
C# Asynchronous Programming
Dart Dart Language Asynchrony Support

If the four quite semantically-different languages of Python, JavaScript, C# and Dart can all come to the same idea, then maybe it has merit. I honestly think that given a few years, async/await could become as ubiquitous as if or while loops, to the level of "well obviously our language has that". This is why I wanted to steal it into Perl. In ten years time it might look as foolish for a language not to have an async/await construct, as it does today for it not to have a try/catch or a switch.

Ah... more on that subject another day perhaps ;)

2017/11/16

Developing against $HOME/lib libraries and LD_LIBRARY_PATH

I've come up against an awkward situation using libtool, and I wonder if I could ask for some advice on what I'm probably doing wrong here.

I'm developing two different C libraries; let's just call them A and B. A is fairly standalone, and B depends on A.

Using libtool, I can develop on A fine. In particular, it has some internal unit-tests which I run using libtool --mode=exec. These run OK.

Using libtool, I can develop on B just fine, provided the A library is installed as a real system package. The linker can find it by the normal mechanisms and it lives in /usr/lib along with other system things. All is good. Library B also has some unit-tests that are executed with libtool --mode=exec. So far so good.

But now suppose I have a possible change I want to make that needs edits in both libraries. I don't want to build a new system package yet and upset all the other users on this box. That's fine, I can just install library A into my $HOME directory using

$ make install PREFIX=$HOME

In order to be able to test programs that use library A, I'm going to need to set LD_LIBRARY_PATH now so they can find the library here in preference to the system-installed one, so in my .profile I use

export LD_LIBRARY_PATH=$HOME/lib

But now, that upsets every use of libtool --mode=exec to run any of the unit tests for my libraries in their source trees (for both libraries A and B). With LD_LIBRARY_PATH set in the environment, the libtool wrapper script no longer sets it up at all, meaning that libtool --mode=exec on a unit test just sees the system-packaged versions of each library. This means I can't test the new code I just wrote in library A, because the unit-test executables don't see it.

I could make unit tests for library A itself run fine by entirely defeating the LD_LIBRARY_PATH mechanism again, if I

$ LD_LIBRARY_PATH= make test

But what of library B? It has to be able to see its own locally-built library in $CWD/.libs and library A in $HOME/lib. So I could

$ LD_LIBRARY_PATH=`pwd`/.libs:$LD_LIBRARY_PATH make test

But at that point I really feel like I am fighting against libtool, rather than it helping me do this right.

So, what am I doing wrong here?

Alternatively, am I using this entire setup in its intended way, but libtool just isn't playing ball with me?