2014/09/24

Perl module idea for making bitfields

Background

I've been doing a lot of hardware IO work lately, which involves lots of talking to hardware devices, where there are byte-wide registers that store little bitfields, sometimes of individual and unrelated bits packed together. To make this easier I tend to write myself lots of pairs of functions to pack/unpack the fields in one of these bytes; for instance:

use constant {
      MASK_RX_RD      => 1<<6,
      MASK_TX_DS      => 1<<5,
      MASK_MAX_RT     => 1<<4,
      EN_CRC          => 1<<3,
      CRCO            => 1<<2,
      PWR_UP          => 1<<1,
      PRIM_RX         => 1<<0,
};

sub unpack_CONFIG
{
   my ( $config ) = @_;
   return
      MASK_RX_RD  => !!( $config & MASK_RX_RD ),
      MASK_TX_DS  => !!( $config & MASK_TX_DS ),
      MASK_MAX_RT => !!( $config & MASK_MAX_RT ),
      EN_CRC      => !!( $config & EN_CRC ),
      CRCO        => $CRCOs[!!( $config & CRCO )],
      PWR_UP      => !!( $config & PWR_UP ),
      PRIM_RX     => !!( $config & PRIM_RX );
}

sub pack_CONFIG
{
   my %config = @_;
   return
      ( $config{MASK_RX_RD}  ? MASK_RX_RD  : 0 ) |
      ( $config{MASK_TX_DS}  ? MASK_TX_DS  : 0 ) |
      ( $config{MASK_MAX_RT} ? MASK_MAX_RT : 0 ) |
      ( $config{EN_CRC}      ? EN_CRC      : 0 ) |
      ( ( _idx_of $config{CRCO}, @CRCOs )
         // croak "Unsupported 'CRCO'" ) * CRCO |
      ( $config{PWR_UP}      ? PWR_UP      : 0 ) |
      ( $config{PRIM_RX}     ? PRIM_RX     : 0 ) );
}

This convenient pair of functions bidirectionally converts bytes into key/value lists. As well as containing 6 simple boolean values, there's also an enumerated field, represented by the values in the array @CRCOs. These functions convert those too.

I'm getting tired of writing these pairs of functions. Hey, this is Perl right? :) I should get Perl to write them for me.

TL;DR - I propose the following

I'd like instead to write something like:

use bitfield;

bitfield CONFIG =>
   MASK_RX_DS  => boolfield(6),
   MASK_TX_DS  => boolfield(5),
   MASK_MAX_RT => boolfield(4),
   EN_CRC      => boolfield(3),
   CRCO        => enumfield(2, qw( values here )),
   PWR_UP      => boolfield(1),
   PRIM_RX     => boolfield(0);

The operation here is that 'bitfield' is automatically exporting a function called bitfield, along with the various *field() functions that create field definitions. boolfield() declares a single true/false boolean bit position, enumfield() declares a range of bits that give an enumeration. I guess also would be required intfield() to use a range of bits as an integer, and finally most likely customfield() taking a pair of conversion CODE refs or somesuch.

What does anyone think to that?

2014/09/07

Released: Tickit 0.47

New up in Tickit 0.47, some fairly small and incremental updates:

  • Support the 'blink' terminal attribute

    Both in libtickit C library and Tickit perl module now support the blink attribute, much to my hesitation. ;)

    I'm not sure I want to encourage this sort of thing, but the Neovim project said they wanted this, so I've reluctantly added support for it all the same.

  • Bugfix for renderbuffer 'get*' methods

    When offset and clipping are applied, previously the get* methods didn't pay attention to this, fetching content relative to the toplevel, or segfaulting if requested out of bounds. This has now been fixed.

  • Tickit::Widget::HBox and ::VBox have now been moved to the Tickit-Widgets distribution.

    This dist is now linked to explicitly from the documentation. This supports the longterm goal of turning 'Tickit' into purely the window-layer downwards, and having all the widget support live in its own distribution, backed eventually by its own C library.

  • Nicer handling of fallback terminfo attributes for definitions missing them.

    Certain distributions of terminfo databases seem to be lacking certain essential attributes for some termtypes. Such examples as the upstream 'screen' terminfo is lacking erase_chars. libtickit now includes some fallback "likely to work" strings to handle these cases. This turns it from an instant failure on startup, to an at-worst wrong output, but hopefully most terminals should understand these standard strings. At least, if they don't I suspect libtickit is far from the only place that's broken, if they don't supply a more correct string in their terminfo.

2014/07/27

Event Reflexivity - A Design Pattern Pattern?

The previous series of posts on the topic of Event Reflexivity each posed a question about what the general shape of the design pattern actually is. Over the following months after the posts I thought about the pattern a lot more and eventually came to the conclusion that a lot of the questions aren't a simple matter of choosing what's correct - the reason these questions are hard to answer is that both options could be correct. What I have here is in fact not a single unclear design pattern, but instead a whole family of possible choices on a design pattern, with different specifics being more useful to specific cases.

The specific design choices that a particular implementation takes should answer such questions as:

  • Ordering control: Are subscribers of some named action invoked in a controllable fashion?
  • Invocation functions: What kinds of actions or ways of invoking them are actually provided?
  • Heirarchial actions: Do the types of actions occupy a simple flat namespace, or is there some heirarchial structure to them? Can a subscriber catch an entire subspace of actions at once?
  • Explicit or implicit subscription: Do subscribers have to explicitly list every action (or action subspace) they wish to receive?
  • Filterable arguments: Does the implementation offer a built-in way to filter specific values of arguments by some pre-declared pattern?

Perhaps the only real design question for the Event-Reflexive Design Pattern Pattern is to decide on some neat concise language that implementations can use to explain their specific choices on these issues.

2014/06/06

List::Util additions in Perl 5.20

For a while now, I have taken over maintaining List::Util and Scalar::Util, the utility modules that ship with core Perl. After a while of getting used to what's where, I've actually now started adding things to it again; mostly by surveying what's commonly used from some other utility modules, and bringing them in so they can be nicely implemented in XS for efficiency. Now that Perl 5.20 is out, all of these latest updates now ship with core perl.

From List::MoreUtils, I have taken the four shortcutting reduce-like boolean test functions of all, any, none and notall. These are all similar to grep, in that they take a block that evaluates some predicate test on each element of the list. Where they differ from grep, is that grep will count the total number of items in the list that returned true, whereas these four functions will simply indicate what the overall result was; allowing them to short-circuit as soon as the result is determined.

use List::Util 1.33 qw( any );

if( any { $_ == 0 } @numbers ) {
  say "The list of numbers includes zero";
}

As this module ships as part of the Perl core, it can reliably make use of the C compiler to build it, so most of the functions it contains are implemented in efficient XS code. Specifically these four also use an optimisation technique called MULTICALL, which improves the efficiency of functions of this form, where a given small block of code is repeatedly executed many times, with $_ set differently every time.

Another set of functions copied from elsewhere are the pair* functions taken and extended from List::Pairwise. These are all functions that interpret their list as an even-sized list of pairs, executing the code block with $a and $b set to the first and second value of each pair. This could be used to operate on regular perl hashes (by assigning keys to $a and their associated values to $b), though there is no requirement that it really be a hash. The functions will preserve the order of the pairs, and won't get upset if the "keys" are not plain strings, or not unique. As the result is also returned in a list of pairs, it could be assigned into a hash, or used elsewhere.

use List::Util 1.33 qw( pairgrep pairmap );

# Take a subset of a hash whose keys are ALLCAPS
my %capitals = pairgrep { $a =~ m/^[A-Z]+$/ } %hash

# Rename keys in a hash
my %renamed = pairmap { ($a =~ s/^foo_/bar_/r), $b } %hash

(This latter example also makes use of the perl 5.14 s///r flag, to return the result of the substitution instead of editing in place.)

use List::Util 1.33 qw( pairs );

foreach ( pairs %hash ) {
   # $_ will be a 2-element ARRAY ref
   say "$_->[0] has value $_->[1]";
}

As of version 1.39 (so a little too late to make it into perl 5.20.0, but still available on CPAN), pairs returns blessed array references that respond to methods called key and value (inspired by DCONWAY's Var::Pairs), as well as being accessible by array indexing.

use List::Util 1.39 qw( pairs );

foreach ( pairs %hash ) {
   say $_->key, " has value ", $_->value;
}

I have many more ideas for functions that could be added, though some care will need to be taken not to invent experimental ideas; but instead to take inspiration of tried-and-tested from CPAN, as all these have done, to bring into core and standardise existing ideas.

One other thing I have my sights set on is to implement further-optimised versions of at least some of the functions in Scalar::Util and List::Util as custom ops on perl 5.16 onwards. This will give them an even further performance boost, as they won't even be regular XS functions any more, so will completely remove the expensive call-time overhead of the ENTERSUB/LEAVESUB pair.


I should also add, that for a while now I've been a self-employed IT contractor, which has given me a lot more free time to be able to write such things as named above. If anyone is interested in supporting or sponsoring similar work on Perl, contact me by email. I'd be happy to give most reasonable Perl jobs a consideration. For that matter, I also work in C, or other languages, and I've even been known to build small-scale electronics projects.

2014/03/20

Event-Reflexivity in Static Languages

<< First | < Prev

In the previous article, we looked at the idea of explicit registration of handlers for events in an event-reflexive system, and touched on the idea that it may be more useful (or in fact, required) when dealing with a static language like C, rather than a dynamic langauge like Perl. Today's story will look at the different requirements for static languages in more detail.

In the examples in previous posts we have been able to use dynamic language features to easily implemented named actions by simply creating functions of the right name, and dispatch to them simply by passing that string name to the central dispatch functions.

# In a handler module
sub add_user
{ ... }

# In dispatching code
run_events "add_user", @args;

This became especially useful when creating dynamic action names in the IRC cases

# In a handler module
sub on_message_JOIN
{ ... }

sub on_message
{ ... }

# In dispatching code
run_events [ "on_message", "JOIN" ], @args;

To be clear here, we have used the following abilities of Perl (though similar should apply to most dynamic languages) in order to easily implement this system:

  • The ability to invoke a function in a module based on a dynamic string at runtime
  • The ability to pass a variable list of arguments to a function as a simple list, without the intermediate dispatcher function having to understand them
  • The ability to return any kind of value or list of values from a function

In a static language, this simply isn't going to work. We'll need something much stronger to bind all these pieces together. We'll need a way to more strongly identify the named actions as hook points, some way to pass the arguments around between them, and some way to interpret the return values for possible methods of combination or short-circuit return

The first idea to handle this is simply to number the events in some sort of globally-defined enumeration. But this of course creates a single global numbering, and half of the point of event-reflexivity in the first place was to avoid this kind of centralisation - a central numbering would mean that plugin modules couldn't themselves create and emit new events. They can't just invent new numbers because they might collide with existing ones.

Perhaps what is required here is that the event-reflexivity core can allocate contiguous blocks of numbered events, and allow some kind of association between event numbers and friendly string names for the convenience of programmers and users. When a new module wishes to allocate some events, it can request a block from the core, and be given its starting number. While that number would be dynamically allocated between different instances of the system, or even different runs on the same machine, it would at least be constant throughout one program run, allowing other modules to bind or invoke them. The friendly naming system would then exist to allow programs to look up the current number for a known event name, to bind or invoke it.

typedef ERcore_event_id int;

ERcore_event_id ercore_allocate_events(int n_events,
                                       const char *evname[]);
ERcore_event_id ercore_lookup_event(const char *name);
const char *ercore_event_name(ERcore_event_id id);

Having a way to create the identity of named events, we next need a way to register a function to actually handle them. This is where our second problem arises - the problem of how to pass event arguments. Perhaps the simplest is simply to allow a single void * pointer argument, on the basis that the named event would document somewhere what this was supposed to point at - likely a structure of some kind. Because C lacks the ability to create true closures, it may be necessary to pass a second pointer argument at binding time, and passing both that and the event argument to the invoked functions at dispatch time.

void ercore_bind_event(ERcore_event_id id,
                            void (*fn)(ERcore_event_id id,
                                       void *args,
                                       void *data),
                            void *data);
void ercore_run_event(ERcore_event_id,
                      void *args);

Here we've only defined the simplest of the invocation functions, run_events, because any of the others would require some consideration of the return value as well. Here we start to run into problems of needing to know how to interpret the meanings of these values. This means we can't implement any of the more interesting invocation methods as seen in the second post (Kinds of Invocation in Event-Reflexive Programming).

Our handling of arguments isn't very satisfactory, forcing invoking code to always pack their arguments into a structure, and all the handling code to unpack them from it again. We've also made our arguments totally opaque to the actual dispatch system, meaning we can't do any of the interesting tricks we could do in dynamic languages where these are visible (such as those seen in the third post, Instrumentation and Logging in Event-Reflexive Programming).

Finally, by making the event identity a simple ID number and having opaque argument structures that the event dispatch core cannot inspect, we have lost our ability to perform dynamic dispatch based on some of the arguments, as seen in the fourth post (Hierarchies of Actions in Event-Reflexive Programming).

We have seen in previous posts that all of these abilities are useful things to have, in combination they define the essential nature of what Event-Reflexivity is really all about. It would be nice if we could have these things in static languages as well as dynamic ones.

This then leads to the final, and most expansive question of this series of posts:

How do you create a powerful system of event-reflexivity in a static language such as C? How do you cope with combining return values and short-circuit evaluation in those dispatch modes that require them? How do you pass different kinds of arguments to invoked functions in a useful and simple way? And how do you perform dynamic multiple dispatch on of pieces of the event identity?

2014/03/13

Event Registration in Event-Reflexive Programming

<< First | < Prev

Continuing our recent theme of an IRC bot, the next step in the story concerns the suggestion someone once made to me that given the way most of the internals of this bot worked, it shouldn't be too hard to have these events broadcast over some kind of IPC socket or similar, between multiple processes, to allow parts of the bot to be written out-of-process. Indeed, given the right kind of serialisation, there's no reason these extra parts had to even be perl, they could be implemented in a different language.

This idea has come up twice now in two different concepts, so I decided to think about it in some more detail. In principle the idea is sound enough, but as ever the devil comes down to the details. If every event was serialised and broadcast to every listening process, the IPC overheads could get very large, because most of the time most of the processes would ignore it. The simple form of event-reflexivity we have been using up to now has relied heavily on the very cheap (virtually free) cost of introspection within the code of one process, but now we need to find a better way to implement it.

The obvious way to start this is some kind of registration system. When each process connects to the central core, it starts off telling the core which events it is interested in, perhaps in a set of strings, or regexp matches, or something. This is a good first step in cutting down plenty of unwanted noise over the serialisation links, and generally improves things. This filtering doesn't have to be perfect as each connected process can still state it isn't interested in specific events it still manages to receive, but anything we can do on the core side to cut that down will obviously help.

However, further consideration of the specific domain of interest in being an IRC bot starts to suggest we can do something more powerful. Within IRC, it's quite likely that most events of interest to plugged-in processes will concern some specific IRC channel or user. It's also quite likely that at least some plugins may be interested only in events on specific channels or users, or matching only specific text, or some other criteria. If we could get the core event distribution mechanism to filter on these as well, we can further cut down on pointless IPC overheads.

The full implications and decisions of how this might work aren't really related to event-reflexivity, but what is of interest here is that this kind of event registration system doesn't have to be only for out-of-process management. In fact, as soon as we start to consider how event-reflexive programming might be implemented in a static language like C, as compared a dynamic language like Perl, we fairly soon conclude that there must be some kind of registration call, to hook up pieces of code and help in the event dispatch process on some level or another.

This leads us on to in fact two questions this time:

How useful is it to implement event reflexivity using explicit registrations of interest in events?
Does the answer depend on whether the language is static or dynamic? Can explicit registrations provide useful abilities even in dynamic languages, or they just add unnecessary complication

Next >

2014/03/06

Hierarchies of Actions in Event-Reflexive Programming

<< First | < Prev

So far this series we have seen the introduction to event-reflexive programming, and a couple of use-cases it would appear in. This time our story continues in chronological fashion, following the development of various IRC-related systems.

The first attempt at an IRC bot was a large soup of various event-reflexive concepts, and was the experimentation bench for a lot of my first ideas about it. One pattern I found very useful was to include partly-dynamic data in with action names. That is, to use information at runtime to direct the flow of event handling. In particular in IRC, the most obvious one comes from considering the command name in incoming IRC messages.

The simplest implementation of this could be expressed something like the following, presuming our underlying IRC implementation gives us simple objects to represent each message:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;

With this simple mechanism we now have a way for each plugin to react to specifically-named IRC events, without them having to capture all the events and filter for only the ones they care about.

However, it turns out that in a number of places we actually want to capture all the messages (for example, debugging and logging). No great problem here; we can simply make a second call to a generic on_message instead and pass in the command string itself as the first argument:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;
run_plugins "on_message", $command, $message;

A pattern seems to be emerging here. We can extend this further, for example to handle the specific CTCP message verbs in IRC CTCP messages (for now, don't worry if you don't know what CTCP means; just consider that it's a second sub-hierarchy of messages):

sub on_message_PRIVMSG {
  my ($message) = @_;

  if(message is CTCP) {
    my $verb = ...;

    run_plugins "on_message_ctcp_$verb", $message;
    run_plugins "on_message_ctcp", $verb, $message;
    run_plugins "on_message", "ctcp", $verb, $message;
  }
}

However, the mechanism we've built here still seems a little unsatisfactory. Any given plugin could handle more than one of these cases, meaning it would be called multiple times. Maybe it would be better to build it such that we only call the most-specific event handler that each plugin defines. To do that we would have to build that logic right in to the basic definition of run_plugins.

One possible idea would be to pass an array reference containing pieces of event name, which should be joined by underscores (_) until a suitable handling method is found, and the remaining pieces would be passed as the first positional arguments. Thus a call such as:

run_plugins [ "on_message", "ctcp", $verb ], $message;

would invoke handlers similarly to the previous example, except that it will call at-most one action handler per plugin, meaning that specific handlers "override" more generic ones that plugin provides.

The main question of this post is therefore

To what extent should arguments be interpreted as part of the dispatch of action handlers themselves? Should some arguments be allowed to take part in forming the action name itself, to allow a degree of override-like dispatch logic on a per-plugin basis?

Next >