2014/09/07

Released: Tickit 0.47

New in Tickit 0.47, some fairly small and incremental updates:

  • Support the 'blink' terminal attribute

    Both the libtickit C library and the Tickit perl module now support the blink attribute, despite my hesitation. ;)

    I'm not sure I want to encourage this sort of thing, but the Neovim project said they wanted this, so I've reluctantly added support for it all the same.

  • Bugfix for renderbuffer 'get*' methods

    Previously the get* methods ignored any offset and clipping that had been applied, fetching content relative to the toplevel, or segfaulting if asked for a position out of bounds. This has now been fixed.

  • Tickit::Widget::HBox and ::VBox have now been moved to the Tickit-Widgets distribution.

    This dist is now linked to explicitly from the documentation. This supports the long-term goal of turning 'Tickit' into purely the window layer downwards, and having all the widget support live in its own distribution, backed eventually by its own C library.

  • Nicer handling of fallback terminfo attributes for definitions missing them.

    Certain distributions of terminfo databases seem to lack some essential attributes for some termtypes. For example, the upstream 'screen' terminfo lacks erase_chars. libtickit now includes some fallback "likely to work" strings to handle these cases. This turns an instant failure on startup into, at worst, wrong output, and hopefully most terminals will understand these standard strings. If they don't, and don't supply a more correct string in their terminfo either, I suspect libtickit is far from the only place that's broken.

2014/07/27

Event Reflexivity - A Design Pattern Pattern?

The previous series of posts on the topic of Event Reflexivity each posed a question about what the general shape of the design pattern actually is. In the months since those posts I thought about the pattern a lot more, and eventually came to the conclusion that many of the questions aren't a simple matter of choosing what's correct - the reason they are hard to answer is that both options could be correct. What I have here is in fact not a single unclear design pattern, but instead a whole family of possible design patterns, with different specifics being more useful in different cases.

The specific design choices that a particular implementation takes should answer such questions as:

  • Ordering control: Are subscribers of some named action invoked in a controllable fashion?
  • Invocation functions: What kinds of actions or ways of invoking them are actually provided?
  • Hierarchical actions: Do the types of actions occupy a simple flat namespace, or is there some hierarchical structure to them? Can a subscriber catch an entire subspace of actions at once?
  • Explicit or implicit subscription: Do subscribers have to explicitly list every action (or action subspace) they wish to receive?
  • Filterable arguments: Does the implementation offer a built-in way to filter specific values of arguments by some pre-declared pattern?

Perhaps the only real design question for the Event-Reflexive Design Pattern Pattern is to decide on some neat concise language that implementations can use to explain their specific choices on these issues.

2014/06/06

List::Util additions in Perl 5.20

For a while now, I have been maintaining List::Util and Scalar::Util, the utility modules that ship with core Perl. After a while of getting used to what's where, I've now started adding things to them again, mostly by surveying what's commonly used from other utility modules, and bringing those in so they can be nicely implemented in XS for efficiency. Now that Perl 5.20 is out, all of these latest updates ship with core perl.

From List::MoreUtils, I have taken the four short-circuiting reduce-like boolean test functions: all, any, none and notall. These are all similar to grep, in that they take a block that evaluates some predicate test on each element of the list. Where they differ from grep is that grep will count the total number of items in the list for which the block returned true, whereas these four functions simply indicate what the overall result was, allowing them to short-circuit as soon as the result is determined.

use List::Util 1.33 qw( any );

if( any { $_ == 0 } @numbers ) {
  say "The list of numbers includes zero";
}

As this module ships as part of the Perl core, it can reliably make use of the C compiler to build it, so most of the functions it contains are implemented in efficient XS code. Specifically these four also use an optimisation technique called MULTICALL, which improves the efficiency of functions of this form, where a given small block of code is repeatedly executed many times, with $_ set differently every time.
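To make the short-circuit behaviour visible, here is a small sketch that counts how many times each block gets run. The @numbers data and the two counter variables are made up purely for illustration.

```perl
use strict;
use warnings;
use List::Util 1.33 qw( all any none notall );

my @numbers = ( 3, 0, 7, 0, 9 );

# 'any' can stop as soon as it sees the first zero (the second element)
my $any_calls = 0;
my $has_zero = any { $any_calls++; $_ == 0 } @numbers;

# 'grep' in scalar context counts matches, so it must visit every element
my $grep_calls = 0;
my $zero_count = grep { $grep_calls++; $_ == 0 } @numbers;

# The other three follow the same shape
my $all_nonneg  = all    { $_ >= 0 } @numbers;   # every element passes
my $none_neg    = none   { $_ < 0 }  @numbers;   # no element passes
my $notall_zero = notall { $_ == 0 } @numbers;   # at least one element fails

print "any ran its block $any_calls times; grep ran its block $grep_calls times\n";
```

With this data, any only needs to look at the first two elements, whereas grep has to look at all five.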

Another set of functions copied from elsewhere are the pair* functions taken and extended from List::Pairwise. These are all functions that interpret their list as an even-sized list of pairs, executing the code block with $a and $b set to the first and second value of each pair. This could be used to operate on regular perl hashes (by assigning keys to $a and their associated values to $b), though there is no requirement that it really be a hash. The functions will preserve the order of the pairs, and won't get upset if the "keys" are not plain strings, or not unique. As the result is also returned in a list of pairs, it could be assigned into a hash, or used elsewhere.

use List::Util 1.33 qw( pairgrep pairmap );

# Take a subset of a hash whose keys are ALLCAPS
my %capitals = pairgrep { $a =~ m/^[A-Z]+$/ } %hash;

# Rename keys in a hash
my %renamed = pairmap { ($a =~ s/^foo_/bar_/r), $b } %hash;

(This latter example also makes use of the perl 5.14 s///r flag, to return the result of the substitution instead of editing in place.)
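The point about order and non-unique "keys" is perhaps easier to see on a list that could not be stored in a hash at all. The following is a small sketch using a made-up @headers list with a repeated key; pairkeys and pairvalues are two more functions from the same family in List::Util.

```perl
use strict;
use warnings;
use List::Util 1.33 qw( pairgrep pairmap pairkeys pairvalues );

# A flat name/value list with a repeated "key" (illustrative data only)
my @headers = (
   Received => "from mx1",
   Subject  => "hello",
   Received => "from mx2",
);

# Keep only the Received pairs; both survive, in their original order
my @received = pairgrep { $a eq "Received" } @headers;

# Lower-case each key, leaving the values alone
my @lower = pairmap { lc($a), $b } @headers;

# Just the first or second element of every pair
my @keys   = pairkeys   @headers;
my @values = pairvalues @headers;

print "Keys in order: @keys\n";
```

Assigning @received into a hash would of course lose one of the two pairs, which is exactly why it is kept as a list here.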

use List::Util 1.33 qw( pairs );

foreach ( pairs %hash ) {
   # $_ will be a 2-element ARRAY ref
   say "$_->[0] has value $_->[1]";
}

As of version 1.39 (so a little too late to make it into perl 5.20.0, but still available on CPAN), pairs returns blessed array references that respond to methods called key and value (inspired by DCONWAY's Var::Pairs), as well as being accessible by array indexing.

use List::Util 1.39 qw( pairs );

foreach ( pairs %hash ) {
   say $_->key, " has value ", $_->value;
}

I have many more ideas for functions that could be added, though some care will need to be taken not to invent experimental ideas, but instead to take inspiration from tried-and-tested modules on CPAN, as all these have done, bringing existing ideas into core and standardising them.

One other thing I have my sights set on is to implement further-optimised versions of at least some of the functions in Scalar::Util and List::Util as custom ops on perl 5.16 onwards. This will give them an even further performance boost, as they won't even be regular XS functions any more, so will completely remove the expensive call-time overhead of the ENTERSUB/LEAVESUB pair.


I should also add, that for a while now I've been a self-employed IT contractor, which has given me a lot more free time to be able to write such things as named above. If anyone is interested in supporting or sponsoring similar work on Perl, contact me by email. I'd be happy to give most reasonable Perl jobs a consideration. For that matter, I also work in C, or other languages, and I've even been known to build small-scale electronics projects.

2014/03/20

Event-Reflexivity in Static Languages


In the previous article, we looked at the idea of explicit registration of handlers for events in an event-reflexive system, and touched on the idea that it may be more useful (or in fact, required) when dealing with a static language like C, rather than a dynamic language like Perl. Today's story will look at the different requirements for static languages in more detail.

In the examples in previous posts we have been able to use dynamic language features to easily implement named actions by simply creating functions of the right name, and to dispatch to them simply by passing that string name to the central dispatch functions.

# In a handler module
sub add_user
{ ... }

# In dispatching code
run_events "add_user", @args;

This became especially useful when creating dynamic action names in the IRC cases:

# In a handler module
sub on_message_JOIN
{ ... }

sub on_message
{ ... }

# In dispatching code
run_events [ "on_message", "JOIN" ], @args;

To be clear here, we have used the following abilities of Perl (though similar should apply to most dynamic languages) in order to easily implement this system:

  • The ability to invoke a function in a module based on a dynamic string at runtime
  • The ability to pass a variable list of arguments to a function as a simple list, without the intermediate dispatcher function having to understand them
  • The ability to return any kind of value or list of values from a function
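As a reminder of how little code those three abilities need in Perl, here is a minimal sketch of what a run_events dispatcher could look like. The plugin package names and their handlers are invented for illustration; this is not the code from any of the real systems discussed in this series.

```perl
use strict;
use warnings;

# Hypothetical handler modules, for illustration only
package Plugin::Audit {
   sub add_user { my ( %args ) = @_; return "audit: added $args{username}" }
}
package Plugin::Mail {
   sub add_user    { my ( %args ) = @_; return "mail: mailbox for $args{username}" }
   sub remove_user { my ( %args ) = @_; return "mail: removed $args{username}" }
}

package main;

my @plugins = qw( Plugin::Audit Plugin::Mail );

# Invoke the named action in every plugin that defines it
sub run_events
{
   my ( $name, @args ) = @_;

   my @results;
   foreach my $plugin ( @plugins ) {
      # Look the function up by its string name at runtime
      my $code = $plugin->can( $name ) or next;

      # Arguments pass straight through as a plain list, and whatever
      # the handler returns is collected as-is
      push @results, $code->( @args );
   }
   return @results;
}

my @added   = run_events "add_user",    username => "frank";
my @removed = run_events "remove_user", username => "frank";

print "$_\n" for @added, @removed;
```

The dispatcher never needs to know what the arguments mean or what the handlers return, which is exactly the property that is hard to keep in a static language.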

In a static language, this simply isn't going to work. We'll need something much stronger to bind all these pieces together. We'll need a way to more strongly identify the named actions as hook points, some way to pass the arguments around between them, and some way to interpret the return values for possible methods of combination or short-circuit return.

The first idea to handle this is simply to number the events in some sort of globally-defined enumeration. But this of course creates a single global numbering, and half of the point of event-reflexivity in the first place was to avoid this kind of centralisation - a central numbering would mean that plugin modules couldn't themselves create and emit new events. They can't just invent new numbers because they might collide with existing ones.

Perhaps what is required here is that the event-reflexivity core can allocate contiguous blocks of numbered events, and allow some kind of association between event numbers and friendly string names for the convenience of programmers and users. When a new module wishes to allocate some events, it can request a block from the core, and be given its starting number. While that number would be dynamically allocated between different instances of the system, or even different runs on the same machine, it would at least be constant throughout one program run, allowing other modules to bind or invoke them. The friendly naming system would then exist to allow programs to look up the current number for a known event name, to bind or invoke it.

typedef int ERcore_event_id;

ERcore_event_id ercore_allocate_events(int n_events,
                                       const char *evname[]);
ERcore_event_id ercore_lookup_event(const char *name);
const char *ercore_event_name(ERcore_event_id id);

Having a way to create the identity of named events, we next need a way to register a function to actually handle them. This is where our second problem arises - the problem of how to pass event arguments. Perhaps the simplest is to allow a single void * pointer argument, on the basis that the named event would document somewhere what this was supposed to point at - likely a structure of some kind. Because C lacks the ability to create true closures, it may be necessary to pass a second pointer argument at binding time, and to pass both that and the event argument to the invoked functions at dispatch time.

void ercore_bind_event(ERcore_event_id id,
                       void (*fn)(ERcore_event_id id,
                                  void *args,
                                  void *data),
                       void *data);
void ercore_run_event(ERcore_event_id id,
                      void *args);

Here we've only defined the simplest of the invocation functions, run_events, because any of the others would require some consideration of the return value as well. Here we start to run into problems of needing to know how to interpret the meanings of these values. This means we can't implement any of the more interesting invocation methods as seen in the second post (Kinds of Invocation in Event-Reflexive Programming).

Our handling of arguments isn't very satisfactory, forcing invoking code to always pack their arguments into a structure, and all the handling code to unpack them from it again. We've also made our arguments totally opaque to the actual dispatch system, meaning we can't do any of the interesting tricks we could do in dynamic languages where these are visible (such as those seen in the third post, Instrumentation and Logging in Event-Reflexive Programming).

Finally, by making the event identity a simple ID number and having opaque argument structures that the event dispatch core cannot inspect, we have lost our ability to perform dynamic dispatch based on some of the arguments, as seen in the fourth post (Hierarchies of Actions in Event-Reflexive Programming).

We have seen in previous posts that all of these abilities are useful things to have; in combination they define the essential nature of what Event-Reflexivity is really all about. It would be nice if we could have these things in static languages as well as dynamic ones.

This then leads to the final, and most expansive question of this series of posts:

How do you create a powerful system of event-reflexivity in a static language such as C? How do you cope with combining return values and short-circuit evaluation in those dispatch modes that require them? How do you pass different kinds of arguments to invoked functions in a useful and simple way? And how do you perform dynamic multiple dispatch on pieces of the event identity?

2014/03/13

Event Registration in Event-Reflexive Programming


Continuing our recent theme of an IRC bot, the next step in the story concerns the suggestion someone once made to me that given the way most of the internals of this bot worked, it shouldn't be too hard to have these events broadcast over some kind of IPC socket or similar, between multiple processes, to allow parts of the bot to be written out-of-process. Indeed, given the right kind of serialisation, there's no reason these extra parts had to even be perl, they could be implemented in a different language.

This idea has come up twice now in two different concepts, so I decided to think about it in some more detail. In principle the idea is sound enough, but as ever the devil is in the details. If every event was serialised and broadcast to every listening process, the IPC overheads could get very large, because most of the time most of the processes would ignore it. The simple form of event-reflexivity we have been using up to now has relied heavily on the very cheap (virtually free) cost of introspection within the code of one process, but now we need to find a better way to implement it.

The obvious way to start this is some kind of registration system. When each process connects to the central core, it starts off telling the core which events it is interested in, perhaps in a set of strings, or regexp matches, or something. This is a good first step in cutting down plenty of unwanted noise over the serialisation links, and generally improves things. This filtering doesn't have to be perfect as each connected process can still state it isn't interested in specific events it still manages to receive, but anything we can do on the core side to cut that down will obviously help.

However, further consideration of the specific domain of interest in being an IRC bot starts to suggest we can do something more powerful. Within IRC, it's quite likely that most events of interest to plugged-in processes will concern some specific IRC channel or user. It's also quite likely that at least some plugins may be interested only in events on specific channels or users, or matching only specific text, or some other criteria. If we could get the core event distribution mechanism to filter on these as well, we can further cut down on pointless IPC overheads.
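To give a feel for what such core-side filtering might look like, here is a rough in-process sketch. Everything in it is an assumption made for illustration: the register_interest and recipients_for names, the use of a regexp for the event name, and the simple string-equality filters on named arguments. A real implementation would then serialise the event over IPC to each recipient rather than just returning their names.

```perl
use strict;
use warnings;

# Each registration records who is interested, in which event names,
# and (optionally) which argument values they care about
my @registrations;

sub register_interest
{
   my ( $subscriber, $name_re, %filter ) = @_;
   push @registrations, {
      subscriber => $subscriber,
      name_re    => $name_re,
      filter     => \%filter,
   };
}

# Work out which subscribers an event would actually need sending to
sub recipients_for
{
   my ( $event, %args ) = @_;

   my @recipients;
   REG: foreach my $reg ( @registrations ) {
      next unless $event =~ $reg->{name_re};

      # Every filtered argument must be present and equal
      foreach my $key ( keys %{ $reg->{filter} } ) {
         next REG unless defined $args{$key}
            and $args{$key} eq $reg->{filter}{$key};
      }
      push @recipients, $reg->{subscriber};
   }
   return @recipients;
}

register_interest "logger",  qr/^on_message/;
register_interest "greeter", qr/^on_message_JOIN$/, channel => "#perl";

my @join_perl  = recipients_for "on_message_JOIN", channel => "#perl", nick => "frank";
my @join_other = recipients_for "on_message_JOIN", channel => "#c",    nick => "frank";
my @tick       = recipients_for "on_timer_tick";

print "JOIN on #perl goes to: @join_perl\n";
```

Here the greeter only ever hears about joins on its one channel, and nobody is bothered by the timer event at all.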

The full implications and decisions of how this might work aren't really related to event-reflexivity, but what is of interest here is that this kind of event registration system doesn't have to be only for out-of-process management. In fact, as soon as we start to consider how event-reflexive programming might be implemented in a static language like C, as compared to a dynamic language like Perl, we fairly soon conclude that there must be some kind of registration call, to hook up pieces of code and help in the event dispatch process on some level or another.

This leads us on to in fact two questions this time:

How useful is it to implement event reflexivity using explicit registrations of interest in events?
Does the answer depend on whether the language is static or dynamic? Can explicit registrations provide useful abilities even in dynamic languages, or do they just add unnecessary complication?


2014/03/06

Hierarchies of Actions in Event-Reflexive Programming


So far in this series we have seen the introduction to event-reflexive programming, and a couple of use-cases it would appear in. This time our story continues in chronological fashion, following the development of various IRC-related systems.

The first attempt at an IRC bot was a large soup of various event-reflexive concepts, and was the experimentation bench for a lot of my first ideas about it. One pattern I found very useful was to include partly-dynamic data in with action names. That is, to use information at runtime to direct the flow of event handling. In particular in IRC, the most obvious one comes from considering the command name in incoming IRC messages.

The simplest implementation of this could be expressed something like the following, presuming our underlying IRC implementation gives us simple objects to represent each message:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;

With this simple mechanism we now have a way for each plugin to react to specifically-named IRC events, without them having to capture all the events and filter for only the ones they care about.

However, it turns out that in a number of places we actually want to capture all the messages (for example, debugging and logging). No great problem here; we can simply make a second call to a generic on_message instead and pass in the command string itself as the first argument:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;
run_plugins "on_message", $command, $message;

A pattern seems to be emerging here. We can extend this further, for example to handle the specific CTCP message verbs in IRC CTCP messages (for now, don't worry if you don't know what CTCP means; just consider that it's a second sub-hierarchy of messages):

sub on_message_PRIVMSG {
  my ($message) = @_;

  if(message is CTCP) {
    my $verb = ...;

    run_plugins "on_message_ctcp_$verb", $message;
    run_plugins "on_message_ctcp", $verb, $message;
    run_plugins "on_message", "ctcp", $verb, $message;
  }
}

However, the mechanism we've built here still seems a little unsatisfactory. Any given plugin could handle more than one of these cases, meaning it would be called multiple times. Maybe it would be better to build it such that we only call the most-specific event handler that each plugin defines. To do that we would have to build that logic right into the basic definition of run_plugins.

One possible idea would be to pass an array reference containing pieces of event name, which should be joined by underscores (_) until a suitable handling method is found, and the remaining pieces would be passed as the first positional arguments. Thus a call such as:

run_plugins [ "on_message", "ctcp", $verb ], $message;

would invoke handlers similarly to the previous example, except that it will call at most one action handler per plugin, meaning that specific handlers "override" more generic ones that the same plugin provides.
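One way that lookup could be written is sketched below. It is only a sketch under stated assumptions: the two plugin packages are invented for the example, and the choice to try the longest joined name first and fall back piece by piece is simply my reading of the description above.

```perl
use strict;
use warnings;

# Hypothetical plugins, for illustration only
package Plugin::Logger {
   # Only a generic handler; it receives the unused name pieces as arguments
   sub on_message { my ( @args ) = @_; return "Logger on_message(@args)" }
}
package Plugin::Version {
   # Both a specific and a less specific handler; at most one runs per call
   sub on_message_ctcp_VERSION { my ( @args ) = @_; return "Version on_message_ctcp_VERSION(@args)" }
   sub on_message_ctcp         { my ( @args ) = @_; return "Version on_message_ctcp(@args)" }
}

package main;

my @plugins = qw( Plugin::Logger Plugin::Version );

sub run_plugins
{
   my ( $name, @args ) = @_;
   my @pieces = ref $name ? @$name : ( $name );

   my @results;
   PLUGIN: foreach my $plugin ( @plugins ) {
      # Try the longest joined name first, then progressively shorter ones
      foreach my $n ( reverse 1 .. scalar @pieces ) {
         my $method = join "_", @pieces[ 0 .. $n-1 ];
         my $code = $plugin->can( $method ) or next;

         # Name pieces not used in the method name become leading arguments
         push @results, $code->( @pieces[ $n .. $#pieces ], @args );
         next PLUGIN;
      }
   }
   return @results;
}

my @version = run_plugins [ "on_message", "ctcp", "VERSION" ], "msg";
my @ping    = run_plugins [ "on_message", "ctcp", "PING" ], "msg";

print "$_\n" for @version, @ping;
```

In this sketch the Version plugin handles a VERSION request with its most specific handler, and falls back to its on_message_ctcp handler for PING, receiving the verb as its first argument. The Logger plugin only ever sees the generic on_message call.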

The main question of this post is therefore:

To what extent should arguments be interpreted as part of the dispatch of action handlers themselves? Should some arguments be allowed to take part in forming the action name itself, to allow a degree of override-like dispatch logic on a per-plugin basis?


2014/02/27

Instrumentation and Logging in Event-Reflexive Programming


The previous two posts in this series have introduced the idea of event-reflexive programming, and started to investigate a little into its properties, and design decisions about creating a system using it in terms of ordering considerations and ways to invoke individual actions.

I said this story would continue chronologically, following the history of this idea through various systems I've encountered or built using it. So far we have an ISP provisioning system, and an IRC bot. Quite different in size, scope and semantics, they did however have some common features. The one I want to talk about today is the debug logging and instrumentation part.

At the ISP, every top-level provisioning action was identified by its name, a plain string, and a named set of arguments, the value of each argument also a string. This made it trivially simple to encode the action over a simple TCP socket we had at the time (this being years before the explosion of YAML and JSON-driven systems, and before, even, the peak of the XML craze). While it wasn't strictly required, it turned out that keeping this property for all of the inner reflexive events as well made logging very simple. The logging was also aware of the nesting level of the event-reflexive core, allowing it to print simple logs showing the full tree-shaped nature of the events as they took place. I forget all the inner details of the logging format, but a hypothetical example could look something like:

Action 'add_user': reseller=foo username=a-new-user product=shiney
+- Action(1) 'make_user_config': reseller=foo username=a-new-user product=shiney
+- Action(1) 'make_user_homedir': reseller=foo username=a-new-user product=shiney
|  +- Action(2) 'copy_skeleton': reseller=foo username=a-new-user product=shiney
...
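A dispatcher that produces logs of this shape does not need much. The following is a minimal sketch rather than the original code (which I no longer have the details of): the %handlers registry, the run_action name and the exact prefix formatting are all assumptions chosen to reproduce the hypothetical output above.

```perl
use strict;
use warnings;
use List::Util 1.29 qw( pairmap );

our $depth = 0;   # current nesting level; 'our' so that 'local' can scope it
my %handlers;     # action name => code ref (hypothetical registry)
my @log;          # collected log lines; a real system would print them

sub run_action
{
   my ( $name, @args ) = @_;

   # Arguments are plain name/value strings, so they are trivial to print
   my $argstr = join " ", pairmap { "$a=$b" } @args;
   my $prefix = $depth ? ( "|  " x ( $depth - 1 ) ) . "+- Action($depth)" : "Action";
   push @log, "$prefix '${name}': $argstr";

   my $code = $handlers{$name} or return;

   # Anything the handler runs in turn is logged one level deeper
   local $depth = $depth + 1;
   $code->( @args );
   return;
}

%handlers = (
   add_user => sub {
      run_action( make_user_config  => @_ );
      run_action( make_user_homedir => @_ );
   },
   make_user_homedir => sub { run_action( copy_skeleton => @_ ) },
);

run_action( add_user => reseller => "foo", username => "a-new-user", product => "shiney" );

print "$_\n" for @log;
```

Because the nesting level lives in the dispatcher itself, none of the handlers need to know anything about logging.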

It wasn't too long before I started finding this logging system simply not powerful enough. At least in the "leaf" events, these often actually did useful things - performed LDAP reads or writes, interacted with the filesystem, talked to various 3rd-party systems, and so on. The logging system then gained the ability to write these as well.

While most logging/debugging systems we have currently use a simple linear scale of "verboseness", the logged items in our logging system were tagged with any of a wide set of possible categories. The set of categories in effect at any time was given by an environment variable. For example, to log just the reads and writes to the LDAP directory, and the attributes of a write, one could set:

DEBUG_INSTRUMENT_FLAGS=LDAPr,LDAPw,LDAPwa

This becomes a much more powerful logging system because it allows the programmer/operator to choose not simply the level of verbosity of the logging, but to more finely-tune where in the code the logging is more detailed. A few current logging systems also possess this ability now.
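In outline, category-based logging of this kind can be as simple as a set lookup. This sketch assumes the flag string is a plain comma-separated list as in the example above; the parse_flags and instrument names are made up here, and the flag string is passed in directly rather than read from the environment so that the example is self-contained.

```perl
use strict;
use warnings;

# Parse a comma-separated list of category names into a lookup set
sub parse_flags
{
   my ( $str ) = @_;
   return map { $_ => 1 } grep { length } split m/,/, $str // "";
}

# A real program would more likely pass $ENV{DEBUG_INSTRUMENT_FLAGS} here
my %flags = parse_flags( "LDAPr,LDAPw,LDAPwa" );

my @log;   # collected log lines; a real system would print them

# Record the message only if its category is currently enabled
sub instrument
{
   my ( $category, $message ) = @_;
   push @log, "[$category] $message" if $flags{$category};
   return;
}

instrument LDAPr => "search uid=frank";
instrument IMAP  => "LOGIN frank";
instrument LDAPw => "modify uid=frank";

print "$_\n" for @log;
```

Only the two LDAP lines are recorded; the IMAP one is dropped because that category was not asked for.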

The full power of our instrumentation system here, however, was that it was integrated with the rest of the event-reflexive core. This meant that it could peek into the action names and parameters, and test the values of them. These values could then be used to set or change the logging flags. A particularly powerful example could be:

A=add_user,reseller=foo,product=gold{LDAPw,LDAPwa};reseller=bar,user=frank{IMAP}

In this example, we are interested in logging the LDAP writes and the attributes contained by them that happen during any attempt to add a new user to the foo reseller, for their gold product (perhaps because we are investigating some issue with this one), or separately, any IMAP-related activity that the bar user called frank performs. Because these are interpreted by the core of the event-reflexive system, they can apply fully recursively; applying to any nested inner action performed as part of the ultimate add_user action, for example.
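As an illustration of how cheaply such rules can be handled when everything is a string, here is a sketch of a parser and matcher for that example. The grammar is my own guess from the one example shown: rules separated by semicolons, each a comma-separated list of name=value tests followed by flags in braces, with the special name A testing the action name. The function names are likewise invented, and the recursive application to nested actions is not shown.

```perl
use strict;
use warnings;

# Parse "cond,cond{FLAG,FLAG};cond{FLAG}" into a list of rules
sub parse_rules
{
   my ( $spec ) = @_;

   my @rules;
   foreach my $rule ( split m/;/, $spec ) {
      my ( $conds, $flags ) = $rule =~ m/^(.*)\{(.*)\}$/
         or die "Cannot parse rule '${rule}'\n";
      push @rules, {
         conds => [ map { [ split m/=/, $_, 2 ] } split m/,/, $conds ],
         flags => [ split m/,/, $flags ],
      };
   }
   return @rules;
}

# Return the extra flags that the rules enable for this action and arguments
sub flags_for
{
   my ( $rules, $action, %args ) = @_;

   my %enabled;
   RULE: foreach my $rule ( @$rules ) {
      foreach my $cond ( @{ $rule->{conds} } ) {
         my ( $key, $want ) = @$cond;
         # "A" is assumed to mean the action name; anything else an argument
         my $have = $key eq "A" ? $action : $args{$key};
         next RULE unless defined $have and $have eq $want;
      }
      $enabled{$_} = 1 for @{ $rule->{flags} };
   }
   return sort keys %enabled;
}

my @rules = parse_rules(
   'A=add_user,reseller=foo,product=gold{LDAPw,LDAPwa};reseller=bar,user=frank{IMAP}'
);

my @gold  = flags_for( \@rules, "add_user",   reseller => "foo", product => "gold" );
my @frank = flags_for( \@rules, "fetch_mail", reseller => "bar", user => "frank" );
my @other = flags_for( \@rules, "add_user",   reseller => "foo", product => "silver" );

print "gold: @gold; frank: @frank\n";
```

The fetch_mail action name used in the second call is just a placeholder to show that a rule without an A test matches on arguments alone.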

Being strings, these tests could also be performed using regexps, though I never got around to implementing that test as part of this code, so they remained only simple string equality tests. However, even this far gets us an enormous amount of expressive power very simply, and virtually for free in the main body of action-handling code in the plugins, because almost all of it exists only once, at the very core. However, much of its power has come exactly because of the limited expressiveness of the individual arguments, being simple strings.

Over in my IRC bot, meanwhile, many of the arguments and return values being passed around between the named events were application-level objects expressing such concepts as IRC users, channels, and so on. These objects made it convenient to write powerful code in the action handlers, but limited the ability of the event-reflexive core to introspect into them, to provide such abilities as detailed debug logging similar to that seen above for the ISP provisioning system.

This leads to my third question about event-reflexivity:

To what extent should the arguments of event-reflexive calls be understood and interpreted by the event-reflexive system itself, above simply passing them on to invoked action handlers? Do such abilities as powerfully flexible instrumentation logging justify limiting the expressiveness of the parameters that may be passed?
