LeoNerd's programming thoughts: 2014/03

2014/03/20

Event-Reflexivity in Static Languages

In the previous article, we looked at the idea of explicit registration of handlers for events in an event-reflexive system, and touched on the idea that it may be more useful (or in fact, required) when dealing with a static language like C, rather than a dynamic langauge like Perl. Today's story will look at the different requirements for static languages in more detail.

In the examples in previous posts we have been able to use dynamic language features to easily implemented named actions by simply creating functions of the right name, and dispatch to them simply by passing that string name to the central dispatch functions.

# In a handler module
sub add_user
{ ... }

# In dispatching code
run_events "add_user", @args;

This became especially useful when creating dynamic action names in the IRC cases

# In a handler module
sub on_message_JOIN
{ ... }

sub on_message
{ ... }

# In dispatching code
run_events [ "on_message", "JOIN" ], @args;

To be clear here, we have used the following abilities of Perl (though similar should apply to most dynamic languages) in order to easily implement this system:

The ability to invoke a function in a module based on a dynamic string at runtime
The ability to pass a variable list of arguments to a function as a simple list, without the intermediate dispatcher function having to understand them
The ability to return any kind of value or list of values from a function

In a static language, this simply isn't going to work. We'll need something much stronger to bind all these pieces together. We'll need a way to more strongly identify the named actions as hook points, some way to pass the arguments around between them, and some way to interpret the return values for possible methods of combination or short-circuit return

The first idea to handle this is simply to number the events in some sort of globally-defined enumeration. But this of course creates a single global numbering, and half of the point of event-reflexivity in the first place was to avoid this kind of centralisation - a central numbering would mean that plugin modules couldn't themselves create and emit new events. They can't just invent new numbers because they might collide with existing ones.

Perhaps what is required here is that the event-reflexivity core can allocate contiguous blocks of numbered events, and allow some kind of association between event numbers and friendly string names for the convenience of programmers and users. When a new module wishes to allocate some events, it can request a block from the core, and be given its starting number. While that number would be dynamically allocated between different instances of the system, or even different runs on the same machine, it would at least be constant throughout one program run, allowing other modules to bind or invoke them. The friendly naming system would then exist to allow programs to look up the current number for a known event name, to bind or invoke it.

typedef ERcore_event_id int;

ERcore_event_id ercore_allocate_events(int n_events,
                                       const char *evname[]);
ERcore_event_id ercore_lookup_event(const char *name);
const char *ercore_event_name(ERcore_event_id id);

Having a way to create the identity of named events, we next need a way to register a function to actually handle them. This is where our second problem arises - the problem of how to pass event arguments. Perhaps the simplest is simply to allow a single void * pointer argument, on the basis that the named event would document somewhere what this was supposed to point at - likely a structure of some kind. Because C lacks the ability to create true closures, it may be necessary to pass a second pointer argument at binding time, and passing both that and the event argument to the invoked functions at dispatch time.

void ercore_bind_event(ERcore_event_id id,
                            void (*fn)(ERcore_event_id id,
                                       void *args,
                                       void *data),
                            void *data);
void ercore_run_event(ERcore_event_id,
                      void *args);

Here we've only defined the simplest of the invocation functions, run_events, because any of the others would require some consideration of the return value as well. Here we start to run into problems of needing to know how to interpret the meanings of these values. This means we can't implement any of the more interesting invocation methods as seen in the second post (Kinds of Invocation in Event-Reflexive Programming).

Our handling of arguments isn't very satisfactory, forcing invoking code to always pack their arguments into a structure, and all the handling code to unpack them from it again. We've also made our arguments totally opaque to the actual dispatch system, meaning we can't do any of the interesting tricks we could do in dynamic languages where these are visible (such as those seen in the third post, Instrumentation and Logging in Event-Reflexive Programming).

Finally, by making the event identity a simple ID number and having opaque argument structures that the event dispatch core cannot inspect, we have lost our ability to perform dynamic dispatch based on some of the arguments, as seen in the fourth post (Hierarchies of Actions in Event-Reflexive Programming).

We have seen in previous posts that all of these abilities are useful things to have, in combination they define the essential nature of what Event-Reflexivity is really all about. It would be nice if we could have these things in static languages as well as dynamic ones.

This then leads to the final, and most expansive question of this series of posts:

How do you create a powerful system of event-reflexivity in a static language such as C? How do you cope with combining return values and short-circuit evaluation in those dispatch modes that require them? How do you pass different kinds of arguments to invoked functions in a useful and simple way? And how do you perform dynamic multiple dispatch on of pieces of the event identity?

2014/03/13

Event Registration in Event-Reflexive Programming

<< First | < Prev

Continuing our recent theme of an IRC bot, the next step in the story concerns the suggestion someone once made to me that given the way most of the internals of this bot worked, it shouldn't be too hard to have these events broadcast over some kind of IPC socket or similar, between multiple processes, to allow parts of the bot to be written out-of-process. Indeed, given the right kind of serialisation, there's no reason these extra parts had to even be perl, they could be implemented in a different language.

This idea has come up twice now in two different concepts, so I decided to think about it in some more detail. In principle the idea is sound enough, but as ever the devil comes down to the details. If every event was serialised and broadcast to every listening process, the IPC overheads could get very large, because most of the time most of the processes would ignore it. The simple form of event-reflexivity we have been using up to now has relied heavily on the very cheap (virtually free) cost of introspection within the code of one process, but now we need to find a better way to implement it.

The obvious way to start this is some kind of registration system. When each process connects to the central core, it starts off telling the core which events it is interested in, perhaps in a set of strings, or regexp matches, or something. This is a good first step in cutting down plenty of unwanted noise over the serialisation links, and generally improves things. This filtering doesn't have to be perfect as each connected process can still state it isn't interested in specific events it still manages to receive, but anything we can do on the core side to cut that down will obviously help.

However, further consideration of the specific domain of interest in being an IRC bot starts to suggest we can do something more powerful. Within IRC, it's quite likely that most events of interest to plugged-in processes will concern some specific IRC channel or user. It's also quite likely that at least some plugins may be interested only in events on specific channels or users, or matching only specific text, or some other criteria. If we could get the core event distribution mechanism to filter on these as well, we can further cut down on pointless IPC overheads.

The full implications and decisions of how this might work aren't really related to event-reflexivity, but what is of interest here is that this kind of event registration system doesn't have to be only for out-of-process management. In fact, as soon as we start to consider how event-reflexive programming might be implemented in a static language like C, as compared a dynamic language like Perl, we fairly soon conclude that there must be some kind of registration call, to hook up pieces of code and help in the event dispatch process on some level or another.

This leads us on to in fact two questions this time:

How useful is it to implement event reflexivity using explicit registrations of interest in events?

Does the answer depend on whether the language is static or dynamic? Can explicit registrations provide useful abilities even in dynamic languages, or they just add unnecessary complication

Next >

2014/03/06

Hierarchies of Actions in Event-Reflexive Programming

<< First | < Prev

So far this series we have seen the introduction to event-reflexive programming, and a couple of use-cases it would appear in. This time our story continues in chronological fashion, following the development of various IRC-related systems.

The first attempt at an IRC bot was a large soup of various event-reflexive concepts, and was the experimentation bench for a lot of my first ideas about it. One pattern I found very useful was to include partly-dynamic data in with action names. That is, to use information at runtime to direct the flow of event handling. In particular in IRC, the most obvious one comes from considering the command name in incoming IRC messages.

The simplest implementation of this could be expressed something like the following, presuming our underlying IRC implementation gives us simple objects to represent each message:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;

With this simple mechanism we now have a way for each plugin to react to specifically-named IRC events, without them having to capture all the events and filter for only the ones they care about.

However, it turns out that in a number of places we actually want to capture all the messages (for example, debugging and logging). No great problem here; we can simply make a second call to a generic on_message instead and pass in the command string itself as the first argument:

my $message = ...;
my $command = $message->command;

run_plugins "on_message_$command", $message;
run_plugins "on_message", $command, $message;

A pattern seems to be emerging here. We can extend this further, for example to handle the specific CTCP message verbs in IRC CTCP messages (for now, don't worry if you don't know what CTCP means; just consider that it's a second sub-hierarchy of messages):

sub on_message_PRIVMSG {
  my ($message) = @_;

  if(message is CTCP) {
    my $verb = ...;

    run_plugins "on_message_ctcp_$verb", $message;
    run_plugins "on_message_ctcp", $verb, $message;
    run_plugins "on_message", "ctcp", $verb, $message;
  }
}

However, the mechanism we've built here still seems a little unsatisfactory. Any given plugin could handle more than one of these cases, meaning it would be called multiple times. Maybe it would be better to build it such that we only call the most-specific event handler that each plugin defines. To do that we would have to build that logic right in to the basic definition of run_plugins.

One possible idea would be to pass an array reference containing pieces of event name, which should be joined by underscores (_) until a suitable handling method is found, and the remaining pieces would be passed as the first positional arguments. Thus a call such as:

run_plugins [ "on_message", "ctcp", $verb ], $message;

would invoke handlers similarly to the previous example, except that it will call at-most one action handler per plugin, meaning that specific handlers "override" more generic ones that plugin provides.

The main question of this post is therefore

To what extent should arguments be interpreted as part of the dispatch of action handlers themselves? Should some arguments be allowed to take part in forming the action name itself, to allow a degree of override-like dispatch logic on a per-plugin basis?