2020/12/17

2020 Perl Advent Calendar - Day 17

For the past 16 days we've been looking at asynchronous programming, and at how the async/await syntax provided by the Future::AsyncAwait module leads to code that is much simpler and easier to read than other ways of achieving similar results. I now want to change focus and take a look at an entirely different area - object-oriented programming.

Perl has supported object-oriented programming ever since version 5.000, though people tend to find the built-in mechanisms a little short on features. Over the years various CPAN modules have been created to fill in the missing pieces. Entire articles could be written just listing and comparing them, but Moo and Moose seem to be among the more commonly used ones. Many of these systems are written in Perl, so code that uses them has to be written entirely in existing Perl syntax. Even when an object system is implemented in C for efficiency, it still has to be driven through ordinary Perl syntax. This often leads to non-ideal behaviour.

Consider the most fundamental property of object systems: the idea that a collection of state can be bundled up into a convenient and encapsulated place, and given behaviours (which we call "methods") that can operate on that state. In classical Perl classes, we usually use a hash reference to store the state. Individual named keys can store fields of this state.

package Point;
use feature qw(say signatures);

sub new($class)
{
    return bless {
        x => 0,
        y => 0,
    }, $class;
}

sub move($self, $dx, $dy)
{
    $self->{x} += $dx;
    $self->{y} += $dy;
}

sub describe($self)
{
    say "A point at ($self->{x}, $self->{y})";
}

Here we have used the keys "x" and "y" inside this blessed hash reference to store state about the object instance. It's accepted convention that code outside of the object class's implementation should not interfere with these. Still, there is no enforcement of this separation, and no automation of the various parts of code that need to be written for basically any class - namely, things like the bless expression, or the $self argument of method functions.
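To make the lack of enforcement concrete, here is a minimal sketch of code outside the class reaching straight into the hash. It reuses the Point class from above, except that describe returns its string rather than printing it, which is a change made here only so the result is easy to inspect. The "colour" key is a made-up name for illustration.

```perl
use strict;
use warnings;
use feature qw(say signatures);
no warnings 'experimental::signatures';

package Point {
    sub new($class) { return bless { x => 0, y => 0 }, $class }

    sub move($self, $dx, $dy)
    {
        $self->{x} += $dx;
        $self->{y} += $dy;
        return $self;
    }

    # Returns the description instead of printing it (illustration only)
    sub describe($self) { return "A point at ($self->{x}, $self->{y})" }
}

my $point = Point->new;
$point->move(1, 2);

# Code outside the class can overwrite a field directly; nothing stops it
$point->{x} = 100;

# It can even invent a new key that the class knows nothing about
$point->{colour} = "red";

say $point->describe;   # prints "A point at (100, 2)"
```

The convention of not touching the hash from outside is purely social; the language itself treats the object as an ordinary hash reference.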

Object systems such as Moo or Moose have popularised the idea of a has statement at the class level, which automates some of the work around these object fields, such as generating an instance constructor. But they don't add much overall convenience, because they are limited to working within existing Perl syntax, and that restricts the options available for accessing instance data. The usual style is to make internal state accessible via accessor methods.

package Point;
use feature qw(say signatures);
use Moo;

has "x", is => "rw", default => 0;
has "y", is => "rw", default => 0;

sub move($self, $dx, $dy)
{
    $self->x($self->x + $dx);
    $self->y($self->y + $dy);
}

sub describe($self)
{
    say "A point at (", $self->x, ", ", $self->y, ")";
}

This has helped in some ways (e.g. we didn't have to think about providing a constructor this time), but in other ways it feels like less of an improvement. Notably, because object fields no longer behave like regular Perl variables (as hash elements do), they can't be mutated by the convenient += operator in the move method, nor interpolated into a string in the describe method. Moreover, nothing here separates, or even suggests a difference between, the external interface of methods that users of this class should call and the internal interface that those methods use to access the state fields. Users of this class are not prevented, or even discouraged, from calling $point->x on some instance to read or even modify a field. This does not encourage data encapsulation.
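Both complaints can be seen in a short sketch. So that it runs without Moo installed, the x and y accessors here are written by hand as stand-ins for the kind of read/write accessor that is => "rw" generates; they are an assumption made for illustration, not Moo's actual implementation.

```perl
use strict;
use warnings;
use feature qw(say signatures);
no warnings 'experimental::signatures';

package Point {
    sub new($class) { return bless { x => 0, y => 0 }, $class }

    # Hand-written read/write accessors, standing in for generated ones
    sub x($self, @args) { $self->{x} = $args[0] if @args; return $self->{x} }
    sub y($self, @args) { $self->{y} = $args[0] if @args; return $self->{y} }
}

my $point = Point->new;

# The accessor is an ordinary public method, so any caller may write the field
$point->x(42);

# A method call is not interpolated into a string; only $point itself is,
# so this yields something like "x is Point=HASH(0x...)->x"
my $wrong = "x is $point->x";

# One common workaround: interpolate an anonymous array dereference
my $right = "x is @{[ $point->x ]}";

say $wrong;
say $right;   # prints "x is 42"
```

The workaround is functional but noisy, which is the sort of friction that a hash element or a plain lexical variable avoids.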

In an attempt to fix some of these shortcomings, Ovid has been working on a design called Cor. Along with this design I have been working on an implementation of it, as the CPAN module Object::Pad.

The aim of this design is to provide new syntax as real keywords, which is therefore able to do things that none of the previous generation of object systems could do. An important feature is the way that instance data is provided.

use feature 'say';
use Object::Pad;
class Point;

has $x = 0;
has $y = 0;

method move($dx, $dy)
{
    $x += $dx;
    $y += $dy;
}

method describe()
{
    say "A point at ($x, $y)";
}

This is close to ideal in terms of code size. We have expressed all the behaviours of the previous two examples, but with a minimum of extra "noise" of exposed machinery. We didn't need to provide a constructor method, or think about a bless expression. None of our methods have had to consider a $self - either in the list of arguments provided, or in using it to access the instance fields. The fields have been directly accessible as if they were lexical variables.

Over the next several posts, we will continue to explore this syntax module and look at its various features and advantages in more detail.
