2020/07/03

Some facts about `use VERSION` in Perl

As it has been getting a lot of mention lately, I thought I'd spend a few minutes reminding (or informing) everyone of some facts about perl's use VERSION syntax. Some of this was new even to me, either today or within the last few days, so I fully expect it to be informative to at least a few people and to clear up some misconceptions.

  • The ability to write a version in "v-string" notation has been part of core perl since at least 5.6. [March 2000, according to Wikipedia].

    $ perl5.6.2 -e 'use v5.8;'
    Perl v5.8.0 required--this is only v5.6.2, stopped at -e line 1.
    BEGIN failed--compilation aborted at -e line 1.

    At one time these v-strings felt new and strange, and the advice was not to use them for fear of back-compat issues. At this point in time, unless you need backward compatibility to something older than perl 5.6, you have nothing to fear by using a v-string.

    I literally didn't know this until today. So far I have code on CPAN that defensively does use 5.026, but now that I know this I will be changing it all to use v5.26.

  • From perl version 5.12 onwards [April 2010], the use VERSION syntax also implies use strict. Thus, if you don't need to maintain backwards compatibility to any perls older than 5.12, you can replace

    use strict;
    with
    use v5.12;

    Not only is that one character shorter to type, it also brings in the say feature. I don't know about anyone else, but I find that extremely useful in small test/debugging situations.

    v5.12 also brings in the state, switch and unicode_strings features. The latter of these does alter the behaviour of regexp patterns and other string functions with respect to unicode characters, so if you are planning to turn this on I would recommend reading perldoc feature and perldoc perlunicode to find out more about that before doing so. Quoted here in brief:

    Safest if you "use feature 'unicode_strings'"

    In order to preserve backward compatibility, Perl does not turn on full internal Unicode support unless the pragma "use feature 'unicode_strings'" is specified. (This is automatically selected if you "use 5.012" or higher.) Failure to do this can trigger unexpected surprises. See "The "Unicode Bug"" below.

  • In October 2019 an issue was raised requesting that the next use VERSION, which would be v5.32, also enable the "warnings" pragma, much as 5.12 enabled "strict":

    Issue #17162

    This was debated for a while, with the current prevailing opinion being to not do this because enabling "use warnings" just because someone requested "use v5.32" was considered too radical and surprising:

    comment by @iabyn

    comment by wagnerc
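
As a footnote to the first point above: the decimal and v-string spellings name the same version, because perl reads the fractional part of a decimal version in groups of three digits. What follows is only a rough sketch of that rule, written in Python; decimal_to_vstring is a name made up for this illustration and not anything perl itself provides.

```python
def decimal_to_vstring(decimal):
    """Convert a decimal perl version such as "5.026" to dotted v-string form."""
    major, _, fraction = decimal.partition(".")
    # Pad the fraction with zeros to at least two groups of three digits,
    # rounding the length up to a multiple of three.
    width = max(6, -(-len(fraction) // 3) * 3)
    fraction = fraction.ljust(width, "0")
    # Each group of three digits is one component of the dotted version.
    groups = [int(fraction[i:i + 3]) for i in range(0, width, 3)]
    return "v" + ".".join([str(int(major))] + [str(g) for g in groups])


for spelling in ("5.026", "5.012", "5.006002", "5.8"):
    print(spelling, "->", decimal_to_vstring(spelling))
```

Under this rule use 5.026 and use v5.26 ask for the same minimum perl, whereas a bare use 5.8 is read as v5.800.0, which is one more reason to prefer the v-string form.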

2019/09/06

Perl Parser Plugins 3a - The Stack

In the previous article we looked at the implementation of a keyword that acts like a constant, providing a new value tau. We briefly saw how it is implemented with an op called OP_CONST, which pushes a value to "the stack". Let's now look in more detail at what we mean by operators having an effect on the stack.

For example, during the execution of a piece of code such as

  my $double = $x * 2;

the value stored in $x and the constant 2 get pushed to the value stack, so that the OP_MULTIPLY operator can multiply them, leaving the product on the stack. If the $x lexical held the value 5 before this code is executed, then the stack would see the following activity:

    |         |
    | (empty) |

  -> OP_PADSV; OP_CONST ->

    | IV=2 |
    | IV=5 |

  -> OP_MULTIPLY ->

    |       |
    | IV=10 |

Elements on the value stack are SV pointers, SV *, and refer directly to the SVs involved. As the stack points to SVs, it doesn't need to only store temporary values, but can refer to any SV in the perl interpreter. As the statement continues, it fetches the address of the SV representing the $double lexical from the pad and pushes that to the stack as well, so that the OP_SASSIGN operator can pop them both to see what to assign to where.

    |       |
    | IV=10 |

  -> OP_PADSV ->

    | (addr of $double) |
    | IV=10             |

  -> OP_SASSIGN ->

    |         |
    | (empty) |    $double now contains IV=10

We see here that at the end of the statement the stack has become empty again. This is its usual state - the stack stores temporary values of expressions evaluated during a single statement, but normally it is empty between statements.
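
To make the sequence above concrete, here is a toy model of it in Python. It is only a sketch of the idea and not how perl is implemented: the SV class, the pad dictionary and the op_* functions are all invented for this illustration, and only borrow the names of the real ops.

```python
class SV:
    """Stand-in for a perl scalar; it only holds an integer value here."""
    def __init__(self, iv=None):
        self.iv = iv


stack = []                              # the value stack; the list's end is the top
pad = {"$x": SV(5), "$double": SV()}    # the lexical variables of the statement


def op_padsv(name):
    stack.append(pad[name])             # push the lexical's own SV, not a copy


def op_const(iv):
    stack.append(SV(iv))


def op_multiply():
    right = stack.pop()                 # pushed last, so popped first
    left = stack.pop()
    stack.append(SV(left.iv * right.iv))


def op_sassign():
    target = stack.pop()                # the SV to assign into is on top
    value = stack.pop()
    target.iv = value.iv


depths = []                             # stack depth after each op


def run(op, *args):
    op(*args)
    depths.append(len(stack))


# my $double = $x * 2;
run(op_padsv, "$x")
run(op_const, 2)
run(op_multiply)
run(op_padsv, "$double")
run(op_sassign)

print(depths, pad["$double"].iv)
```

The recorded depths rise and fall as in the diagrams, and the list is empty again once the assignment has run.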

In XS or core perl interpreter code, basic SV values are pushed to the stack using the PUSHs() macro, which takes an SV. There are also convenience macros which construct new SVs to contain integers, floating-point values, strings, etc; they are named PUSHi(), PUSHn(), PUSHp(), and so on.

  SV *sv = ...
  PUSHs(sv);

Note that pointers on the value stack are not reference counted - this is rare in Perl. Normally any SV pointer that points to an SV contributes to the value of the SvREFCNT() of that SV, but in the case of stack operations there is a performance benefit of not having to adjust the count all the time. For this reason, code that operates on stack values needs to be careful to mortalize or otherwise handle reference counting issues. There is an entire second set of push macros that mortalize the pushed value by calling sv_2mortal() on it. They are named with a prefixed m - mPUSHs(), mPUSHi(), etc.

  SV *tmp_sv = ...
  mPUSHs(tmp_sv);

  mPUSHi(123);

  mPUSHp("Hello, world", 12);

The stack is stored as a single contiguous array of SV pointers, whose base is given by the PL_stack_base interpreter variable. Again, there is a performance benefit from not actually checking the size of the stack before pushing a value to it, as many operators result in an overall reduction in the number of elements on the stack (for example, the OP_MULTIPLY described above). The EXTEND macro checks that there is room for at least the given number of elements and is commonly used in combination with the PUSH macros when returning a fixed-sized list of values.

  EXTEND(SP, 3);
  mPUSHi(t->tm_hour);
  mPUSHi(t->tm_min);
  mPUSHi(t->tm_sec);

There is an entire alternate set of push macros that extend the stack before pushing, in both regular and mortalizing variants; they are named XPUSH... and mXPUSH.... In general it is better to EXTEND once and use the non-X variants, but at times when the required size is not known upfront these variants can be handy.

  while(p) {
    mXPUSHi(p->value);
    p = p->next;
  }

In core perl interpreter code, many operators inspect values on the top of the stack and remove them using the TOPs and POPs macros. TOPs simply returns the SV pointer at the top of the stack without removing it, whereas POPs removes and returns it, being the inverse of PUSHs. As with the push macros, there are convenient type-converting macros here too, which directly yield integers, floating-point values, and so on from the top of the stack; they are named POPi, POPn, etc.

For example, a simple version of the OP_MULTIPLY operator could be implemented by a C function that performs the following:

  NV right = POPn;  /* the right-hand operand was pushed last, so it is popped first */
  NV left = POPn;
  mPUSHn(left * right);

In actual fact the real OP_MULTIPLY has to handle a lot of other cases such as operator overloading and the various kinds of SV that might be found, but at its core this is the basic principle.

2019/08/16

async/await in Perl 5 and Dart

Dart allows programmers to write programs in an asynchronous style, in order to achieve higher performance through concurrency. Object types like Future and language features like the async/await syntax mean that asynchronous functions can be written in a natural way that reads similar to straight-line code. The function can be suspended midway through execution until a result arrives that allows it to continue.

Here is an example which shows how you can use these to implement a function that suspends in the middle until it has received a response to the HTTP request we sent. We request a JSON-encoded list of numbers, and return their sum:

  import 'dart:convert' as convert;
  import 'package:http/http.dart' as http;

  Future<int> getSumFromUrl(String url) async {
    var response = await http.get(url);
    var data = convert.jsonDecode(response.body);

    return data['numbers'].reduce((a, b) => a + b);
  }

We can write a similar thing in Perl 5. In Dart the event system is built into the language, whereas in Perl 5 we get to choose our own. Because of this, the example is a little more verbose because it has to specify more of these choices - creating the IO loop and adding the HTTP client to it.

  use v5.26;
  use feature 'signatures';
  no warnings 'experimental::signatures';

  use Future::AsyncAwait;

  use IO::Async::Loop;
  use Net::Async::HTTP;

  use JSON::MaybeXS 'decode_json';
  use List::Util 'sum';

  my $loop = IO::Async::Loop->new;

  my $http = Net::Async::HTTP->new;
  $loop->add( $http );

  async sub get_sum_from_url($url)
  {
    my $response = await $http->GET( $url );
    my $data = decode_json( $response->content );

    return sum( $data->{numbers}->@* );
  }

The examples in both languages make use of a type of object that wraps up the idea of "an operation that may still be pending" - which both languages call a Future. While minor differences exist between the two languages - such as the methods on them - the overall idea remains the same. Essentially, the value is a placeholder for a result that will come later.

In both languages we see the await keyword, which operates on an expression. The argument to the await keyword is a value of one of these future objects. The await keyword is used to suspend the currently-running function until that result is available. Once the result arrives, the await expression itself yields that deferred result.

Similarly, in both languages the async keyword decorates a function declaration and remarks that it may return its own result asynchronously via one of these futures, and allows that function to make use of the await expression.

This similarity is no coïncidence. The Future::AsyncAwait module which adds the async/await syntax to Perl 5 was designed specifically to look and feel very similar to this feature in several other languages - of which Dart is one.

This async/await syntax makes the code read similarly to how it would look if we were not using futures to make it asynchronous, but instead just using the return values of functions directly. This similarity of notation is the reason why we prefer to use the await syntax if we can, as it helps readability of the code. Compare this syntax with earlier techniques - such as callback functions - where the structure of the code can often look very different.

By providing the same (or at least similar) semantics behind the same kind of notation, each language retains a sense of familiarity to users of other languages. It allows readers to make more sense of the program at first glance because the same sorts of structures with the same sorts of behaviour exist there too. By sharing these ideas, each ecosystem gains the strengths of those ideas it borrows from the other, to the overall benefit of both.

async/await in Perl 5 and C# 5

C# 5 allows programmers to write programs in an asynchronous style, in order to achieve higher performance through concurrency. Object types like Task and language features like the async/await syntax mean that asynchronous functions can be written in a natural way that reads similar to straight-line code. The function can be suspended midway through execution until a result arrives that allows it to continue.

Here is an example which shows how you can use these to implement a function that suspends in the middle until it has received a response to the HTTP request we sent. We request a JSON-encoded list of numbers, and return their sum:

  using System.Collections.Generic;
  using System.Linq;
  using System.Net.Http;
  using System.Threading.Tasks;
  using System.Web.Script.Serialization;

  class ExampleSchema
  {
    public List<int> numbers { get; set; }
  }

  public async Task<int> getSumFromUrl(string url)
  {
    using (HttpClient client = new HttpClient()) {
      string response = await client.GetStringAsync(url);

      ExampleSchema data = new JavaScriptSerializer()
        .Deserialize<ExampleSchema>(response);

      return data.numbers.Sum();
    }
  }

We can write a similar thing in Perl 5. In C# 5 the event system is built into the language, whereas in Perl 5 we get to choose our own. Because of this, the example is a little more verbose because it has to specify more of these choices - creating the IO loop and adding the HTTP client to it.

  use v5.26;
  use feature 'signatures';
  no warnings 'experimental::signatures';

  use Future::AsyncAwait;

  use IO::Async::Loop;
  use Net::Async::HTTP;

  use JSON::MaybeXS 'decode_json';
  use List::Util 'sum';

  my $loop = IO::Async::Loop->new;

  my $http = Net::Async::HTTP->new;
  $loop->add( $http );

  async sub get_sum_from_url($url)
  {
    my $response = await $http->GET( $url );
    my $data = decode_json( $response->content );

    return sum( $data->{numbers}->@* );
  }

The examples in both languages make use of a type of object that wraps up the idea of "an operation that may still be pending". In C# 5 that is a value of Task type; in Perl 5 it is a Future. While minor differences exist between the two languages - such as the names of the types or methods on them - the overall idea remains the same. Essentially, the value is a placeholder for a result that will come later.

In both languages we see the await keyword, which operates on an expression. The argument to the await keyword is a value of one of these deferred results - a task or future. The await keyword is used to suspend the currently-running function until that result is available. Once the result arrives, the await expression itself yields that deferred result.

Similarly, in both languages the async keyword decorates a function declaration and remarks that it may return its own result asynchronously via one of these deferred-result values (a Task or Future), and allows that function to make use of the await expression.

This similarity is no coïncidence. The Future::AsyncAwait module which adds the async/await syntax to Perl 5 was designed specifically to look and feel very similar to this feature in several other languages - of which C# 5 is one.

This async/await syntax makes the code read similarly to how it would look if we were not using tasks or futures to make it asynchronous, but instead just using the return values of functions directly. This similarity of notation is the reason why we prefer to use the await syntax if we can, as it helps readability of the code. Compare this syntax with earlier techniques - such as callback functions - where the structure of the code can often look very different.

By providing the same (or at least similar) semantics behind the same kind of notation, each language retains a sense of familiarity to users of other languages. It allows readers to make more sense of the program at first glance because the same sorts of structures with the same sorts of behaviour exist there too. By sharing these ideas, each ecosystem gains the strengths of those ideas it borrows from the other, to the overall benefit of both.

2019/08/15

async/await in Perl 5 and Python 3

Python 3 allows programmers to write programs in an asynchronous style, in order to achieve higher performance through concurrency. Object types like Future and language features like the async/await syntax mean that asynchronous functions can be written in a natural way that reads similar to straight-line code. The function can be suspended midway through execution until a result arrives that allows it to continue.

Here is an example which shows how you can use these to implement a function that suspends in the middle until it has received a response to the HTTP request we sent. We request a JSON-encoded list of numbers, and return their sum:

  import aiohttp

  async def get_sum_from_url(url):
    async with aiohttp.ClientSession() as session:
      async with session.get(url) as response:
        data = await response.json()
        return sum(data["numbers"])
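
If you want to see the suspend-and-resume behaviour without a network connection or aiohttp installed, a cut-down variant can be run using only the standard library. This is just a sketch: fake_fetch is a stand-in invented here for the HTTP request, and the URL is a placeholder that is never contacted.

```python
import asyncio
import json


async def fake_fetch(url):
    """Hypothetical stand-in for an HTTP GET; returns a canned JSON body."""
    await asyncio.sleep(0)      # yield to the event loop once, as real IO would
    return json.dumps({"numbers": [1, 2, 3]})


async def get_sum_from_url(url):
    body = await fake_fetch(url)            # suspends here until the "response" arrives
    data = json.loads(body)
    return sum(data["numbers"])


# asyncio.run() creates the event loop and drives the coroutine to completion.
result = asyncio.run(get_sum_from_url("http://example.invalid/numbers"))
print(result)
```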

We can write a similar thing in Perl 5. In Python 3 the event system is built into the language, whereas in Perl 5 we get to choose our own. Because of this, the example is a little more verbose because it has to specify more of these choices - creating the IO loop and adding the HTTP client to it.

  use v5.26;
  use feature 'signatures';
  no warnings 'experimental::signatures';

  use Future::AsyncAwait;

  use IO::Async::Loop;
  use Net::Async::HTTP;

  use JSON::MaybeXS 'decode_json';
  use List::Util 'sum';

  my $loop = IO::Async::Loop->new;

  my $http = Net::Async::HTTP->new;
  $loop->add( $http );

  async sub get_sum_from_url($url)
  {
    my $response = await $http->GET( $url );
    my $data = decode_json( $response->content );

    return sum( $data->{numbers}->@* );
  }

The examples in both languages make use of a type of object that wraps up the idea of "an operation that may still be pending" - which both languages call a Future. While minor differences exist between the two languages - such as the methods on them - the overall idea remains the same. Essentially, the value is a placeholder for a result that will come later.

In both languages we see the await keyword, which operates on an expression. The argument to the await keyword is a value of one of these future objects. The await keyword is used to suspend the currently-running function until that result is available. Once the result arrives, the await expression itself yields that deferred result.
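
The "placeholder" nature of a future is easy to observe directly. In this small sketch (the delay and the value 42 are arbitrary choices for the illustration) a future is created with no result, something else completes it slightly later, and the await expression yields that result once it exists.

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()

    fut = loop.create_future()                  # a placeholder with no result yet
    loop.call_later(0.01, fut.set_result, 42)   # the result is supplied a little later

    pending_before = not fut.done()             # still pending at this point
    value = await fut                           # suspend until the result arrives
    return pending_before, value


pending_before, value = asyncio.run(main())
print(pending_before, value)
```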

Similarly, in both languages the async keyword decorates a function declaration and remarks that it may return its own result asynchronously via one of these futures, and allows that function to make use of the await expression.

This similarity is no coïncidence. The Future::AsyncAwait module which adds the async/await syntax to Perl 5 was designed specifically to look and feel very similar to this feature in several other languages - of which Python 3 is one.

This async/await syntax makes the code read similarly to how it would look if we were not using futures to make it asynchronous, but instead just using the return values of functions directly. This similarity of notation is the reason why we prefer to use the await syntax if we can, as it helps readability of the code. Compare this syntax with earlier techniques - such as callback functions - where the structure of the code can often look very different.
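
For comparison, here is roughly what the two styles look like side by side when summing a list that arrives via a future. It is a sketch only; sum_with_callback and sum_with_await are names invented for this example.

```python
import asyncio


def sum_with_callback(fut, on_done):
    """Callback style: the continuation is a separate nested function."""
    def _when_ready(f):
        on_done(sum(f.result()))
    fut.add_done_callback(_when_ready)


async def sum_with_await(fut):
    """await style: reads like an ordinary function returning a value."""
    return sum(await fut)


async def main():
    loop = asyncio.get_running_loop()
    callback_results = []

    first = loop.create_future()
    sum_with_callback(first, callback_results.append)

    second = loop.create_future()

    # Complete both futures on the next turn of the event loop.
    loop.call_soon(first.set_result, [1, 2, 3])
    loop.call_soon(second.set_result, [4, 5, 6])

    awaited = await sum_with_await(second)
    return callback_results, awaited


callback_results, awaited = asyncio.run(main())
print(callback_results, awaited)
```

Both compute the same kind of result, but in the callback version the result can only be handed onwards to yet another function, whereas the await version simply returns it.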

By providing the same (or at least similar) semantics behind the same kind of notation, each language retains a sense of familiarity to users of other languages. It allows readers to make more sense of the program at first glance because the same sorts of structures with the same sorts of behaviour exist there too. By sharing these ideas, each ecosystem gains the strengths of those ideas it borrows from the other, to the overall benefit of both.

async/await in Perl 5 and ECMAScript 6

ECMAScript 6 allows programmers to write programs in an asynchronous style, in order to achieve higher performance through concurrency. Object types like Promise and language features like the async/await syntax (which was standardised slightly later, in ECMAScript 2017) mean that asynchronous functions can be written in a natural way that reads similar to straight-line code. The function can be suspended midway through execution until a result arrives that allows it to continue.

Here is an example which shows how you can use these to implement a function that suspends in the middle until it has received a response to the HTTP request we sent. We request a JSON-encoded list of numbers, and return their sum:

  const fetch = require("node-fetch");

  async function getSumFromUrl(url) {
    const response = await fetch(url);
    const data = await response.json();

    return data.numbers.reduce((a, b) => a + b, 0);
  }

We can write a similar thing in Perl 5. In ECMAScript 6 the event system is built into the language, whereas in Perl 5 we get to choose our own. Because of this, the example is a little more verbose because it has to specify more of these choices - creating the IO loop and adding the HTTP client to it.

  use v5.26;
  use feature 'signatures';
  no warnings 'experimental::signatures';

  use Future::AsyncAwait;

  use IO::Async::Loop;
  use Net::Async::HTTP;

  use JSON::MaybeXS 'decode_json';
  use List::Util 'sum';

  my $loop = IO::Async::Loop->new;

  my $http = Net::Async::HTTP->new;
  $loop->add( $http );

  async sub get_sum_from_url($url)
  {
    my $response = await $http->GET( $url );
    my $data = decode_json( $response->content );

    return sum( $data->{numbers}->@* );
  }

The examples in both languages make use of a type of object that wraps up the idea of "an operation that may still be pending". In ECMAScript 6 that is a value of Promise type; in Perl 5 it is a Future. While minor differences exist between the two languages - such as the names of the types or methods on them - the overall idea remains the same. Essentially, the value is a placeholder for a result that will come later.

In both languages we see the await keyword, which operates on an expression. The argument to the await keyword is a value of one of these deferred results - a promise or future. The await keyword is used to suspend the currently-running function until that result is available. Once the result arrives, the await expression itself yields that deferred result.

Similarly, in both languages the async keyword decorates a function declaration and remarks that it may return its own result asynchronously via one of these deferred-result values (a Promise or Future), and allows that function to make use of the await expression.

This similarity is no coïncidence. The Future::AsyncAwait module which adds the async/await syntax to Perl 5 was designed specifically to look and feel very similar to this feature in several other languages - of which ECMAScript 6 is one.

This async/await syntax makes the code read similarly to how it would look if we were not using promises or futures to make it asynchronous, but instead just using the return values of functions directly. This similarity of notation is the reason why we prefer to use the await syntax if we can, as it helps readability of the code. Compare this syntax with earlier techniques - such as callback functions - where the structure of the code can often look very different.

By providing the same (or at least similar) semantics behind the same kind of notation, each language retains a sense of familiarity to users of other languages. It allows readers to make more sense of the program at first glance because the same sorts of structures with the same sorts of behaviour exist there too. By sharing these ideas, each ecosystem gains the strengths of those ideas it borrows from the other, to the overall benefit of both.

2019/06/10

Building for new ATtiny 1-series chips on Debian

In 2018, Microchip released a new range of ATtiny microcontroller chips, called the "ATtiny 1-series" - presumably named after the pattern of the part numbers. In the usual style of Atmel (now owned by Microchip), the first digit(s) of the part number give the size of the flash memory; the remaining digits give an indication of the size and featureset of the chip.

  Family      Package, IO pins               Parts
  ATtinyX12   8 pin package, 5/6 IO pins     ATtiny212, ATtiny412
  ATtinyX14   14 pin package, 11/12 IO pins  ATtiny214, ATtiny414, ATtiny814, ATtiny1614
  ATtinyX16   20 pin package, 17/18 IO pins  ATtiny416, ATtiny816, ATtiny1616, ATtiny3216
  ATtinyX17   24 pin package, 21/22 IO pins  ATtiny417, ATtiny817, ATtiny1617, ATtiny3217
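
The naming pattern is regular enough to decode mechanically. The following Python sketch does so for the parts in the table above; parse_part is a name invented here, the pin counts are copied from the table, and reading the leading digits as a flash size in KB is my assumption about the units.

```python
import re

# Package pin counts for each two-digit suffix, as listed in the table.
PINS_BY_SUFFIX = {"12": 8, "14": 14, "16": 20, "17": 24}


def parse_part(name):
    """Split a 1-series part number into flash size, family and pin count."""
    match = re.fullmatch(r"ATtiny(\d+)(1[2467])", name)
    if match is None:
        raise ValueError(f"not a 1-series part number: {name}")
    flash, suffix = match.groups()
    return {
        "flash_kb": int(flash),          # assumed to be in KB
        "family": f"ATtinyX{suffix}",
        "pins": PINS_BY_SUFFIX[suffix],
    }


for part in ("ATtiny212", "ATtiny814", "ATtiny3217"):
    print(part, parse_part(part))
```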

I'll write more about these new chips in another post - there's much change from the older style of ATtiny chips you may be familiar with. Many new things added, things improved, as well as a couple of - in my opinion - backward steps.

This post is largely a reminder to myself, and a help to anyone else, on how to build code for these new chips. The trouble is that they're newer than the avr-libc support package in Debian, meaning that you can't actually build code for these yet. Such an attempt will fail:

$ avr-gcc -std=gnu99 -Wall -Os -DF_CPU=20000000 -mmcu=attiny814 -flto -ffunction-sections -fshort-enums -o .build/firmware.elf src/main.c
/usr/lib/avr/include/avr/io.h:625:6: warning: #warning "device type not defined" [-Wcpp]
 #    warning "device type not defined"
      ^
In file included from src/main.c:4:0:
src/main.c: In function ‘RTC_PIT_vect’:
src/main.c:33:5: warning: ‘RTC_PIT_vect’ appears to be a misspelled signal handler, missing __vector prefix [-Wmisspelled-isr]
 ISR(RTC_PIT_vect)
     ^
src/main.c:35:3: error: ‘RTC’ undeclared (first use in this function)
   RTC.PITINTFLAGS = RTC_PI_bm;
   ^
...

This is caused by the fact that, while avr-gcc has support for the chips, the various support files that should be provided by avr-libc are missing. I've reported a Debian bug about this. Until it's fixed, however, it's easy enough to work around by providing the missing files.

Start off by downloading the "Atmel ATtiny Series Device Support" file from http://packs.download.atmel.com/. This is a free and open download, licensed under Apache v2. This file carries the extension atpack but it's actually just a ZIP file:

$ file Atmel.ATtiny_DFP.1.3.229.atpack 
Atmel.ATtiny_DFP.1.3.229.atpack: Zip archive data, at least v1.0 to extract

Note that by default it'll unpack into the working directory, so you'll want to create a temporary folder to work in:

$ mkdir pack

$ cd pack/

$ unzip ~/Atmel.ATtiny_DFP.1.3.229.atpack 
Archive:  /home/leo/Atmel.ATtiny_DFP.1.3.229.atpack
   creating: atdf/
   creating: avrasm/
   creating: avrasm/inc/
...

From here, you can now copy the relevant files out to where avr-gcc will find them:

$ sudo cp include/avr/iotn?*1[2467].h /usr/lib/avr/include/avr/
$ sudo cp gcc/dev/attiny?*1[2467]/avrxmega3/*.{o,a} /usr/lib/avr/lib/avrxmega3/
$ sudo cp gcc/dev/attiny?*1[2467]/avrxmega3/short-calls/*.{o,a} /usr/lib/avr/lib/avrxmega3/short-calls/
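
If the glob in those commands looks cryptic: it selects names with at least one character after the prefix, followed by a 1 and then one of 2, 4, 6 or 7. Python's fnmatch module treats these particular wildcards the same way as the shell does, so it can be used to check which names the pattern picks up. The list of header names below is just a handful of examples chosen for this sketch, mixing new parts with older ones.

```python
from fnmatch import fnmatchcase

PATTERN = "iotn?*1[2467].h"

# A few example header names; the last three belong to older ATtiny parts.
names = [
    "iotn212.h",
    "iotn814.h",
    "iotn1614.h",
    "iotn3217.h",
    "iotn85.h",
    "iotn2313.h",
    "iotn167.h",
]

matched = [name for name in names if fnmatchcase(name, PATTERN)]
print(matched)
```

Only the 1-series names are selected, so the older headers in the same directory are left alone.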

Finally, there's one last task that needs doing. Locate the main avr/io.h file (it should live in /usr/lib/avr/include) and add the following lines somewhere within the main block of similar lines. These are needed to redirect from the toplevel #include <avr/io.h> towards the device-specific file.

#elif defined (__AVR_ATtiny212__)
#  include <avr/iotn212.h>
#elif defined (__AVR_ATtiny412__)
#  include <avr/iotn412.h>
#elif defined (__AVR_ATtiny214__)
#  include <avr/iotn214.h>
#elif defined (__AVR_ATtiny414__)
#  include <avr/iotn414.h>
#elif defined (__AVR_ATtiny814__)
#  include <avr/iotn814.h>
#elif defined (__AVR_ATtiny1614__)
#  include <avr/iotn1614.h>
#elif defined (__AVR_ATtiny3214__)
#  include <avr/iotn3214.h>
#elif defined (__AVR_ATtiny416__)
#  include <avr/iotn416.h>
#elif defined (__AVR_ATtiny816__)
#  include <avr/iotn816.h>
#elif defined (__AVR_ATtiny1616__)
#  include <avr/iotn1616.h>
#elif defined (__AVR_ATtiny3216__)
#  include <avr/iotn3216.h>
#elif defined (__AVR_ATtiny417__)
#  include <avr/iotn417.h>
#elif defined (__AVR_ATtiny817__)
#  include <avr/iotn817.h>
#elif defined (__AVR_ATtiny1617__)
#  include <avr/iotn1617.h>
#elif defined (__AVR_ATtiny3217__)
#  include <avr/iotn3217.h>
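
Since those lines all follow one fixed pattern, they can be generated rather than typed by hand. A small Python sketch that prints them, ready for pasting into io.h, might look like this; io_h_lines is a name invented here, and the part list is simply copied from the block above.

```python
# Part numbers in the same order as the block above.
PARTS = [
    "212", "412",
    "214", "414", "814", "1614", "3214",
    "416", "816", "1616", "3216",
    "417", "817", "1617", "3217",
]


def io_h_lines(parts):
    """Build the #elif / #include pair for each part number."""
    lines = []
    for part in parts:
        lines.append(f"#elif defined (__AVR_ATtiny{part}__)")
        lines.append(f"#  include <avr/iotn{part}.h>")
    return lines


print("\n".join(io_h_lines(PARTS)))
```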

Having done this we find we can now compile firmware for these new chips:

$ avr-gcc -std=gnu99 -Wall -Os -DF_CPU=20000000 -mmcu=attiny814 -flto -ffunction-sections -fshort-enums -o .build/firmware.elf src/main.c
$ avr-size .build/firmware.elf
   text    data     bss     dec     hex filename
   3727      30     105    3862     f16 .build/firmware.elf
$ avr-objcopy -j .text -j .rodata -j .data -O ihex .build/firmware.elf firmware-flash.hex

Next post I'll write more about my opinions on these chips, highlighting some of the newer features and changes.