2020/12/09

2020 Perl Advent Calendar - Day 9

Yesterday we saw some ways to write concurrent asynchronous code which waits on a few different tasks to complete. Sometimes we want to do the same thing multiple times concurrently but with different data each time. Often it's the case that each item can be processed independently of the others, so it makes sense to try to do several at once.

One approach is to simply await a future for each individual item inside a regular foreach loop. This processes only one item at a time, so it makes no use of concurrency.

## A poor idea for iterating a list ##
use Future::AsyncAwait;

foreach my $item (@ITEMS) {
    await PROCESS($item);
}

Another idea is to use map to apply an asynchronous function to every item in the list, thus starting all the items at once, and then wait on all those futures using needs_all. This may end up being too concurrent: if the list contained many thousands of items we might do too many at once and overload whatever external service or system we are talking to.

## Another poor idea ##
use Future::AsyncAwait;

await Future->needs_all(
    map { PROCESS($_) } @ITEMS
);
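To make the "too concurrent" problem visible, here is a minimal sketch that counts how many items have been started before any of them finishes. It uses no event loop: the hypothetical process_item function simply hands back a pending future, which we complete by hand afterwards. The names process_item, $started and @pending are invented for this illustration.

```perl
use strict;
use warnings;
use Future;

my $started = 0;
my @pending;    # futures that have been started but not yet completed

# Hypothetical stand-in for PROCESS(): starts "work" and returns a pending future
sub process_item {
    my ($item) = @_;
    $started++;
    my $f = Future->new;
    push @pending, $f;
    return $f;
}

# map calls process_item for every element before needs_all is even invoked
my $all = Future->needs_all( map { process_item($_) } 1 .. 1000 );

print "Started before any finished: $started\n";

# Complete everything by hand so the combined future can finish
$_->done for @pending;
$all->get;
```

Every one of the 1000 items is already in flight before the first one completes, which is exactly what we want to avoid when each item is a request to some remote service.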

A better middle ground between these two extremes was introduced in the original blog series on day 18, in the form of the Future::Utils::fmap collection of helper functions. The basic idea of fmap is that it is a future-aware equivalent of Perl's map operator. The fmap function is given a block of code which is expected to return a future, and a list of items. It invokes the code block once for each item in the list, collecting and waiting on the returned futures until all the items are done. fmap returns a future to represent the entire operation, which will complete with the results of each individual item.

The fmap family actually contains three individual functions, which all operate in the same basic manner. The difference between them is in how they handle return values from each individual item block: fmap_concat can handle an entire list from each item and concatenates all the results together for its overall result, fmap_scalar expects exactly one result per item, and fmap_void does not collect any results at all, running the code block simply for its side-effects.
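The difference between the three is perhaps easiest to see side by side. This is only a small sketch: each block returns an already-completed future (Future->done) in place of real asynchronous work, so it runs without any event loop, and the variable names are invented for the illustration.

```perl
use strict;
use warnings;
use Future;
use Future::Utils qw( fmap_concat fmap_scalar fmap_void );

# fmap_concat: each item may yield a list of any length; the lists are
# joined together in input order
my $concat_f = fmap_concat {
    my ($n) = @_;
    Future->done( $n, $n * 10 );
} foreach => [ 1, 2, 3 ];
my @pairs = $concat_f->get;          # ( 1, 10, 2, 20, 3, 30 )

# fmap_scalar: each item yields exactly one value
my $scalar_f = fmap_scalar {
    my ($n) = @_;
    Future->done( $n * $n );
} foreach => [ 1, 2, 3 ];
my @squares = $scalar_f->get;        # ( 1, 4, 9 )

# fmap_void: run purely for side-effects; no results are collected
my @seen;
my $void_f = fmap_void {
    my ($n) = @_;
    push @seen, $n;
    Future->done;
} foreach => [ 1, 2, 3 ];
my @nothing = $void_f->get;          # empty list

print "concat: @pairs\n";
print "scalar: @squares\n";
print "void saw: @seen\n";
```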

Since fmap expects a future-returning function and itself returns a future, it is also ideal for use with async/await syntax. It can be invoked in an await expression and passed an async sub to operate on.

This somewhat-paraphrased example uses the GET method of a Net::Async::HTTP user agent object to concurrently fetch and JSON-decode a collection of data from multiple API endpoints of some remote service.

use feature 'signatures';
use Future::AsyncAwait;
use Future::Utils qw( fmap_scalar );
use JSON::MaybeUTF8 qw( decode_json_utf8 );

use Net::Async::HTTP;
my $ua = Net::Async::HTTP->new;
...

my @urls = ...

my @data = await fmap_scalar(async sub ($url) {
    my $response = await $ua->GET($url);
    return decode_json_utf8($response->content);
}, foreach => \@urls, concurrent => 16);

A few things should be noted about this example. First is that the async sub syntax is used explicitly to create an asynchronous function to pass as the first argument to the fmap_scalar function. Second is the use of the concurrent parameter, telling the function how many items to keep running concurrently. Finally, the list of items has to be passed in an array reference rather than a plain flat list.
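To see what the concurrent parameter actually does, we can record the peak number of items in flight. The sketch below is an illustration rather than realistic code: it uses plain Future objects with no event loop, and a hand-written loop that completes the oldest outstanding future stands in for responses arriving from the network. The names @pending, $in_flight and $peak are invented for this example.

```perl
use strict;
use warnings;
use Future;
use Future::Utils qw( fmap_void );

my @pending;          # futures started but not yet completed
my $in_flight = 0;    # how many items are currently running
my $peak      = 0;    # the most we ever saw running at once

my $all = fmap_void {
    my ($item) = @_;
    $in_flight++;
    $peak = $in_flight if $in_flight > $peak;

    # Pretend to start some asynchronous work for this item
    my $f = Future->new;
    push @pending, $f;
    return $f;
} foreach => [ 1 .. 10 ], concurrent => 3;

# Stand-in for an event loop: finish the oldest item; fmap_void then
# starts the next one to fill the free slot
while( my $f = shift @pending ) {
    $in_flight--;
    $f->done;
}

$all->get;
print "Peak concurrency: $peak\n";
```

Although there are ten items, no more than three are ever outstanding at once; each time one completes, the next is started in its place.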

These facts all come about because the fmap functions are just plain Perl functions and not special syntax, as opposed to the real syntax provided by the async and await keywords. Whereas these two keywords were inspired by a whole collection of other languages which have all adopted them as a standard pattern, there is not much existing design to draw on for bounded-concurrency map-style syntax. This particular area remains a matter of ongoing design and discussion. Thoughts welcome ;)
