XS beats Pure Perl

Someone reported test failures while trying to install Tickit, which seemed to be related to shortcomings in Text::CharWidth. That module seems to have very poor unit test coverage of its own, so the failures didn't appear during its installation, only when Tickit::Utils was tested against it. On initial inspection I wondered whether Text::CharWidth simply wasn't using wcswidth(3) correctly, and whether I should finally get around to my plan of rewriting bits of Tickit::Utils in XS, both for performance and to work around this bug.

This turned out to be quite a good idea. Implementing cols2chars() and chars2cols() in XS instead of Perl makes them at least 10 times faster. I tested it on four strings, two ASCII and two Unicode, a long and a short of each.

In fact, in some cases it turns out to be 24 times faster.

I haven't looked in much detail at why, but I suspect a large part of the reason is that the XS functions walk along the internal UTF-8 representation of the strings, counting bytes, characters, and columns as they go, and returning the appropriate count(s) when required. The pure-Perl implementation doesn't have direct access to the byte offsets, so it only has character numbers to work with. The frequent character-to-byte or byte-to-character conversions at the boundaries between functions mean the UTF-8 bytes get skip-counted along the string again each time a function is entered or left, which slows everything down.
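To illustrate the single-pass idea, here is a minimal C sketch. It is not the actual Tickit::Utils XS code: the names `struct pos`, `toy_width()`, `decode_utf8()` and `walk_to_col()` are all made up for this example, `toy_width()` is a tiny and incomplete stand-in for wcwidth(3) so that the result doesn't depend on the locale, and the input is assumed to be valid, NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Position reached while walking a string, tracked in all three units. */
struct pos {
    size_t bytes;
    size_t chars;
    size_t cols;
};

/* Hypothetical stand-in for wcwidth(3): a small, incomplete width table. */
static int toy_width(uint32_t cp)
{
    if (cp >= 0x0300 && cp <= 0x036F)
        return 0;                       /* combining marks */
    if ((cp >= 0x1100 && cp <= 0x115F) ||
        (cp >= 0x2E80 && cp <= 0xA4CF) ||
        (cp >= 0xAC00 && cp <= 0xD7A3) ||
        (cp >= 0xFF00 && cp <= 0xFF60))
        return 2;                       /* some East Asian wide ranges */
    return 1;
}

/* Decode one code point; assumes s points at valid UTF-8. */
static uint32_t decode_utf8(const unsigned char *s, size_t *len)
{
    if (s[0] < 0x80) {
        *len = 1;
        return s[0];
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *len = 2;
        return ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *len = 3;
        return ((uint32_t)(s[0] & 0x0F) << 12) |
               ((uint32_t)(s[1] & 0x3F) << 6) |
                (uint32_t)(s[2] & 0x3F);
    }
    *len = 4;
    return ((uint32_t)(s[0] & 0x07) << 18) |
           ((uint32_t)(s[1] & 0x3F) << 12) |
           ((uint32_t)(s[2] & 0x3F) << 6) |
            (uint32_t)(s[3] & 0x3F);
}

/* Walk forward while the next character still fits within target_col.
 * One pass yields the byte, character and column counts together. */
struct pos walk_to_col(const char *str, size_t target_col)
{
    const unsigned char *s = (const unsigned char *)str;
    struct pos p = { 0, 0, 0 };

    assert(str != NULL);
    while (s[p.bytes] != '\0') {
        size_t len;
        uint32_t cp = decode_utf8(s + p.bytes, &len);
        size_t w = (size_t)toy_width(cp);

        if (p.cols + w > target_col)
            break;                      /* next character would overshoot */
        p.bytes += len;
        p.chars += 1;
        p.cols  += w;
    }
    return p;
}
```

A caller that wants the character count for a given column reads `.chars` from the result, and one that wants to slice the buffer reads `.bytes`. Neither needs a second scan of the string to convert between the two.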

As for the original test failure, it turned out to be entirely unrelated: a lack of locale support in the platform's libc. The XS implementations fail there in the same way. But having implemented the above improvements, I decided to leave them in anyway.

XS faster than Pure Perl; who'd have thought it?
