LeoNerd's programming thoughts: XS beats Pure Perl

2011/07/15

XS beats Pure Perl

Someone reported some test failures when trying to install Tickit, which seemed to be related to shortcomings in Text::CharWidth. That module seems to have very poor unit test coverage of its own, so the failures didn't appear during its installation, only when Tickit::Utils was tested against it. On initial inspection I wondered whether Text::CharWidth simply wasn't using wcswidth(3) correctly, and whether I should finally get around to my plan of rewriting bits of Tickit::Utils in XS, both for performance and to work around this bug.

This turned out to be quite a good idea. Implementing cols2chars() and chars2cols() in XS instead of Perl makes them around 10 times faster, and never less than 8 times faster in my tests. I tested it on four strings: two ASCII and two Unicode, a long and a short of each (named along, ashort, ulong and ushort):

Function	String	PP calls/sec	XS calls/sec	Ratio (XS/PP)
chars2cols	along	48685	406504	834.97%
chars2cols	ashort	72674	704225	969.02%
chars2cols	ulong	37341	387596	1037.99%
chars2cols	ushort	52966	714285	1348.57%
cols2chars	along	16350	403255	2466.39%
cols2chars	ashort	58685	649350	1106.50%
cols2chars	ulong	13561	362318	2671.76%
cols2chars	ushort	50556	632911	1251.90%

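In fact, in some cases it turns out to be more than 24 times faster: cols2chars() on the two long strings comes out at roughly 25 and 27 times the pure Perl speed.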

I haven't looked in too much detail at why, but I suspect a large part of the reason is that the XS functions walk directly along the internal UTF-8 representation of the strings, counting bytes, characters, and columns as they go, and returning the appropriate count(s) when required. The pure Perl implementation doesn't have direct access to the byte offsets, so it only has character numbers to work with. The frequent character-to-byte and byte-to-character conversions at the boundaries between functions mean the UTF-8 bytes get skip-counted along the string each time a function is entered or left, which slows everything down.
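To illustrate the kind of single-pass walk I mean, here is a minimal sketch in plain C. It is not the actual Tickit::Utils XS code: the names sketch_width(), sketch_chars2cols() and sketch_cols2chars() are made up for this example, each takes a single position rather than a list, the input is assumed to be valid NUL-terminated UTF-8, and the width table is a crude stand-in for what wcwidth(3) would report.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: a heavily simplified, hypothetical width table standing in
 * for wcwidth(3). It only knows a few wide and zero-width ranges. */
static int sketch_width(uint32_t cp)
{
    if (cp >= 0x0300 && cp <= 0x036F)
        return 0;                          /* combining diacritical marks */
    if ((cp >= 0x1100 && cp <= 0x115F) ||  /* Hangul Jamo leading consonants */
        (cp >= 0x2E80 && cp <= 0xA4CF) ||  /* CJK radicals up to Yi (approximate) */
        (cp >= 0xAC00 && cp <= 0xD7A3) ||  /* Hangul syllables */
        (cp >= 0xF900 && cp <= 0xFAFF) ||  /* CJK compatibility ideographs */
        (cp >= 0xFF00 && cp <= 0xFF60))    /* fullwidth forms */
        return 2;
    return 1;
}

/* Decode one code point from valid UTF-8 at s and store its byte length
 * in *len. No validation is done; malformed input is not handled. */
static uint32_t decode_utf8(const unsigned char *s, size_t *len)
{
    if (s[0] < 0x80) {
        *len = 1;
        return s[0];
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *len = 2;
        return ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *len = 3;
        return ((uint32_t)(s[0] & 0x0F) << 12) |
               ((uint32_t)(s[1] & 0x3F) << 6) |
                (uint32_t)(s[2] & 0x3F);
    }
    *len = 4;
    return ((uint32_t)(s[0] & 0x07) << 18) |
           ((uint32_t)(s[1] & 0x3F) << 12) |
           ((uint32_t)(s[2] & 0x3F) << 6) |
            (uint32_t)(s[3] & 0x3F);
}

/* Columns occupied by the first nchars characters of str. */
size_t sketch_chars2cols(const char *str, size_t nchars)
{
    const unsigned char *p = (const unsigned char *)str;
    size_t chars = 0, cols = 0, len;

    while (*p && chars < nchars) {
        cols += (size_t)sketch_width(decode_utf8(p, &len));
        chars++;
        p += len;   /* advance by bytes; no character-to-byte lookup needed */
    }
    return cols;
}

/* Number of leading characters of str that fit within ncols columns. */
size_t sketch_cols2chars(const char *str, size_t ncols)
{
    const unsigned char *p = (const unsigned char *)str;
    size_t chars = 0, cols = 0, len;

    while (*p) {
        size_t w = (size_t)sketch_width(decode_utf8(p, &len));
        if (cols + w > ncols)
            break;  /* the next character would overflow the limit */
        cols += w;
        chars++;
        p += len;
    }
    return chars;
}
```

For example, sketch_chars2cols("\xe6\x97\xa5\xe6\x9c\xac", 2) walks six bytes once and counts two characters and four columns. The byte pointer, character count and column count all advance together in one loop, so nothing ever has to re-scan the string to turn a character index back into a byte offset.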

As for the original test failure, it turned out to be caused by something entirely unrelated: a lack of locale support in the platform's libc. The XS implementations fail there in the same way. But having implemented the above improvements, I decided to leave them in anyway.

XS faster than Pure Perl; who'd have thought it?
