Yes - this is how it is implemented, but IMO this is a bug.
The description clearly states that it should return title-case text, i.e. mixed case, and not just capitalise the first letter of each word.
I wasn’t sure why it does not just use the Python 3 string title() method (which handles Unicode) until I tested it with “shamboni’s”, which title() turns into “Shamboni’S”.
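To illustrate the problem, here is a minimal check of the built-in behaviour. The helper name `naive_title` is mine, purely for illustration, and is not anything in Picard:

```python
def naive_title(text):
    # str.title() restarts capitalisation after every non-letter,
    # so an apostrophe counts as the start of a new "word".
    return text.title()

print(naive_title("shamboni's"))              # Shamboni'S
print(naive_title("they're bill's friends"))  # They'Re Bill'S Friends
```

The same thing happens with the typographic apostrophe (U+2019), since it is not a cased character either.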
I am also unsure whether the iswbound function is correct, but at present I am guessing not. You shouldn’t normally get e.g. paragraph-boundary characters in a Picard variable (though they could appear in e.g. lyrics), but if you do, they should be treated as a word boundary. However, iswbound checks for Unicode modifier symbols and treats them as a word boundary, which IMO it shouldn’t. In fact, I wonder how modifier symbols should be treated at all, and whether the character-by-character algorithm can work properly with Unicode, where a single visible character can actually be several code points (a base character plus modifiers).
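A small sketch of why this worries me, using only the standard unicodedata module. I am assuming here that "modifier symbol" means the Unicode general category Sk; the helper name `categories` is hypothetical and this is not Picard's code:

```python
import unicodedata

def categories(text):
    # General category of every code point in the string.
    return [unicodedata.category(ch) for ch in text]

# "é" in decomposed form is two code points: "e" plus a combining acute accent.
decomposed = unicodedata.normalize("NFD", "\u00e9")
print(len(decomposed))         # 2
print(categories(decomposed))  # ['Ll', 'Mn']

# Recomposing gives a single code point again.
print(len(unicodedata.normalize("NFC", decomposed)))  # 1

# Spacing modifier symbols such as "^" and "`" are category Sk,
# which is different from combining marks (Mn).
print(categories("^`"))        # ['Sk', 'Sk']
```

So a loop that looks at one code point at a time sees the accent as a separate "character", and whether that matters depends on which categories are treated as boundaries.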
If I were coding this from scratch, I would use the Python 3 title() method, and then use a regex replacement to find places where a quotation mark (apostrophe) is preceded by a Unicode letter and followed by a Unicode upper-case or title-case letter (a digraph with its first part upper-case), and replace the latter with its lower-case equivalent.
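Roughly what I have in mind, as an untested-in-Picard sketch. The function name `title_case` and the choice of which apostrophes to match are my assumptions. The standard re module has no \p{L}, so `[^\W\d_]` stands in for "any Unicode letter", and the replacement simply lower-cases whatever letter follows the apostrophe:

```python
import re

# A letter that directly follows "letter + apostrophe".
# Both the ASCII apostrophe and the typographic one (U+2019) are accepted.
_AFTER_APOSTROPHE = re.compile(r"(?<=[^\W\d_]['’])[^\W\d_]")

def title_case(text):
    # Step 1: let Python do the Unicode-aware title casing.
    titled = text.title()
    # Step 2: undo the capital that title() puts after an apostrophe.
    return _AFTER_APOSTROPHE.sub(lambda m: m.group().lower(), titled)

print(title_case("shamboni's"))              # Shamboni's
print(title_case("they're bill's friends"))  # They're Bill's Friends
```

Note that this also lower-cases names like O'Brien, so it is only a starting point and would need exceptions for cases like that.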
Picard does have some unit tests for $title, but clearly they are not good enough. IMO Picard also needs a slew of obscure unit tests for string functions like $title, to check that they handle Unicode correctly for all languages and for obscure Unicode features like digraphs and modifiers.
Apologies, but I don’t have time to fix this at present.