Feels like a version- and/or platform-specific bug... i386 Debian with an ancient perl (5.005_03) returns 11 for the plain (non-utf8) statement (I don't have utf8.pm installed).
Just toss another platform into the mix:
ActiveState perl 5.8.7 (build 813) on XP returns 11 for the non-utf8 case and
7!! for the use utf8 case. I had to muck a bit with the quotes and parens to get it accepted; this is what I ran:
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "{print(length 'Cú Chullain');}"
11
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'Cú Chullain');}"
7
Inserting extra [normal] chars in various places in the string does get noticed:
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'Cú1 1Chullain');}"
9
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'C1ú1 1Chullain');}"
10
But when I put extra ú characters in various places, it sometimes behaves correctly and adds 1; elsewhere I get a net negative change... and sometimes an error message. I can't see a pattern as to when or why it happens.
Some samples:
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'Cú Cúhullain');}"
8
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'úCú Cúhullain');}"
5
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'úCú Cúhuúllain');}"
6
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'úCú Cúhuúllainú');}"
Malformed UTF-8 character (unexpected end of string) at -e line 1.
6
C:\Documents and Settings\tempsch.XP-TEMPSCH>perl -e "use utf8;{print(length 'úCú CúhuúllainúA');}"
Malformed UTF-8 character (unexpected end of string) at -e line 1.
6
Very funky...
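For what it's worth, here is one guess that happens to reproduce every number above. It is only a hypothesis, not something confirmed against the perl source: suppose the console hands perl the ú as the single byte 0xFA (as in a Windows-1252-style code page), and under use utf8 perl reads 0xFA as the lead byte of an old-style 5-byte UTF-8 sequence, so the ú swallows the four bytes after it and the whole lot counts as one character. The sketch below models just that; skip_for and modelled_length are made-up helper names, and stopping at a truncated sequence is an assumption chosen to match the "Malformed UTF-8" cases.

```perl
use strict;
use warnings;

# Sequence length implied by a lead byte under the old extended UTF-8
# scheme (up to 6 bytes). Simplification: plain ASCII, continuation
# bytes and 0xFE/0xFF are all treated as length 1 here.
sub skip_for {
    my ($byte) = @_;
    return 1 if $byte < 0xC0 || $byte > 0xFD;
    return 2 if $byte < 0xE0;
    return 3 if $byte < 0xF0;
    return 4 if $byte < 0xF8;
    return 5 if $byte < 0xFC;
    return 6;
}

# Count "characters" by hopping from lead byte to lead byte.
sub modelled_length {
    my ($bytes) = @_;
    my ($pos, $count) = (0, 0);
    my $len = length $bytes;
    while ($pos < $len) {
        my $skip = skip_for(ord substr($bytes, $pos, 1));
        last if $pos + $skip > $len;   # truncated sequence: stop (assumption)
        $pos += $skip;
        $count++;
    }
    return $count;
}

my $u = "\xFA";   # assumed single-byte encoding of the u-acute

# [ test string, length reported in the post with "use utf8" ]
my @cases = (
    [ "C${u} Chullain",                    7  ],
    [ "C${u}1 1Chullain",                  9  ],
    [ "C1${u}1 1Chullain",                 10 ],
    [ "C${u} C${u}hullain",                8  ],
    [ "${u}C${u} C${u}hullain",            5  ],
    [ "${u}C${u} C${u}hu${u}llain",        6  ],
    [ "${u}C${u} C${u}hu${u}llain${u}",    6  ],
    [ "${u}C${u} C${u}hu${u}llain${u}A",   6  ],
);

for my $case (@cases) {
    my ($str, $observed) = @$case;
    my $got = modelled_length($str);
    (my $label = $str) =~ s/\xFA/<FA>/g;   # keep the output pure ASCII
    printf "%-32s observed %2d, modelled %2d  %s\n",
        $label, $observed, $got, ($got == $observed ? "ok" : "MISMATCH");
}
```

If that guess is right, the "net negative" cases are just a ú eating four neighbours, and the error shows up whenever a ú sits within four bytes of the end of the string.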
I think one can make perl check its own version and the library versions at compile time, but I don't recall how, nor whether you can do a conditional
use or not... been a while...
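A minimal sketch of one way a conditional use can be done, as far as I remember it: since use Module is roughly BEGIN { require Module; Module->import }, the same two calls can be wrapped in an if inside a BEGIN block. The 5.006 cutoff and the $have_utf8 flag are illustrative assumptions here, not something from the tests above.

```perl
use strict;

my $have_utf8;   # no initializer on purpose: the BEGIN block sets it at compile time

BEGIN {
    # $] is the running interpreter's version as a number, e.g. 5.008007.
    # The eval guards against utf8.pm not being installed at all.
    if ($] >= 5.006 && eval { require utf8; 1 }) {
        utf8->import;      # same effect as "use utf8" for the enclosing scope
        $have_utf8 = 1;
    }
}

print "perl $]: utf8 pragma ", ($have_utf8 ? "enabled" : "skipped"), "\n";
```

On perls that ship the if pragma, use if $] >= 5.006, 'utf8'; should be a shorter way to say the same thing.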