Well, not just letters of the alphabet, it seems.
Take the case of the Logstash grok pattern WORD:
WORD \b\w+\b
but the shorthand character class \w matches [a-zA-Z0-9_] – notice the digits and underscore! So WORD is not really a WORD!
REALWORD \b[a-zA-Z]+\b
would be better … although I suppose things are different in Unicode, where [a-zA-Z] misses accented and non-Latin letters. Still, while log files may be encoded in Unicode, the data itself is frequently still effectively ASCII.
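
To see the difference concretely, here is a minimal sketch in Python. It is only an approximation: Logstash grok uses the Oniguruma regex engine, not Python's re module, so re.ASCII is used here to force \w to mean exactly [a-zA-Z0-9_]. The helper full_matches and the sample tokens are made up for illustration.

```python
import re

# The two patterns from above, compiled with Python's re module.
# Assumption: re.ASCII restricts \w (and \b) to ASCII, i.e. \w == [a-zA-Z0-9_].
WORD = re.compile(r"\b\w+\b", re.ASCII)
REALWORD = re.compile(r"\b[a-zA-Z]+\b", re.ASCII)


def full_matches(pattern, tokens):
    """Return the tokens that the pattern matches in their entirety."""
    return [t for t in tokens if pattern.fullmatch(t)]


# Hypothetical tokens of the kind that turn up in log lines.
tokens = ["error", "user_42", "404", "Timeout"]

word_hits = full_matches(WORD, tokens)
realword_hits = full_matches(REALWORD, tokens)

print(word_hits)      # ['error', 'user_42', '404', 'Timeout']
print(realword_hits)  # ['error', 'Timeout']
```

WORD accepts the identifier and the bare number as "words", while REALWORD keeps only the purely alphabetic tokens.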