P5-lingua-zh-handetect

Jul 20, 2023

Guess Chinese text’s variant and encoding

Lingua::ZH::HanDetect uses statistical measures to determine whether a text string is in Traditional or Simplified Chinese, and which encoding it uses.

If the string does not contain Chinese characters, both the encoding and variant values will be set to the empty string.

This module is needed because the various encodings for Chinese text tend to occupy similar byte ranges, rendering Encode::Guess ineffective.
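The byte-range overlap can be seen with a short Python sketch. It is only an illustration of the problem, not of how the module works internally; the sample string and the helper names `lead_bytes` and `in_shared_range` are assumptions chosen for this example.

```python
# Illustration only: shows why byte ranges alone cannot tell GBK from Big5.
# This is not the algorithm used by Lingua::ZH::HanDetect.

text = "中文"  # sample string (an assumption for this sketch)

gbk_bytes = text.encode("gbk")    # encoding commonly used for Simplified Chinese
big5_bytes = text.encode("big5")  # encoding commonly used for Traditional Chinese


def lead_bytes(data):
    """Return the first byte of each two-byte character in data."""
    return [data[i] for i in range(0, len(data), 2)]


def in_shared_range(byte):
    """True if the byte lies in 0x81-0xFE, the high range both encodings draw lead bytes from."""
    return 0x81 <= byte <= 0xFE


print("GBK :", gbk_bytes.hex())
print("Big5:", big5_bytes.hex())

# Lead bytes from both encodings land in the same high-byte range.
print(all(in_shared_range(b) for b in lead_bytes(gbk_bytes)))
print(all(in_shared_range(b) for b in lead_bytes(big5_bytes)))

# Decoding GBK bytes as Big5 yields different text rather than the original,
# so a wrong guess silently corrupts the string.
misread = gbk_bytes.decode("big5", errors="replace")
print(misread != text)
```

Because the bytes look equally plausible under either encoding, a detector has to weigh how likely the decoded characters are, which is where statistical measures come in.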


