php - Levenshtein distance on diacritic characters
In PHP, I'm calculating the Levenshtein distance using the levenshtein() function. It works as expected for simple characters, but for diacritic characters, for example
echo levenshtein('à', 'a');
it returns "2". In this case only one replacement has to be made, so I expected it to return "1".
Am I missing something?
The default PHP levenshtein(), like many PHP functions, is not multibyte aware. When it processes strings containing Unicode characters, it handles each byte separately, so a two-byte character compared with a one-byte character counts as two changes. There is no multibyte version (i.e. no mb_levenshtein()), so you have two options:
1) Re-implement the function yourself using the mb_ functions.
2) Convert the strings to plain ASCII first and then use the regular levenshtein(). If you specifically want to measure Levenshtein differences between diacritic characters and their non-diacritic counterparts, though, this may not be what you want, because those differences disappear in the conversion.
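As a rough illustration of the first option, here is a minimal sketch of a Levenshtein distance that compares Unicode code points instead of bytes. The function name mb_levenshtein_sketch is hypothetical (PHP does not ship a function with this name), and the sketch assumes valid UTF-8 input and PHP 7.4 or later with the mbstring extension, since it relies on mb_str_split(). It does not normalize combining characters, so a precomposed "à" and "a" followed by a combining accent are still treated as different sequences.

```php
<?php
// Hypothetical helper, not part of PHP: Levenshtein distance over
// Unicode code points rather than bytes. Assumes valid UTF-8 input.
function mb_levenshtein_sketch(string $a, string $b): int
{
    // Split each string into an array of single code points.
    $s = mb_str_split($a, 1, 'UTF-8');
    $t = mb_str_split($b, 1, 'UTF-8');
    $m = count($s);
    $n = count($t);

    // If one string is empty, the distance is the length of the other.
    if ($m === 0) {
        return $n;
    }
    if ($n === 0) {
        return $m;
    }

    // $prev holds the previous row of the dynamic-programming table.
    $prev = range(0, $n);
    for ($i = 1; $i <= $m; $i++) {
        $curr = [$i];
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($s[$i - 1] === $t[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,        // deletion
                $curr[$j - 1] + 1,    // insertion
                $prev[$j - 1] + $cost // substitution (or match)
            );
        }
        $prev = $curr;
    }

    return $prev[$n];
}

// "\u{00E0}" is the precomposed character "à" (two bytes in UTF-8).
echo levenshtein("\u{00E0}", 'a'), "\n";           // prints 2 (byte-wise)
echo mb_levenshtein_sketch("\u{00E0}", 'a'), "\n"; // prints 1 (code-point-wise)
```

Unlike the built-in, this sketch uses a fixed cost of 1 for every insertion, deletion, and replacement; custom costs would have to be added as extra parameters.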