Archive for May 11th, 2008

Similar Posts and Pentecost

Similar Posts v.2.5b28 has just been posted.

Working on Similar Posts, I have learned more than I care to know about the vagaries of MySQL, PHP, and Unicode. One particular issue that has so far resisted my attempts is the satisfactory handling of content in Chinese, Japanese, or Korean (CJK).

Similar Posts uses the full-text indexes provided by MySQL to compare one post with another, and the MySQL index is word-based. The CJK languages (I am told) are not based on discrete words, or at least not on words delimited by ‘white space’, so they pose a big problem for full-text indexing.
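
To make the problem concrete, here is a small Python sketch of what a word-based splitter does with unspaced text. It is only an illustration: `naive_words` is a hypothetical helper that splits on whitespace and basic punctuation, not MySQL's actual full-text parser, and the Chinese phrase is just a sample I picked for the demonstration.

```python
import re


def naive_words(text):
    """Split text into 'words' on runs of whitespace and basic punctuation.

    A simplified stand-in for a word-based indexer (an assumption for
    illustration, not how MySQL really tokenizes).
    """
    return [w for w in re.split(r"[\s.,;:!?]+", text) if w]


english = "full text search compares posts"
chinese = "全文搜索比较文章"  # sample phrase written without any spaces

english_words = naive_words(english)
chinese_words = naive_words(chinese)

print(len(english_words), english_words)
print(len(chinese_words), chinese_words)
```

The English sentence breaks into five separate terms that can each be matched against other posts, while the whole Chinese phrase comes back as a single "word", which is of little use for finding similar content.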

(more…)

2 comments May 11th, 2008

