Why does this release of Similar Posts, Recent Posts, Random Posts and Recent Comments get a clean new number instead of being called a beta? I guess it’s arbitrary! I can’t stop tinkering, so in a sense the plugins are perpetually in beta, but on the other hand bug reports are drying up (apart from the ones I keep introducing!).

I want to concentrate some more on Similar Posts for a while and try out some ideas that have emerged from reading on the subject. It is a challenge to find good matching algorithms that can be made speedy enough in PHP and not too demanding in memory usage. A few months ago I had a great system running but had to drop it when I moved from my local development server to my live hosted server, which immediately froze from lack of memory.

The current release improves the Chinese/Japanese/Korean language matching by searching on bigrams instead of single characters. There are some much better algorithms in the literature but they cost too much in time and memory to be useful here. I hope the bigram scheme is good enough to be useful.
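
To make the bigram idea concrete, here is a minimal sketch in Python rather than the plugin's PHP. It is not the plugin's actual code: the function names and the rough choice of Unicode ranges are assumptions made for illustration. Each run of CJK characters is broken into overlapping two-character tokens, and other text is ignored.

```python
def is_cjk(ch: str) -> bool:
    """Rough test covering only the main ideograph, kana and hangul blocks (an assumption)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul Syllables
    )


def cjk_bigrams(text: str) -> list[str]:
    """Return overlapping two-character tokens for every run of CJK characters."""
    tokens = []
    run = ""
    # The trailing space is a sentinel that flushes a run ending at the end of the text.
    for ch in text + " ":
        if is_cjk(ch):
            run += ch
            continue
        if len(run) == 1:
            tokens.append(run)  # a lone character is kept as its own token
        else:
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        run = ""
    return tokens


print(cjk_bigrams("東京都 hello"))
```

A run of n characters yields n - 1 bigrams, so the index grows by roughly the same number of entries as with single characters, but each entry is far more selective.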

Monday Week 6 Year II

Edel McClean offers these reflections:

Readings: James 1:1-11, Psalm 118, Mark 8:11-13

The liturgical title for today is Monday of the Sixth Week of Ordinary Time. Ordinary Time. A quick look at ordinary in the dictionary tells us ‘unexceptional, plain, uninteresting’. It seems a little like what the Pharisees are accusing Jesus of in our gospel today. They seem to think he’s a little too ordinary and they come demanding a sign. Prove to us that you’re exceptional. Give us something remarkable. Do something out of the ordinary. And then we’ll believe you.

Of course, what the Pharisees were getting was anything but ordinary. They were getting a sign. They had Jesus. Standing slap bang in front of them. Not just any old preacher, but, if we follow Mark’s gospel, a man who had just healed a young child, made a deaf man hear, and fed four thousand people. And still the Pharisees say, we want more. They’re unable to see the sign right there in front of them.

The question is, I suppose, what are the signs right there in front of us? We listened to Gerard Manley Hopkins’ poem yesterday, of kingfishers and dragonflies. And we can look at moments of beauty and something in us knows they’re a sign. We catch the softness in an older person’s eye as they tell of someone they once loved, and something in us knows it’s a sign. We see a young couple stand in front of a church full of people and, with their hearts pounding, promise themselves to each other for ever. Something in us knows this is a sign. We see a child dancing barefoot, graceful and unselfconscious and we know, it’s a sign. We see a cherry tree, bursting into flower, singing and dancing its colour to the world and, if we take the time to notice, it makes our hearts sing and dance too. And we know that this too, is a sign of something beyond what we can grasp.

Older people, young couples, children, cherry trees. They all belong to ordinary time. But they’re extraordinary too. Because Jesus, it seems, has no desire to be confined to ‘special’ times, but comes to meet us, to grace our lives, right in the middle of the ordinary.

And perhaps we recognise too, those moments in ourselves. The glory of God is a person fully alive, which is another way of saying, the glory of God is a person being who God’s called them to be, which is another way of saying the glory of God is a person being fully themselves. The moment when our hearts sing. The moment when we are so fully ourselves that God shines through us. The moments when, like the old person, or the young couple, like the child or the cherry tree, we are so fully what we’re meant to be, that others look at us and see in us a sign, and know God is right there with them. What the world needs most, perhaps, is our having the courage to be ourselves – to bring our true, unique, God-given selves out from the shadows and allow them to shine. To let our very lives, which belong in this, ordinary time, be signs of God’s grace, touching the ordinary, and setting it dancing.

Similar Posts and Pentecost

Similar Posts v.2.5b28 has just been posted.

Working on Similar Posts I have learned more than I care to know about the vagaries of MySQL, PHP, and Unicode. One particular issue that has so far resisted my attempts has been the satisfactory handling of content in Chinese, Korean, or Japanese (CJK).

Similar Posts uses the full-text indexes provided by MySQL to compare one post with another and the MySQL index is word-based. The CJK languages (I am told) are not based on discrete words — at least not words delimited by ‘white space’ — so they pose a big problem to full-text indexing.

My workaround (hack?/fiddle?/trick?) is to separate the CJK text into individual characters (while leaving single-byte encoded text alone) and use them as the basis for similarity matching. It is clearly not an ideal solution but I would love to hear from the users of WordPress blogs in Chinese, Korean, or Japanese if it is better than no solution at all.

The experiment has a couple of limitations. First, although UTF-8 is not the ideal encoding for CJK languages, this method works for now only on blogs that use it. Second, to get around MySQL’s habit of ignoring words shorter than 4 characters, each CJK ‘word’ is padded to that length, making for a rather large index.
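
The splitting and padding described above might be sketched as follows, in Python rather than the plugin's PHP. This is only an illustration under stated assumptions: the post does not say how the padding is done, so repeating the character is a guess, and the function names and Unicode ranges are hypothetical.

```python
MIN_WORD_LEN = 4  # the minimum word length mentioned in the post


def is_cjk(ch: str) -> bool:
    """Rough test covering only the main ideograph, kana and hangul blocks (an assumption)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul Syllables
    )


def pad_cjk_chars(text: str, min_len: int = MIN_WORD_LEN) -> str:
    """Turn each CJK character into a space-delimited 'word' of min_len characters.

    Padding by repetition is an assumption; other text is left alone apart
    from runs of whitespace being collapsed to single spaces.
    """
    pieces = []
    for ch in text:
        if is_cjk(ch):
            pieces.append(" " + ch * min_len + " ")
        else:
            pieces.append(ch)
    return " ".join("".join(pieces).split())


print(pad_cjk_chars("東京 blog"))
```

Every CJK character becomes four characters plus a separator, which shows why the resulting index is so much larger than the original text.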

To try this approach, use the Settings | Similar Posts | Manage the Index screen, set the option, and re-index. This setting overrides the other settings on that screen.

The reference to Pentecost in this post’s title is because today’s Feast celebrates the undoing of Babel.

Some Big(-gish) Changes to the Post Plugins

Since bug reports for the post plugins have dwindled to a slow drip I thought I ought to make some substantial changes in the latest version (2.5b25) and get the flow going again…

In previous versions the {image} tag did its resizing just by changing the <img> tag’s width and height attributes. Now, in addition, it serves properly resized thumbnails.

For a long time Recent Comments has allowed you to sort its output by post or by commenter and include group headings in the output list. This capacity has been generalised and extended to all the post plugins. The plugin output can now be sorted according to any output tag. For example, you might want to sort similar posts by date rather than score; to do so, sort by {date:raw}.

Recent Comments retains the ‘group_by’ option (which simply overrides any other sort options) but other sort schemes can be used.
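
As a rough picture of what sorting by an output tag means, here is a small Python sketch (the plugins themselves are PHP). Everything in it is hypothetical: the sample posts, the idea of holding each item's tag values in a dictionary, the sortable date format and the sort_by_tag helper are assumptions for illustration, not the plugins' internals.

```python
# Hypothetical sample data: each item maps an output tag to its expanded value.
posts = [
    {"{title}": "Babel undone", "{score}": 9.1, "{date:raw}": "2008-05-11 09:00:00"},
    {"{title}": "Bigrams", "{score}": 7.4, "{date:raw}": "2008-06-02 18:30:00"},
    {"{title}": "Thumbnails", "{score}": 8.2, "{date:raw}": "2008-04-20 12:15:00"},
]


def sort_by_tag(items, tag, descending=False):
    """Return a new list ordered by the value each item has for the given tag."""
    return sorted(items, key=lambda item: item[tag], reverse=descending)


# Default-style ordering: best score first.
by_score = sort_by_tag(posts, "{score}", descending=True)

# Ordering by date instead, newest first; this assumes the raw date sorts as a string.
by_date = sort_by_tag(posts, "{date:raw}", descending=True)

print([p["{title}"] for p in by_score])
print([p["{title}"] for p in by_date])
```

The point of a raw form of the date is that it can be compared directly, whereas a formatted date such as "11 May 2008" would not sort chronologically.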

Both the changes above probably need some cleaning up so please mess about with them and find any bugs.

Finally, I have replaced the ‘trim_before’ option with the ‘divider’ option. This will break some existing output templates, I am sure, but they should be even easier to fix.

