Similar Posts: WordPress Plugin

Updated

13th January, 2007

Similar Posts Version 2.0.0 beta is now available. Largely rewritten, it includes better handling of extended character sets, some new options, and many more possible styles of display. Version 2 now has its own page where future developments will be documented.

Comments are now closed for this post but can be added to the new page.

  • Version 1.14 just fixes two bugs. The Post plugins were not working if more than one was installed! Also, some users were getting odd characters appearing where they shouldn’t.
  • Version 1.13 allows some parameters (like ‘before_title’) to be blank, or to be more complex, e.g., 'before_title=…'. It also allows trimming an excerpt so that it ends with a word or a sentence and not in mid-word. NB excerpt_length is now counted in characters and not words as previously.
  • Version 1.12 fixes a bug with the option to show static pages.
  • Version 1.11 improves the sanity checking of parameter values to avoid database errors. Also lets you skip over a number of posts if you so wish (though this makes more sense for the related Recent Posts plugin).
  • Version 1.10 adds the ability to exclude certain authors on a multi-author blog. The text to show if no matches are found is now customisable and there is a new option for displaying links. All the options can be set via the options page but new in this version they can also be specified via a query-style parameter. This gives the flexibility to use Similar Posts in several places with different behaviour. I have also built three new plugins which use the same infrastructure: Random Posts, Recent Posts, and Recent Comments.
  • Version 1.03 adds the ability to exclude static pages or certain categories of post. It also has an improved stopword list based on the one used by MySQL.
  • Version 1.02 is a bug-fix and security release. It protects against a vulnerability where stray characters in the matching terms could cause database errors. This update also fixes a potential naming conflict between the internal names of its options and those of other plugins and establishes sensible default values for these options. When you update the installation you should visit the options page and check the settings are to your liking.
  • Version 1.01 restores the ability to use keywords or otherwise tweak the terms used to find similar posts. It also allows you to import keywords previously assigned using the Related Posts plugin.
  • Version 1.00

Description

I’ve been looking for a better alternative to the Related Posts plugin for WordPress. At least better on a blog like mine … and maybe yours too. Related Posts finds matches to other posts using the post’s title or any keywords that you care to define. The titles of my posts unfortunately reflect their context rather than their content, and I have too many of them to go back and decide on keywords for them all. So I wrote a new plugin, Similar Posts, using a different algorithm which finds related posts based on the contents of a post and the pattern of word usage rather than just its title.

You can see it in operation in the ‘Related Reading’ section of my sidebar (on single-post pages).

Instructions

  1. Download the latest version of Similar Posts.
  2. Upload the whole plugin folder (Similar_Posts) to your /wp-content/plugins/ directory. (Similar Posts can be installed without uninstalling Related Posts if you want to try out the difference.)
  3. Go to your Admin|Plugins page and activate Similar Posts. This will automatically add an index to your posts table to enable fast matching. Don’t be alarmed if this takes a few moments.
  4. Put <?php similar_posts(); ?> at the place in your WP loop where you want the list of similar posts to appear. By default the plugin wraps each post with <li> and </li> but that can be changed. Use the Admin|Options|Similar Posts page to set how you want the list of posts displayed.

Acknowledgements

Similar Posts is based on Related Posts 2.02 by Alexander Malov and Mike Lu. I’ve also used some code from Rich Boakes and Ken Cheung.

Under the Hood

By default Similar Posts scans a post each time it needs to find words to match. For long posts this can add an unwanted overhead. The plugin can speed up the process by caching the search terms as a custom field (named ‘similarterms’). When a post is published for the first time or subsequently edited the custom field gets updated. Since you probably have a lot of posts you won’t want to edit each one manually to cache the terms. Instead, the Admin|Options|Similar Posts page lets you process all your posts in one go. It won’t overwrite any terms that are already cached so there is also a button to clear all terms.

While you are editing a post you can modify the ‘similarterms’ custom field to add keywords or replace the automatically generated terms altogether. Note that the field will not be visible until you have saved the post at least once. If you ever want to regenerate the default terms just delete the current set.
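The caching behaviour described above can be pictured with a small sketch. This is not the plugin's actual code: the array standing in for the ‘similarterms’ custom field, the function names extract_terms and terms_for_post, and the choice of the most frequent non-stop words as terms are all assumptions made for illustration.

```php
<?php
// Illustrative sketch only; none of these names come from the plugin itself.

// Assumption: terms are simply the most frequent words that are not stop words.
function extract_terms(string $content, array $stopWords, int $max = 20): string
{
    // Strip markup, lower-case, and split on anything that is not a letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower(strip_tags($content)), -1, PREG_SPLIT_NO_EMPTY);
    // Drop stop words, then count how often each remaining word occurs.
    $counts = array_count_values(array_diff($words, $stopWords));
    arsort($counts); // most frequent first
    return implode(' ', array_slice(array_keys($counts), 0, $max));
}

// $cache stands in for the custom field: post ID => cached terms.
function terms_for_post(int $postId, string $content, array &$cache, array $stopWords): string
{
    // Existing terms (automatic or hand-edited) are never overwritten;
    // a missing or empty entry is regenerated from the post content.
    if (!isset($cache[$postId]) || $cache[$postId] === '') {
        $cache[$postId] = extract_terms($content, $stopWords);
    }
    return $cache[$postId];
}

$cache = [];
$terms = terms_for_post(1, '<p>The cat sat. The cat ran.</p>', $cache, ['the']);
```

Deleting the entry for a post, as with deleting the custom field, makes the next call rebuild the default terms, while a hand-edited entry is returned untouched.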

If you have previously used the Related Posts plugin to assign keywords to posts you can now import them all from the Options page.

The Similar_Posts folder contains a file, ‘en.words.php’, which supplies the ‘stop list’ of common words that you want to exclude as search terms. It is supplied this way so that if your blog is in a different language you can use an alternative stop list. Similar Posts checks the WPLANG constant to see which language WordPress is using and looks for a file on that basis. For example, if the language code is ‘fr’ for French, Similar Posts will look for a file ‘fr.words.php’. The language files must be in the same directory as similar-posts.php.
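As a rough sketch of that lookup, assuming hypothetical helper names (stop_list_filename and stop_list_path are not taken from the plugin) and assuming a fallback to the English list when no language is set or no matching file exists:

```php
<?php
// Illustrative sketch only; these helper names are not taken from the plugin.

// Build the stop-list filename for a language code, e.g. 'fr' -> 'fr.words.php'.
function stop_list_filename(string $langCode): string
{
    // Assumption: an empty language code means English.
    return ($langCode === '' ? 'en' : $langCode) . '.words.php';
}

// Return the path of the stop list to load from the plugin directory.
function stop_list_path(string $pluginDir, string $langCode): string
{
    $dir = rtrim($pluginDir, '/');
    $candidate = $dir . '/' . stop_list_filename($langCode);
    // Assumption: fall back to the supplied English list if the file is missing.
    return file_exists($candidate) ? $candidate : $dir . '/en.words.php';
}

// WPLANG is normally defined in wp-config.php; outside WordPress it is absent.
$lang = defined('WPLANG') ? constant('WPLANG') : '';
$path = stop_list_path(__DIR__, $lang);
```

Whether the plugin itself falls back to English in this way is not stated here, so treat that part as a guess.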

The way the list is displayed can be set from the Options|Similar Posts
page. You can exclude certain categories of post, for example, or
change the code that comes before and after the link.

These general options can be overridden in specific cases by passing a query-style parameter, e.g.:

<?php random_posts('limit=10'); ?>
lists 10 random posts
<?php random_posts('none_text=sorry&show_static=false'); ?>
lists the default number of posts, excluding static pages, and specifies what to display if there are none

If you do not specify an option its value is taken from the options page.
This means you can use the template tag in different ways in different places.
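One way to picture how a query-style parameter overrides the saved options is the sketch below. It uses PHP's built-in parse_str and array_merge. The function name resolve_options and the sample option values are hypothetical, and the plugin may do this differently.

```php
<?php
// Illustrative sketch only; resolve_options is a hypothetical name.

// Merge a query-style string such as 'limit=10&show_static=false'
// over the values saved on the options page.
function resolve_options(string $query, array $saved): array
{
    parse_str($query, $overrides);
    // Assumption: names that are not known options are silently ignored.
    $overrides = array_intersect_key($overrides, $saved);
    // Later arrays win in array_merge, so overrides replace saved values.
    return array_merge($saved, $overrides);
}

// Sample saved values (made up for the example).
$saved = ['limit' => '5', 'show_static' => 'true', 'none_text' => 'None found'];
$opts = resolve_options('none_text=sorry&show_static=false', $saved);
// 'limit' was not specified, so it keeps its saved value of '5'.
```

Note that every value arrives as a string, so something like 'false' would still need converting before it is used as a boolean.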

The full list of parameters is as follows (with the default value in parentheses):

limit
maximum number of posts to show (5)
skip
how many posts to skip before listing (0)
show_static
include static pages (false)
show_private
include password-protected posts (false)
excluded_cats
comma separated list of categories to exclude (by ID) (9999, the default means none)
excluded_authors
comma separated list of authors to exclude (by ID) (9999, the default means none)
none_text
what to show if no posts match–can be plain text or a permalink
before_title
what to show before a link (<li>)
after_title
what to show after a link (</li>)
trim_before
remove the first instance of ‘before_title’ (false)
show_excerpt
include a snippet of the post after the link (false)
excerpt_length
how long an excerpt should be (50 characters)
excerpt_format
‘char’, the default, does nothing, ‘word’ trims the excerpt to the last full word, and ‘sent’ to the full sentence. If the excerpt would be trimmed to nothing no trimming is applied.
ellipsis
add ‘ …’ after the excerpt
before_excerpt
what to show before an excerpt ()
after_excerpt
what to show after an excerpt ()
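The three excerpt_format values in the list above could work roughly as follows. This is a sketch under assumptions: trim_excerpt is a hypothetical name, sentences are taken to end at ‘.’, ‘!’ or ‘?’, and lengths are counted in bytes for simplicity (a real implementation would want multibyte-aware functions for extended character sets).

```php
<?php
// Illustrative sketch only; trim_excerpt is a hypothetical name.

function trim_excerpt(string $text, int $length = 50, string $format = 'char'): string
{
    // Nothing to cut if the whole text already fits.
    if (strlen($text) <= $length) {
        return $text;
    }
    $excerpt = substr($text, 0, $length); // 'char': a plain cut, possibly mid-word
    $trimmed = $excerpt;
    if ($format === 'word' && $text[$length] !== ' ') {
        // The cut fell inside a word, so back up to the last space.
        $lastSpace = strrpos($excerpt, ' ');
        $trimmed = $lastSpace === false ? '' : substr($excerpt, 0, $lastSpace);
    } elseif ($format === 'sent') {
        // Keep everything up to the last sentence-ending punctuation mark.
        $trimmed = preg_match('/^.*[.!?]/s', $excerpt, $m) === 1 ? $m[0] : '';
    }
    // If trimming would leave nothing, no trimming is applied.
    return $trimmed === '' ? $excerpt : $trimmed;
}
```

For example, cutting ‘The quick brown fox jumps.’ at 12 characters gives ‘The quick br’ with ‘char’ and ‘The quick’ with ‘word’.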

Feedback

If you try this plugin leave a comment here to let me know how you get on.

89 replies on “Similar Posts: WordPress Plugin”

  1. Orlando: Thank you. I’m glad it works well for you. Did you find or create a set of common stop words in Portuguese? If so I’d be glad to include them for download.

  2. error, while downloading latest version,
    it has 187 bytes and contains:

    [root@50 plugins]# cat similar.zip

    Fatal error: Call to undefined function: filename() … [snipped]

  3. alex: it should be back in operation again … I had installed a plugin to monitor downloads and it wasn’t working properly. Thanks for the heads up.

  4. hi there
    i am having a bit of trouble figuring out where to put the code in
    i want the plugin to work on the post page and not in the main page, where should i insert the code to get that?

  5. Thanks for the plugin – it also gives me a better list of related/similar posts compared with WordPress Related Entries 2.0.

    Just wondering if you might be able to include options in the admin panel for removing static pages and certain categories from display as Ken Cheung has described in his post Excluding Categories from the Related Posts Plugin?

  6. Upekshapriya: Have a look at the latest version. I have added the exclusions you suggested. Let me know if they work well for you. I found a slight problem with Ken Cheung’s code for excluding categories but I hope I’ve fixed it.

  7. That’s brill. So much easier than fiddling around with the SQL in the code. A great improvement. Thanks very much.

  8. Brilliant. Much more accurate than related posts and better options.
    You saved me, thankyou 🙂
    One thing that would make it perfect though is the ability to have more fields to have apart from just title as a link.
    How about the options to choose from date of posting, author and number of comments?

  9. Thanks Becky. Those additions would certainly be possible. I’m a little concerned about the performance hit they might cause. I’ll look into the possibility.

  10. Thats a good point. I have no idea about coding and tbh installing your plugin was the first time I ever touched the loop. I was too scared before. Now I am not scared of it. lol

    Just thinking about it some more, I think really the only useful info would be the date of posting, anything else can be seen on the post page. But even then all that is really not necessary, its quite perfect as it is 🙂

  11. Jonas: Thanks! I’m always interested to get suggestions for new features. I guess in this case it all depends what you mean by popular. Do you mean most viewed–which would be hard to implement? Or most commented which would be easy? Or something else altogether? Let me know…

  12. The possibility is well on the way to being a reality–it is working on my test site–but I’ve implemented it as part of a major overhaul of how the plugin displays and formats its output so it will be a little while longer before I post it.

  13. How can I change the wrapping in the Admin Panel that will result with the html code below? Thanks.

    
    <li>Post Title <br/>
    Excerpt</li>
    <li>Post Title <br/>
    Excerpt</li>
    ....
    

  14. JH: You can’t right now but you will be able to in the next version which I am working on at the moment.

    You could get a similar effect by using a definition list instead of ordinary lists. Wrap the title in dt tags and the excerpt in dd tags (the options page makes that suggestion) and play with the css if you need to.

    Good luck.

  15. Rob,

    Thanks. I’ll look forward to it.

    Another suggestion is a field between Post Title and Excerpt fields where we can insert template tags like:

    
    <?php the_date(); ?>
    <?php comments_number(); ?>
    ...
    

    Thanks.

  16. On field,

    What to display if no posts can be found ? It could, e.g., be a simple message or a hyperlink to a favourite post.:

    can I add php code?

  17. JH: No, whatever you use will simply be echoed as text. I have no idea what might happen if you have a plugin installed which allows you to run php code.

  18. I was more looking into most popular for most visited. I guess this is not built-into WordPress that it calculates the number of visited? Otherwise I don’t know how to figure out most popular since most commented doesn’t always mean most popular?

  19. Jonas: I was afraid of that! I can’t see a way to access the most visited posts without using some kind of statistics code/plugin to keep track of that extra information. It wouldn’t be hard to do in itself but it would only cover the period after you started counting.

    Maybe that’s the kind of data that makes sense anyway–the most visited posts in a particular time-frame–but it would involve storing timestamp data for every post access. I don’t like the sound of that since I’m already overflowing my database quota…

    Opinions anyone?

  20. Thanks for this great plugin. It has worked very well, and I just love finding such great tools on the web from individuals like yourself. Very helpful!

    I am wondering if there is any way to see a full list of words which were used on any given post to match other posts to it. For instance, if I click on post number 10, and similar posts shows that post 34, 24, and 14 are related to 10, I would like to be able to see what keywords it determined were relevant to post number 10, so I can better weed out the most common keywords. Right now, I am finding the results to be too broad. I’m guessing I need to add more stop words, but I have nothing to go by.

  21. RA: Thanks for your comment. The words that are used to make the match are stored in a custom field called ‘similarterms’. You can view it on the write posts or edit posts page by making the custom fields block visible.

    If you want to add words to the list (or remove them) you can just type them in and then press ‘Update’. Pressing ‘Delete’ will get the plugin to regenerate the default set.

    Let me know if that helps.

  22. RA: For the lowdown you could read the documentation for MySQL full text indexing!

    Basically MySQL scores every post for how well it matches the query terms, something it calls ‘relevance’. The manual says:

    Relevance is computed based on the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.

    Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Consequently, a word that is present in many documents has a lower weight (and may even have a zero weight), because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row.

    I hope that helps!

  23. hi, this is a much more robust plugin than the related post plugin. i have a question though. is it possible to display the dates of the related entries instead of their titles? and how would i go about doing that?

  24. Tin: I’m working on a new version which will give you the ability to customise the output in the way you describe, as well as many others. I hope it won’t be too long before it is ready.

  25. Hey Rob, nice plugin…

    I am having an issue with the latest version though. I am getting a few characters of gibberish before the link to the first similar post. This only happens in the most recent post. It is happening on four different sites. You can see one via the url linked to my name on this comment.

    I was running version 1.10 with no issues and this occurred when I upgraded to the latest 1.13 version. Maybe I messed up something…

    Thanks…

    Mr Papa

  26. yes, it is still there with comments off…

    If I make a new post, it gets fixed on the old post, but the new post now exhibits the problem…

    I have played with a bunch of other options combinations also with no luck…

    Mr Papa

  27. hi rob, i am getting this error:
    Problem creating the full text index for Similar Posts. Please check the instructions on how to create the index manually.
    Warning: Cannot modify header information – headers already sent by (output started at /home/ragmusco/public_html/blog/wp-includes/wp-l10n.php:43) in /home/ragmusco/public_html/blog/wp-includes/pluggable-functions.php on line 269

    Any suggestions? Thanks

  28. Tin: You have had the plugin working OK before and now it is giving an index error? Has the index been deleted or corrupted? Can you check in PHPMyAdmin or somesuch? The spec for the index is in the readme.txt file.

    Have you installed any new plugins in between it working and not working?

    I’ve never seen that kind of error before. Thanks for letting me know and I hope we can track it down quickly.

  29. hi Rob,

    i’ve used your plugin on one wordpress blog just fine this past week. i installed it on a second wordpress blog just recently and got that error. the error showed up when i activated the plugin for the first time on the 2nd blog.

    the PHPmyadmin shows a normal fulltext entry for similarpost. i don’t know if this has anything to do with conflicts with another plugin or not.

  30. Update: Version 1.14 is an interim fix of two bugs. In one, some users were getting spurious characters showing up before the post listing. The other bug meant that the Post plugins, which were meant to work together, in fact caused each other to fail. D’oh!

Comments are closed.