Similar Posts

Caveat

Unfortunately, due to ill-health, this plugin has not been developed or supported properly for some years. It works with the latest versions of WordPress (including on this website) but could possibly conflict with any WordPress features added after 2008 — e.g. custom post types — if you use them.

Purpose

This plugin displays a list of posts which are related or similar to the current post.

This is version 2.6.2.0 (download latest version). It is compatible with WordPress 1.5–2.6.2.

  • 2.6.2.0 fixes a problem with stemming and stop words and offers a new fuzzy matching capability; supplies a .pot file making internationalisation possible; introduces the {imagealt} output tag and allows {excerpt} to output whole sentences; the content filter and the widget can now take a parameter string; output can be automatically placed after post content without editing theme files.
  • 2.6.1.2 fixes the German-language stemmer which should have been encoded as utf8.
  • 2.6.1.1 fixes the Italian-language stemmer which was crashing under PHP4.
  • 2.6.1.0 allows the current post to be marked manually where the automatic mechanism fails; when used as a widget the plugin now honours the setting to show nothing when there is no output; {commenterlink} now applies the appropriate WordPress filter; and fixed a problem with some installations not finding the right language files.
  • 2.6.0.1 fixes the option to include attachments and adds a parameter to the {imagesrc} output tag to append a suffix to the image name.
  • 2.5.0.11 has a new option to include posts which are attachments; a new output template tag {authorurl} which points to the archive of the author’s posts; new behaviour for the {php} output tag which can now accept other output tags in the code; and includes a fix for MySQL problems in some locales.
  • 2.5.0.10 provides the ability to select from two algorithms for term extraction; allows you to specify post relationships by hand; and fixes a problem indexing tags in some languages.
  • 2.5.0.9 adds an option to match the current post’s author and extends the options for snippet and excerpt output tags to make the ‘more’ text into a link.
  • 2.5.0.8 adds an option to show posts by status, i.e., published/private/draft/future, changes the {categorynames} and {categorylinks} output tags by applying the ‘single_cat_name’ filter, and fixes a bug in WordPress pre-2.2 that stopped installation code running on Windows servers.
  • 2.5.0 improves the CJK matching algorithm by using bigrams. Also introduces a new output tag {imagesrc}, and adds more parameters to {image}. Fixes bugs with empty categories, excluded posts, and the option to omit current posts.
  • 2.5b28 improves the matching algorithm and adds an experimental mode for blogs in Chinese, Korean, or Japanese.
  • 2.5b27 fixes a bug with the bulk indexing of tags.
  • 2.5b25 makes some important changes: the {image} output tag now serves real thumbnails (couple of bug fixes too); output can now be sorted as you choose with sub-headings included; the {date:raw} tag modifier has been added to help the sorting; the ‘trim_before’ option has been replaced with the more logical ‘divider’.
  • 2.5b24 has fixes to stop recursive replacement by the content filter, to {gravatar} to allow for ‘identicon’ etc., to {commenter} to allow trimming, and to remove a warning in safe mode.
  • 2.5b23 brings a new option to filter on custom fields and adds proper nesting of braces in {if}.
  • 2.5b22 moves the manage menu under settings as a subpage, restores automatic indexing on activation, fixes conflicts with the legacy Similar Posts Feed plugin, fixes bugs in several output tags, and introduces the option to show only pages.
  • 2.5b20 doubles the speed of indexing and reduces the memory footprint considerably.
  • 2.5b19 fixes a bug when snippets are stripped of extra tags.
  • 2.5b18 fixes a problem with filtering the output and introduces the conditional tag {if:condition:yes:no}.
  • 2.5b16 fixes a problem with {php}.
  • 2.5b15 fixes some more installation problems and one or two bugs, and adds the ‘included posts’ setting.
  • 2.5b14 fixes some kinds of installation problems.
  • 2.5b11 fixes some widget problems.
  • 2.5b10 fixes (some?) of the problems folks have been having with no posts found. Most of such errors seem to arise when the proper table is not created and this version addresses that.
  • 2.5b9 has new features and improvements.
  • 2.3.6 restores the widgetiness I managed to remove in 2.3.5!
  • 2.3.5 has been rebuilt to save memory and can match the current post’s tags. It also fixes a bug with categories in WordPress < 2.3.
  • 2.3.4 now works as a widget.
  • 2.3.3 beta adds the ability to include as well as exclude categories and authors and is able to find posts by tag.
  • 2.3.2 beta fixes a conflict between tags and categories.
  • 2.3.1 beta fixes a stupid bug in category exclusion.
  • 2.3.0 beta is compatible with WP 2.3, fixes the {author} bug, and a number of problems related to versions of MySQL.
  • 2.1.1 beta fixes a badly chosen fallback value for the number of terms used to match similar posts.

Ideally, similarity or relatedness would be based on a post’s meaning. Tagging systems try to add meaning after the fact but suffer from two deficiencies, one practical and the other theoretical. When a blog already has many posts it can be impractical to retrofit a tagging system by tagging every post by hand. ‘Automatic’ services, like Yahoo’s, tend to produce too many suggestions which need to be culled, again by hand.

The theoretical problem with tagging is that it tries to pin down a meaning for a post by categorising it under a small number of types, whether those types belong to a predetermined hierarchy or arise by ‘folk’ classification. In fact, a post has a variety of meanings, a multitude of ways it can be related to other posts. Meaning doesn’t just lie in the intention of the author or in the classification of the reader; meaning also inhabits the text itself. Meaning is in the words.

The Similar Posts plugin compares posts by comparing their words. MySQL has a sophisticated full-text searching facility with a carefully tuned algorithm for judging the similarity between texts. Similar Posts extracts representative words from a post’s content, title, and tags and uses the full-text index to find the best matches between posts. This simple approach gives surprisingly good results.
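The kind of full-text lookup described above can be sketched roughly as follows. This is only an illustration, not the plugin's actual query: the column names `pID` and `content` are assumptions, and the function `build_similarity_query` is hypothetical. It returns a SQL string with `?` placeholders plus the values to bind, so it can be handed to any prepared-statement API.

```php
<?php
// A minimal sketch, assuming a table with a FULLTEXT index on a `content`
// column and a post ID column called `pID` (both names are assumptions).
function build_similarity_query(array $terms, int $currentPostId, int $limit = 5): array
{
    // MATCH ... AGAINST yields a relevance score; using it in WHERE as well
    // keeps only rows that actually match at least one term.
    $sql = 'SELECT pID, MATCH (content) AGAINST (?) AS score '
         . 'FROM wp_similar_posts '
         . 'WHERE pID <> ? AND MATCH (content) AGAINST (?) '
         . 'ORDER BY score DESC LIMIT ' . max(1, $limit);

    // The representative words are passed as one space-separated string.
    $against = implode(' ', $terms);

    return [$sql, [$against, $currentPostId, $against]];
}
```

The current post is excluded by ID so that a post is never listed as similar to itself.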

The results can be tweaked in several ways to tailor them for your blog. By default the plugin chooses the 20 most frequent words to make its matches but the number is adjustable. It is worth experimenting to see how many words give the best results for your blog; the number has hardly any impact on speed, even if you set the value high enough to include the whole post. The relative importance given to words in your title may be adjusted so that well-chosen titles can be used to advantage or titles with little relevance downplayed. Similarly, tags can be used to improve matching, or not, according to your blog and its needs.
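To make the idea of "most frequent words" and title weighting concrete, here is a rough sketch. It is not the plugin's own code: the function `extract_terms`, its parameters, and the way the title weight is applied (each title word simply counts several times) are all assumptions made for illustration.

```php
<?php
// A minimal sketch of frequency-based term extraction (hypothetical helper).
function extract_terms(
    string $title,
    string $content,
    array $stopWords,
    int $numTerms = 20,
    int $titleWeight = 2
): array {
    // Split on anything that is not a letter or digit; strtolower() only
    // folds ASCII letters, which is enough for this illustration.
    $tokenize = function (string $text): array {
        $words = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        return $words === false ? [] : $words;
    };

    $stop = array_flip($stopWords);
    $counts = [];

    foreach ($tokenize($content) as $word) {
        if (!isset($stop[$word])) {
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    // Each title word counts $titleWeight times (assumed weighting scheme).
    foreach ($tokenize($title) as $word) {
        if (!isset($stop[$word])) {
            $counts[$word] = ($counts[$word] ?? 0) + $titleWeight;
        }
    }

    arsort($counts); // highest counts first
    // Numeric-looking words become integer keys, so cast back to strings.
    return array_map('strval', array_slice(array_keys($counts), 0, $numTerms));
}
```

Setting the weight to 0 would ignore the title altogether, while a large value lets a well-chosen title dominate the match.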

It is also possible to override the automatic similarity ranking by using a custom field. In the post edit screen create a custom field called ‘sp_similar’ with the ID value of the post to which you wish to ‘link’. You can link to multiple posts by entering a comma-delimited list of IDs.
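As an illustration of how a comma-delimited ‘sp_similar’ value might be turned into a clean list of post IDs, here is a small sketch. The function `parse_similar_ids` is hypothetical and is not taken from the plugin; in WordPress the raw value could be read with something like get_post_meta($post_id, 'sp_similar', true).

```php
<?php
// A minimal sketch (hypothetical helper): parse "12, 7,abc,,7" into [12, 7].
function parse_similar_ids(string $fieldValue): array
{
    $ids = [];
    foreach (explode(',', $fieldValue) as $part) {
        $part = trim($part);
        // Keep only positive whole numbers; anything else is ignored.
        if (preg_match('/^\d+$/', $part) && (int) $part > 0) {
            $ids[(int) $part] = true; // keyed by ID to drop duplicates
        }
    }
    return array_keys($ids);
}
```

Ignoring bad entries rather than failing means a stray space or typo in the custom field does not break the whole list.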

The plugin has a settings page which lets you change how the output is generated and displayed. There is also a management page where you can change settings which affect the index.

Note: Similar Posts needs to know the ID of the post for which it is generating related posts. WordPress keeps track of that information in a global variable but unfortunately some other plugins can corrupt the data before Similar Posts gets a chance to use it. Similar Posts tries various tricks to get round this but sometimes it fails. The usual symptom is a list of similar posts that stays the same from page to page. You can help Similar Posts out by marking the current post manually by adding a line to your theme files. Find the place where the_content(); is used to display the current post and right after it put similar_posts_mark_current();.

Installation Instructions

  1. If upgrading from a previous version, first deactivate the plugin via the Plugins page and delete the plugin folder from your server.
  2. If you have been using the Similar Posts Feed plugin you should deactivate it as it is now obsolete.
  3. Upload the plugin folder to your /wp-content/plugins/ directory. You will also need to install the Post-Plugin Library.
  4. Go to your admin Plugins page and activate Similar Posts. This will automatically add a new table to enable fast, flexible full-text matching. If the plugin reports that there was a problem creating the table first try deactivating and reactivating the plugin.
  5. Put <?php similar_posts(); ?> at the place in your theme files where you want the list of similar posts to appear. Lorelle on WordPress has a good guide to modifying themes for plugins. If you are averse to editing template files you can also place the post listing automatically either as a widget in the sidebar of your widget-aware theme or after each post (from the plugin’s Placement submenu).
  6. Use the admin Settings|Similar Posts pages to set all the available options. Alternatively, the options can be overridden by passing a parameter to the similar_posts template tag.

Usage and Options

The configuration page will help you to set up the plugin to your satisfaction.

The Index Management Page

Using this settings subpage you can re-index your blog. There are two main settings which affect the indexing.

PHP is, by default, not very good at handling text that isn’t in English and you might find Similar Posts mangles extended characters. If so, you can get the plugin to use the PHP multi-byte string library (mbstring) if it is available.

The second setting attempts to handle words with related meanings. For example, ‘animal’ and ‘animals’ should probably not count as two distinct words, nor ‘follow’, ‘follows’, ‘following’, etc. You can choose to build the index using a stemming algorithm that groups such words as one (if there is one available for your language) or you can try the fuzzy matching algorithm. Whether it is better to be strict or to be relaxed will depend on your website.
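To show what grouping related word forms means in practice, here is a deliberately crude sketch. It is not one of the stemmers shipped with the plugin and it is far simpler than a real stemming algorithm; the function `naive_stem` is hypothetical and only strips a couple of English suffixes.

```php
<?php
// A deliberately naive suffix stripper, for illustration only.
// Real stemmers handle many more cases and exceptions.
function naive_stem(string $word): string
{
    $word = strtolower($word);
    foreach (['ing', 's'] as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Only strip when at least three letters would remain.
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}
```

With this, ‘animal’ and ‘animals’ index as the same term, as do ‘follow’, ‘follows’, and ‘following’. The same crudeness produces bad stems for words like ‘glass’, which is one reason a stricter or more relaxed approach suits different sites.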

A third setting is for blogs written mainly in Chinese, Korean, or Japanese. The MySQL fulltext index used by Similar Posts has problems with these languages but this setting tries several ways to work around the issues. The setting currently only works when posts are encoded as UTF-8. I would be very glad to get opinions from users familiar with these languages.
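One common workaround for text without spaces between words is to index overlapping pairs of characters (bigrams), which is the approach the 2.5.0 release notes mention. The sketch below shows the general idea only; the function `cjk_bigrams` is hypothetical and the plugin's own handling may differ.

```php
<?php
// A minimal sketch: split UTF-8 text into overlapping two-character terms.
function cjk_bigrams(string $text): array
{
    // The /u flag makes preg_split work on characters, not bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // input was not valid UTF-8
    }

    $count = count($chars);
    if ($count < 2) {
        return $chars; // nothing to pair up
    }

    $bigrams = [];
    for ($i = 0; $i < $count - 1; $i++) {
        $bigrams[] = $chars[$i] . $chars[$i + 1];
    }
    return $bigrams;
}
```

Each bigram can then be treated like a ‘word’ by the full-text index. The reliance on the /u flag also shows why such a scheme depends on posts being encoded as UTF-8.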

To avoid excessive memory use the indexing routine processes posts in batches of 100. This figure can be reduced to shrink the memory consumption even further.
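The batching idea can be sketched as below. This is an illustration under assumptions rather than the plugin's routine: `index_in_batches` and its callback are hypothetical, and a real indexer would more likely fetch each batch from the database than hold every ID in an array.

```php
<?php
// A minimal sketch: process post IDs in fixed-size batches so that only one
// batch of posts needs to be held in memory at a time.
function index_in_batches(array $postIds, callable $indexBatch, int $batchSize = 100): int
{
    $batches = 0;
    foreach (array_chunk($postIds, max(1, $batchSize)) as $batch) {
        $indexBatch($batch); // caller-supplied work for this batch
        $batches++;
    }
    return $batches; // number of batches processed
}
```

A smaller batch size means more round trips but a lower peak memory use, which is the trade-off the setting exposes.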

Language Issues

The underlying MySQL full-text indexing is obviously very locale-dependent: how words are divided or punctuation handled, what words are treated as noise, etc. all vary from language to language. For the Similar Posts plugin to work well the version of MySQL on your server must be properly set up in the appropriate language.

Similar Posts generates the terms it matches on by analysing the word frequency of a post while ignoring the most common ‘noise’ words — in English, words like ‘of’, ‘and’, ‘across’, ‘someone’, etc. It uses a so-called ‘stop list’ of common English words to ignore. In fact it uses the stop list a standard English installation of MySQL uses. Obviously this list will be useless for other languages so Similar Posts makes the stop list pluggable.

The Similar_Posts folder contains a subfolder, ‘languages’, with stop lists and stemmers for German, English, French, Spanish, and Italian. The plugin checks the WPLANG constant (defined in wp-config.php) to see which language WordPress is using and looks for a file on that basis. If WPLANG is undefined or the appropriate file cannot be found the default English list is used.
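The lookup with an English fallback might be sketched like this. The directory layout and file name used here (‘languages/de/stopwords.php’ and so on) are assumptions for the sake of the example, as is the function `stop_list_path`; check the plugin's ‘languages’ folder for the real names.

```php
<?php
// A minimal sketch of choosing a stop-list file from a WPLANG-style value,
// falling back to English. The file layout is an assumption.
function stop_list_path(string $languagesDir, string $wplang): string
{
    // 'de_DE' -> 'de'; an empty value stays empty.
    $lang = strtolower(substr($wplang, 0, 2));
    $candidate = $languagesDir . '/' . $lang . '/stopwords.php';

    // Only accept a plain two-letter code whose file actually exists.
    if (preg_match('/^[a-z]{2}$/', $lang) && is_file($candidate)) {
        return $candidate;
    }
    return $languagesDir . '/en/stopwords.php';
}
```

In WordPress the second argument would typically be something like defined('WPLANG') ? WPLANG : '', so an undefined constant falls through to the English list.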

If you are looking for help setting up a stop list in a language other than English a good resource can be found at http://www.ranks.nl/stopwords. Stemmers in PHP are harder to come by. You can work out how to adapt any you find by inspecting the provided stemming files.

414 replies on “Similar Posts”

  1. Hi, thank you so much!!! You are my hero. The code works just fine. It’s funny but suddenly my list of 21 posts appears but I lost one post from a list of five. (I don’t know if it ever was there, I just noticed it). But I have been adding the tags once more. I will check out the other plugs that you suggest. The output looks really nice now. I’m linking tasting notes/score to a main article. So I hope the plugs you suggest are as refined as your plug. 😉

  2. Helena: I’m glad it’s working properly at last. Random Posts and Recent Posts are also my plugins and have the same set of options so you could just slot them in instead.

  3. Also the German stemmer.php is not UTF-8 encoded and is causing a crash when saving/autosaving a post. Converting it from WIN-ANSI to UTF8 will fix this.

  4. After a bit of fiddling I have managed to get similar posts template tag working after modifying the loop and creating some basic CSS.

    Just to let you know that previous to this plugin I was using the equivalent function available through the Simple Tags plugin and wanted to let you know that what ever algorithm you have implemented to select similar posts – it’s absolutely magic!!

    The similarity of posts now displayed since moving to your plugin is unbelievable.

    Cheers,

    R

  5. Robert@PNG: I’m very glad to hear that Similar Posts is up and working well for you!

    After replying to your comment yesterday I found myself wading into creating the code necessary to do automatic placement of similar post lists. Easy at first … and then I got into refactoring code to make it shareable between the four post plugins … and fixing interactions with the in-post placement code … and making sure that excerpts and snippets don’t generate infinite loops as they append themselves to themselves… A lot of fun! I think the result will be a big improvement on several fronts … once I get the other half of the code sorted out.

    So, thank you (I think!).

  6. meehawl: That is indeed odd. Looking at your page’s source I wonder if it is because the Similar Posts listing comes before the post loop? If it does it might account for the strange behaviour: the plugin has to know which post is being displayed so that it can look for something related.

    I may be wrong — your theme may be doing something clever — but you would have to change the order of the code so that the similar posts plugin comes after the ‘loop’.

    Make sense?

  7. Thanks, that was it. When I move the Similar Posts widget to sidebar “2”, executing on the rhs of the page it works, presumably ensuring that it executes after the loop.

  8. Hi, first i want to say thank you for this plugin. It’s easy to use. Can i use this on 404 page? I mean, if someone looking for something and land on the 404 page, they will found related post with what they looking for. How/what code i need to put on my 404.php. Hope you understand my question:)(sorry for my English).

  9. Phoenix: You could use Similar Posts to do this kind of thing but so far I’ve advised people requesting it that it is probably better to use search rather than relatedness. The reason is that the similarity algorithm is designed to use (relatively) large blocks of text from a post, title etc. to make the match but on a 404 page you would only have a few words to play with and a search would, I believe, give better results.

    However, I get lots of requests for this so maybe I should actually try it out and see what happens…

  10. Sorry im blonde and confused. Which file do i have to edit so it shows on my main page and a single post? im rubbish at html im only used to installing easy plugins but the ‘yarp’ plugin didnt work for my version.

  11. Hayley: Can you wait for a day or two? The next release of Similar Posts will have automatic placement.

    Otherwise you will need to edit two files in your theme. If you want to have a list right after the single post’s content open single.php and add the following line after the line where ‘the_content’ is output:

    
    <?php if ( function_exists('similar_posts') ) similar_posts(); ?>
    

    You can do a similar thing with the index.php file to display the list after each post on the main page.

  12. PLEEEASE HELP ME! What is wrong with the following code:

    {php:$thumb=get_post_meta($result->ID, \'image\', true);if ($thumb)echo img src=\"http://www.stereopoly.de/wp-content/themes/freshnews/thumb.php?src=$thumb&h=60&w=60&zc=1&q=80\" alt=\"test\" align=\"right\";}

  13. Axel: Nice site!

    It is hard to tell in the abstract what is wrong — there are a couple of things that might be wrong. Does the template produce any output? If so what?

    First I don’t think you should need to escape the quotes. Second I think the string being echoed should be quoted. Third there seem to be missing angle brackets (though they may be just not shown here).

    My usual tactic when forming {php} output template tags is to make sure the PHP runs correctly if inserted directly into the template file. You may have to fill in certain values (like $result->ID) but it helps catch any errors.

  14. I know previous versions of this and the Post-Plugin Library had problems using the automatic upgrade included with WordPress. Has this been fixed yet and if not when can we expect that to happen?

    I don’t mind doing upgrades the old fashioned way but being able to do it automatically for all my other plugins but not this is a major pain.

  15. Hi Rob,

    I’m trying to use Similar Posts to display images that I’ve entered in the Excerpt field of the Write Post page.

    However, using the {excerpt} tag when configuring the Output options just generates an empty list in my similar post section.

    If I change the tag to some text, the output is displayed as expected.

    Does this mean that I simply can’t use {excerpt} to display images in the Excerpt field or is there some work around I can use?

    Cheers.

  16. Ian: The {excerpt} output tag has a number of options. The first version I wrote a long time ago didn’t properly pay attention to pre-existing excerpts but if you use the ‘b’ parameter a rather better algorithm does, e.g.,

    {excerpt:30:b}

    or, if you are happy with the default number of words, just use,

    {excerpt::b}

    I haven’t tried this with images in the excerpt but can’t see why they shouldn’t work. Let me know.

  17. Hi Rob,

    That’s brilliant!

    Using the default number of words option:

    {excerpt::b}

    has worked perfectly with images.

    Thanks so much for this Rob, you make my (WordPress) life so much easier 🙂

  18. I’ve got a problem with Similar Posts: it overrides the settings of another plugin “All in one SEO Pack”. All In One SEO Pack rewrites the title of each posts like “post title” | “blog title” but since the latest version of Similar Post it only shows “Blog title” everywere as the page title. Any clues how I can resolve this?

  19. Ok, I’ll do my best to explain myself. All In One has a feature to optimize pages by rewriting the titles of the pages ( the tags) and converting the titles to | . Well, without Similar posts activated, that works perfectly. But when Similar posts is active, all the pages, all the posts pages of my blog have as title. This bug is new, as it didn’t happen with other versions of SP.

  20. dazz: What I was trying to say was that All in One SEO is working properly alongside Similar Posts on my test site. There must be some difference between your setup and mine and that difference we need to find out in order to understand your problem.

  21. I really enjoy the plugin, I just installed it last night, and it is working great on my site http://www.kniivila.net. However, there is a small issue that seems to be related to the plugin, as I can’t recall this happening (much) before I installed the plugin, but now it is happening all the time: when creating or editing a new post, I get a totally blank page after pressing the “save” or “publish” buttons. The post is saved fine, however. Any idea why this is happening? I’m just guessing, but maybe the data base is responding too slowly to some request the plugin is making when saving the post?

  22. Kalle Kniivilä: That’s an odd one! I haven’t heard of this issue before. One thing you could check is whether increasing the memory limit for PHP makes a difference. WordPress 2.5+ tries to increase the memory to 32M but you can increase that by putting

    define('WP_MEMORY_LIMIT', '64M');

    into your wp-config.php file. Or you could do it in a php.ini file.

    Also, if you have access to your PHP error logs that may shed some light on what is happening with the blank page.

  23. I added the memory limit code at the end of my wp-config.php file, but that didn’t seem to change anything. Is that the right place for it? I think it was parsed, anyway, because first I happened to place it after the PHP end code, and the line turned up at the top of my page. 🙂

    I had a look at the PHPMyAdmin interface, but really couldn’t make out where I should look for the error log.

  24. I have an error log file for the blank page, it includes items like this:

    Notice: Undefined index: wp_cron_daily in /customers/kniivila.net/kniivila.net/httpd.www/wp-includes/plugin.php on line 583

    and a lot of

    Notice: Undefined property: stdClass::$args in /customers/kniivila.net/kniivila.net/httpd.www/wp-includes/taxonomy.php on line 1066

    and then

    Warning: Cannot modify header information – headers already sent by (output started at /customers/kniivila.net/kniivila.net/httpd.www/wp-includes/plugin.php:583) in /customers/kniivila.net/kniivila.net/httpd.www/wp-includes/pluggable.php on line 689

    Any idea what this is? I can’t see anything directly related to your plugin, but everything worked OK before I installed it. And last night, when I uninstalled your plugin and then reinstalled it, everything worked fine. But when I posted a new text today, I got the blank page thing again…

  25. You asked in the post 276. Here they are.

    # Russian stopwords, charset koi8-r
    #Charset: koi8-r
    #Language: ru

    а
    без
    более
    бы
    был
    была
    были
    было
    быть
    в
    вам
    вас
    весь
    во
    вот
    все
    всего
    всех
    вы
    где
    да
    даже
    для
    до
    его
    ее
    если
    есть
    еще
    же
    за
    здесь
    и
    из
    или
    им
    их
    к
    как
    ко
    когда
    кто
    ли
    либо
    мне
    может
    мы
    на
    надо
    наш
    не
    него
    нее
    нет
    ни
    них
    но
    ну
    о
    об
    однако
    он
    она
    они
    оно
    от
    очень
    по
    под
    при
    с
    со
    так
    также
    такой
    там
    те
    тем
    то
    того
    тоже
    той
    только
    том
    ты
    у
    уже
    хотя
    чего
    чей
    чем
    что
    чтобы
    чье
    чья
    эта
    эти
    это
    я

  26. Daz – What you could possibly do is edit your theme’s header.php file and manually add your site’s name in the title, i.e. something like:

    <?php wp_title(); ?> | MySite.com

    Then just omit that from your AIOSEO settings.

    Rob – Firstly thanks for the great plugin, it’s the best related post plugin out there.

    What i’m wondering, is there anyway to optimize it for very large sites by creating other indexes in the database or something similar?

    In the database i have

    Data 161.2MB
    Index 323.5MB
    Total 485.6MB

    So the site is pushing almost half a Gig in the wp_similar_posts table. So any tips on optimizing things for such data?

    Thanks Rob.

  27. Hello!

    I’m having an HTML validation problem that may be related to Similar Posts. When I enable “Output after post”, a single, extra closing paragraph tag appears in the source after the “Similar Posts took…” comment, preventing validation.

    Is anyone else seeing this?

    Thanks for the great plugins,
    Demetris

  28. Have now the ‘none found’ message for the last 2 days.

    I didn’t change anything, no new plugins, no settings changed. Just published another post as the only activity during the last week.

    Now I have the ‘None Found’ Message again (see right sidebar of my site).

    Tried to rebuild the index, ease the setting restrictions, but no avail.

    Unfortunately I don’t know where to put in the table command (php SQL), but the table should be there, as it worked before.

    Latest Version 2.6.2.0 and 2.6.2.1 for the Post Plugin Library.

    Great plugin, so far no problems, but now.

    What else can I try? Thanks for your help!

    Chris

  29. Is there a way to make this display horizontal instead of vertical? I am using all thumbnails and want them sideways. It seems that by default the first related post in the list is given the UL value so it screws up. Is there a way to get around this? Also my “add a comment” box is bleeding, getting pulled up onto the same line as the list. Thanks

  30. Hi, thanks for your plugin Similar Posts. It’s very useful. I have the following question about improving the plugin while matching categories:

    I use Similar Posts in my WordPress blog, where all posts are getting multiple categories. Let’s name the categories for example “A”, “B”, “C” and “X”.

    Now I want the plugin to list similar posts matching any of the categories “A”, “B” or “C”. The category “X” should be ignored. That means: I don’t want to exclude the posts only because they are in category “X”. I just want the plugin to ignore the category “X”.

    This would give for example the following results:

    Take two posts. The first one gets the categories “A + X”, the second one “B + X”. With this combination the plugin should identify the two posts are not similar.

    Now take two other posts. The first one gets the categories “A + B + X” and the second post “A + C + X”. Given this combination I wish the plugin to identify both posts are similar, despite category “X” is excluded in the plugin.

    My question: Is it possible to ignore certain categories in this way? What do I have to do, to set up the plugin?

    Sorry for my English. With kind regards,
    Rudiger
