Blog comment spam is a big problem. I started out with no protection on this blog. I found out rather quickly that wouldn’t do.
So I implemented a CAPTCHA, which requires you to enter a random code, to prevent automated comment submissions. Then I found out that actual humans were submitting spam comments (or perhaps very smart anti-CAPTCHA programs were).
My next step was disabling auto-approval unless you already had one approved comment. But that only left me clearing out spam every day, and it discouraged real-time discussion because comments wouldn’t show up immediately.
After a long time, I finally found Akismet.
Akismet is anti-spam done right.
Comments are automatically submitted to a blog anti-spam service where, first, each one is run through hundreds of tests to see if it’s spam.
The second part is the key, though. Because Akismet is a service anyone can use, thousands if not millions of people use it for the same reason, and this is where its power lies. The second part of the anti-spam check compares the comment with the comments received by the millions of other blogs that also use the service. More than likely, somebody has already gotten your comment, or one like it, and marked it as spam. So when it gets to you, it’s already considered spam and isn’t published. You can decide what to do with it in your admin interface.
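To make the pooling idea concrete, here is a minimal sketch of a shared registry of spam reports. It is not how Akismet actually works internally (I don’t know its internals); the names `fingerprint` and `SharedSpamRegistry` are made up for illustration, and the in-memory set stands in for what would really be a remote service shared by many blogs.

```python
import hashlib
import re


def fingerprint(comment):
    """Hash a comment after lowercasing it and collapsing punctuation and
    whitespace, so trivially altered copies map to the same value."""
    normalized = re.sub(r"[^a-z0-9]+", " ", comment.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SharedSpamRegistry:
    """Hypothetical stand-in for a service that many blogs report to."""

    def __init__(self):
        self._reported = set()

    def mark_spam(self, comment):
        # One blog owner flags a comment; every other blog benefits.
        self._reported.add(fingerprint(comment))

    def is_known_spam(self, comment):
        return fingerprint(comment) in self._reported


registry = SharedSpamRegistry()

# Blog A's owner marks a comment as spam.
registry.mark_spam("Buy CHEAP watches now!!!")

# Blog B later receives a near-identical comment and an ordinary one.
print(registry.is_known_spam("buy cheap watches now"))            # True
print(registry.is_known_spam("Great post, thanks for sharing."))  # False
```

An exact hash of normalized text only catches near-identical copies. Matching “one like it” in any serious way would need fuzzier comparison, which this sketch leaves out.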
Google Mail also does anti-spam right. It operates on the same principle as Akismet (which was probably inspired by Gmail in the first place). Basically, tests are run on the sender of your email and on the email itself, and then the email is compared with the billions of other emails that other Google Mail users get. If it looks like spam based on any of these checks, it goes in your spam folder.
This is the beauty of distributed effort.
When so many people are pooling their efforts into one system, you really can make spammers largely ineffective – to the point that it’s no longer worth it for them to spam.
What we need is a distributed system for anti-spam checking at the SMTP level for regular system admins. Imagine the entire world pooling into this system. I have yet to try the Distributed Checksum Clearinghouse (DCC). It does something like what I would like, but not quite: it’s not exactly like Gmail’s or Akismet’s mechanisms for tagging spam.
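As a rough illustration of what pooling at the mail-server level could look like, here is a sketch in which each server reports a checksum of every message it receives, and a message seen by enough different servers is treated as bulk. This is my own toy model, not DCC’s actual protocol; `ChecksumPool`, `BULK_THRESHOLD`, and the server names are all assumptions for the example.

```python
import hashlib
from collections import defaultdict

BULK_THRESHOLD = 3  # assumed cutoff; a real deployment would tune this


class ChecksumPool:
    """Hypothetical shared counter of which servers saw which message."""

    def __init__(self):
        self._servers_by_checksum = defaultdict(set)

    @staticmethod
    def _checksum(body):
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def report(self, server_id, body):
        """Record that server_id received this body; return how many
        distinct servers have reported it so far."""
        servers = self._servers_by_checksum[self._checksum(body)]
        servers.add(server_id)
        return len(servers)

    def looks_bulk(self, body):
        servers = self._servers_by_checksum.get(self._checksum(body), ())
        return len(servers) >= BULK_THRESHOLD


pool = ChecksumPool()
for server in ("mx1.example.org", "mx2.example.net", "mx3.example.com"):
    pool.report(server, "You have won a prize")

print(pool.looks_bulk("You have won a prize"))  # True
print(pool.looks_bulk("Lunch on Friday?"))      # False
```

Counting distinct servers rather than raw reports keeps a single server from inflating a count on its own. Note that “bulk” is not the same as “spam”: a legitimate mailing list would trip this check too, so a real system would need some way to exempt wanted bulk mail.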
It should be clear: In a world of anti-spam done right, spam largely goes away.