Should search engines be able to datamine Akismet for obvious anti-webspam filtering?
For those who don’t know, Akismet is an anti-spam plugin, originally for WordPress, that reportedly filters out something like 99.9% of spam before it even reaches the WordPress admin panel.
The spam in question is automated form spam, which attempts to publish links to the spammers’ sites.
Now, I’m finding automated form spam becoming an increasing problem - it’s not simply hitting Wordpress, but anything with a POST element. And that includes any type of contact form, guestbook, forum, blog, and similar - regardless of platform.
This is primarily powered by affiliate marketers in the world of Porn, Pills, and Casinos, and I’ve seen the majority of form spam attacking my own sites coming from Eastern Europe and Russia.
The aim of the automated form spammers is to increase their search engine rankings - by getting sites to unwittingly publish links which search engines may then use to improve a site’s ranking on Google, Yahoo, MSN, etc.
So - what if search engines were able to datamine the Akismet data?
That would immediately allow them to better identify which websites are actively involved in form spamming, and automatically ban them.
But what about false positives?
For starters, some automated form spammers will publish URLs to established sites - especially search engines themselves.
And sometimes sites are simply unfortunate enough to trigger a false positive through no fault of their own.
Additionally, some form spam links to third-party pages that the spammers have already spammed.
However, search engines could account for this by referencing links against a basic authority scoring - simply put: established sites are much less likely to be automated form spamming than brand new affiliate marketing sites.
So established sites can be filtered out, and anything else that fails the scoring can be removed from the search index.
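As a rough illustration of that filtering step, here is a minimal sketch. Everything in it is an assumption made for the example: the function name, the `min_reports` and `authority_cutoff` thresholds, and the idea that a search engine would have both a list of URLs pulled from caught spam and a 0-to-1 authority score per host. It is not how Akismet or any search engine actually works.

```python
from collections import Counter
from urllib.parse import urlparse


def domains_to_penalise(spam_urls, authority, min_reports=3, authority_cutoff=0.5):
    """Return hosts that appear often in caught spam and lack authority.

    spam_urls: iterable of URLs found in spam caught by a filter (hypothetical feed).
    authority: dict mapping host -> score between 0 and 1 (hypothetical scoring).
    """
    # Count how many times each host was linked from caught spam.
    counts = Counter()
    for url in spam_urls:
        host = urlparse(url).hostname
        if host:
            counts[host] += 1

    flagged = []
    for host, reports in counts.items():
        # A single stray report is treated as a likely false positive.
        if reports < min_reports:
            continue
        # Established sites (search engines, big portals) are spared.
        if authority.get(host, 0.0) >= authority_cutoff:
            continue
        flagged.append(host)
    return sorted(flagged)
```

With this sketch, a brand new pills site linked from dozens of caught comments would be flagged, while a search engine URL dropped into the same spam would pass on its authority score, and a site reported only once would fall under the report threshold.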
Now, here’s the interesting part:
If search engines were to make it publicly known, as a matter of policy, that they datamine spam filters such as Akismet for offenders to penalise, then surely this would kneecap the entire automated form spam industry?
Resulting in less spam for site owners, less wasted bandwidth for carriers, and - of course - less hardcore spam in search engine results?
Isn’t this a win-win-win situation for everybody - except the spammy affiliates?
Of course, the one Achilles heel is setting a threshold that ensures genuine sites are not easily removed.
But with search engines compiling so much data profiling websites in the first place, shouldn’t it at least be possible to vastly reduce the potential for abuse, and ensure false positives were exceedingly rare?