KB Plugins blog

The best WordPress plugins are free

New plugin: KB Spam Blacklist

As if there weren’t enough anti-spam plugins, right? Actually, this isn’t a traditional anti-spam plugin. This is a regular-expression based blacklist plugin. And by blacklist, I mean blacklist. If a comment matches one of your regexes, it gets deleted immediately, not sent to moderation.

Why use this plugin?

So why would you want this? I don’t know about you, but 90% of what Akismet catches for me is blatant spam: obscenities, BB code ([url...]), “payday loan” offers, and the like. I don’t want this stuff in my spam queue–I want it shot on sight. That way, the spam queue only holds comments that might actually be genuine but were miscategorized as spam.

How it works

In short, this plugin takes some of the load off of Akismet by looking for really obvious stuff. It’s easy to use. Just activate the plugin–that’s all. It comes with four regular expressions. You can add to, modify, or remove these if you want. This is the default blacklist:

<?php $kb_spamBlacklist = array(
	// First, let's check for [url]...[/url] markup, a sure sign of a spammer (unless you're using a bb code plugin)
	'~\[[^\]]*url[^\]]*\]http[^\[]+\[/url[^\]]*]~i',

	// profanity and obscenity. Remember that spammers use ! for i, @ for a, * for u. 
	// Also, most of these get surrounded in \W so that, e.g. ASSume doesn't get mistaken for profanity.
	'~(\Wf[u\*]ck|x{3,}|\Ws[u\*]ck[i!]ng|\Wt[i!]ts?\W|\W[a@]s{2}\W|v[a@]g[i!]n[a@]|\Wc[u\*]nt|pen+[i!]s)~i',	// depending on your audience, you might not want this one

	// Now let's check for ... interesting ... pharmaceutical offers
	'~(c[i!][a@]l[i!]s|v[i!][a@]gr[i!]?[a@])~i',	// remember that they sometimes use ! for i and @ for a

	// payday loans, anyone?
	'~(credit|loans?).*(credit|loans?).*(credit|loans?)~Usi',	// if they use "credit" or "loans" too many times in their comment, kill it.
); ?>
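To make the "matches one of your regexes" step concrete, here is a minimal sketch of how a list like this could be checked against a comment. It is not the plugin's actual source: the function name `kb_find_blacklist_match` and the sample comments are made up for illustration, and only two of the default patterns are copied in to keep it short.

```php
<?php
// Hypothetical helper for illustration; not taken from KB Spam Blacklist itself.
// Returns the first pattern that matches the comment, or null if none do.
function kb_find_blacklist_match($comment, array $patterns) {
    foreach ($patterns as $pattern) {
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        if (preg_match($pattern, $comment) === 1) {
            return $pattern;
        }
    }
    return null;
}

// Two of the default patterns from the list above.
$patterns = array(
    '~\[[^\]]*url[^\]]*\]http[^\[]+\[/url[^\]]*]~i',
    '~(c[i!][a@]l[i!]s|v[i!][a@]gr[i!]?[a@])~i',
);

// Made-up sample comments.
$samples = array(
    'Nice post, thanks for sharing.',
    'Visit [url]http://buy-junk.com[/url] today',
    'Cheap v!agra here',
);

foreach ($samples as $sample) {
    $hit = kb_find_blacklist_match($sample, $patterns);
    echo ($hit === null ? 'allow: ' : 'block: ') . $sample . "\n";
}
```

The first sample is allowed and the other two are blocked. Returning the matching pattern rather than a plain true/false is just a convenience for working out which rule fired when you are tuning your own list.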

Try it out

If a comment gets caught by one of those regexes (or by another that you add), the commenter sees an error message. If you want to try it, write a comment that uses viagra, cialis, or [url]http://buy-junk.com[/url] in it and see what happens.

Also includes a widget, if you’re into that. Look in the sidebar.

Download it

Download KB Spam Blacklist v1.0


10 Comments

  1. sheri (Unregistered)
    Posted May 14, 2008 at 10:43 am | Permalink

    Thanks for making this available. I modified it for use with my SMF forum. It’s caught all my test spam messages and let the real ones through. I’ll let you know how it works with real spammers (I get about 20 a day!).

  2. Posted February 15, 2009 at 1:52 pm | Permalink

    Exactly what I was looking for!

    I’m tired of having to double-check the spam for the really really obvious stuff. I’ll probably use it as is but add a “url=http” search (unless your line already catches it).

    Anyone using this with WordPress 2.7x?

  3. Posted February 16, 2009 at 10:19 pm | Permalink

    Help, I can’t figure out the characters in the code.

    Can someone help me code it to look for “url=http” ?

    Most of my spam follows that format.

  4. Posted February 19, 2009 at 8:16 am | Permalink

    Gary,

    The way this is coded, it should work in just about any version of WordPress. It’s very simple. I’ve got it running in 2.7.1.

    I’ve found that it’s easier and more reliable to look for the close tag [/.u.r.l] (I had to use those dots to evade my own filter) than the url=http tag. Like this (but delete all the dots):

    '~\[/u.r.l[^\]]*]~i',
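A quick way to see why the closing tag is the more reliable target: it looks the same whether the spammer wrote the opening tag as url=http or as a bare tag. The sketch below uses the pattern above with the dots removed; the sample strings are made up.

```php
<?php
// Closing-tag pattern from the comment above, with the filter-evading dots removed.
$closeTag = '~\[/url[^\]]*]~i';

// Made-up sample comments.
$samples = array(
    '[url=http://example.com]click here[/url]', // "url=http" form
    '[URL]http://example.com[/URL]',            // bare form, upper case
    'My site is http://example.com',            // ordinary link, no BB code
);

foreach ($samples as $sample) {
    // preg_match() returns 1 when the pattern is found, 0 when it is not.
    echo preg_match($closeTag, $sample) . ' ' . $sample . "\n";
}
```

This prints 1 for both BB code samples and 0 for the plain link.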

  5. Posted February 20, 2009 at 12:07 am | Permalink

    Thanks Adam!!!

    Didn’t really know how to code the regex into php.

  6. Posted February 28, 2009 at 8:32 pm | Permalink

    This has cut out SO much of my spam. As thanks, I went to donate to you and I see you won’t take money; so as per your suggestion I sent money to the Red Cross.

    My last bit is I’m trying to auto delete where h.t.t.p (without the periods) appears 5 or more times. That would leave me with very very little left to examine by hand (probably 10-20% of what I usually have to sort through).

    I’m looking at the last line of your blacklist (above). I’m assuming that if any combination of the two words appears at least 3 times, it kills the comment. But the three question marks are throwing me off, and the part at the end after the tilde is different from the other lines.

    As it stands, can I change the last line by putting h.t.t.p in the parentheses by itself (removing the current characters) so that it’ll delete 3 or more?

  7. Posted March 2, 2009 at 11:20 am | Permalink

    You ask about this (without some of the dots):

    '~(cre.dit|loa.ns?).*(cre.dit|loa.ns?).*(cre.dit|loa.ns?)~Usi',

    The question mark means “the preceding character may or may not exist.” So loa.ns? will match lo.an or lo.ans. You don’t need that for what you’re doing.

    The ~Usi gives three flags to the regex. i makes it case-insensitive; s lets the . match newlines as well as every other character; U makes the * (and other quantifiers) ungreedy, so .* grabs as little as it can.
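If the flags are still fuzzy, a small experiment may help. This is only a sketch with made-up test strings (and with the dots taken out of the words); it shows each flag changing one thing at a time.

```php
<?php
// Made-up test strings for illustration.
$multiline = "credit one\nloans two\ncredit three";

// s: without it, the dot stops at a newline, so words on separate lines never link up.
var_dump(preg_match('~credit.*loans.*credit~i', $multiline));  // int(0)
var_dump(preg_match('~credit.*loans.*credit~si', $multiline)); // int(1)

// i: case-insensitive matching.
var_dump(preg_match('~credit~', 'CREDIT'));  // int(0)
var_dump(preg_match('~credit~i', 'CREDIT')); // int(1)

// U: ungreedy. Both patterns match, but they grab different amounts of text.
$line = 'credit aaa credit bbb credit';
preg_match('~credit.*credit~', $line, $greedy);
preg_match('~credit.*credit~U', $line, $lazy);
var_dump($greedy[0]); // string(28) "credit aaa credit bbb credit"
var_dump($lazy[0]);   // string(17) "credit aaa credit"
```

In the last pair, U changes how much text is matched, not whether the line matches at all. For a simple yes-or-no test like these, leaving U off should not change the verdict.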

    Try this (change ht.tp to http, but keep the dot in //.*):

    '~ht.tp://.*ht.tp://.*ht.tp://.*ht.tp://.*ht.tp://~si',

    A shorthand for that would be this (change ht.tp to http):

    '~(ht.tp://.*){5,}~si',
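Before adding a rule like this to a live blacklist, it is worth checking it against a comment with exactly five links and one with four. The sketch below does that with the shorthand pattern (dots removed). It also shows a different technique that should give the same answer: counting the links with preg_match_all(). The helper name `kb_count_links` and the sample text are made up for illustration.

```php
<?php
// Hypothetical helper, not part of KB Spam Blacklist: counts "http://" occurrences.
function kb_count_links($text) {
    // preg_match_all() returns the number of full matches, or false on error.
    $count = preg_match_all('~http://~i', $text, $matches);
    return $count === false ? 0 : $count;
}

// Made-up comments with five and four links, one per line.
$five = str_repeat("see http://example.com\n", 5);
$four = str_repeat("see http://example.com\n", 4);

// Shorthand pattern from the comment above, with the dots removed.
$pattern = '~(http://.*){5,}~si';

var_dump(preg_match($pattern, $five)); // int(1): five links, blocked
var_dump(preg_match($pattern, $four)); // int(0): four links, allowed

// The counting approach agrees with the regex.
var_dump(kb_count_links($five)); // int(5)
var_dump(kb_count_links($four)); // int(4)
```

The counting version cannot go straight into the blacklist array, since that array only holds patterns, but it is handy for confirming that a threshold does what you expect.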

  8. Posted March 2, 2009 at 10:30 pm | Permalink

    Without question this will cut down my spam significantly!

    Thank you!

    (And I’m not putting a U in the new examples. I read the “greedy” parts from the flags link you provided and I’m still fuzzy, but that’s okay since it seems to be working.)

  9. Posted July 18, 2010 at 11:29 pm | Permalink

    Better than Akismet? Seems great, but what about the false positive rate? Any information?

  10. Posted July 19, 2010 at 11:12 am | Permalink

    When it finds a spammish comment, it shows an error message to the user. If it’s a human, they’ll read the message and understand that they’ve triggered a filter. Then they can just go back and remove the offending stuff.

    But really, there’s very little reason for a real comment to violate any of the plugin’s rules unless, for example, your blog is all about male-enhancing drugs. But if that’s the case, just delete the line that looks for pharmaceutical offers.
