Spam comment tale continues

Submitted by dag on Sat, 2008/05/10 - 15:39

Things are getting worse on the spam comment front. Whereas I used to get about 2 to 3 spam comments a day (or comments that look very real but advertise a commercial website nevertheless), I now attract more people who leave unwanted comments: up to 20 a day, worse than my mailbox *with* spamfilter.

This is bad...

80% of those comments are added to a single article, which alarmed me. Investigating this a bit, I found the following sites in my top referrer list. This digitalpoint.com forum seems to be used for advertising high PageRank sites so people can spam and advertise their own sites.

So Mollom is of no use, a difficult CAPTCHA is of no use, and I really do not know what to do. To make it easier to find real comment spam, I am probably going to disable the URL text field. That saves me one step (checking whether the URL is commercial), because 20% of the spam comments actually offer quality content and look very serious.

Another problem with comment moderation is that it requires a lot of mouse clicks and page rendering just to mark a comment as spam and send it to Mollom, 20 times a day. It could be easier if I could mail the content of comments to myself, that way I do not have to open the comment, read it. On top of that, mailing it to me would let my mail spam filter have a go at it.

Some days it feels as if I am fighting a lone battle against the Internet. And I think I am losing it.

Update: Thanks everyone for giving me a clue about the rel="nofollow" attribute. I obviously stopped tracking Web development before such a thing existed (and was exploited).

Today I looked into Drupal 6's capabilities for adding such attributes. I stumbled across line 1567 in includes/theme.inc and discovered a bug:


$output = l($object->name, $object->homepage, array('rel' => 'nofollow'));

should become


$output = l($object->name, $object->homepage, array('attributes' => array('rel' => 'nofollow')));

for the existing functionality to actually work. This will set rel="nofollow" for all homepage links contributed by unverified users. I forwarded this fix to Dries Buytaert who was also concerned with my high rate of comment-spam :)
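
To illustrate why the original line fails silently, here is a toy model in Python. It is not Drupal's code: build_link is a hypothetical stand-in that, like the fix above suggests l() does, only reads HTML attributes from an 'attributes' entry in its options and ignores any other key.

```python
from html import escape


def build_link(text, url, options=None):
    """Hypothetical stand-in for a link helper that takes an options dict.

    Only options["attributes"] ends up as HTML attributes; unknown
    top-level keys such as "rel" are ignored without any warning.
    """
    options = options or {}
    attributes = options.get("attributes", {})
    rendered = "".join(
        f' {name}="{escape(str(value), quote=True)}"'
        for name, value in attributes.items()
    )
    return f'<a href="{escape(url, quote=True)}"{rendered}>{escape(text)}</a>'


# The buggy call shape: "rel" sits at the top level and is dropped.
broken = build_link("dag", "http://example.org", {"rel": "nofollow"})

# The fixed call shape: "rel" is nested under "attributes".
fixed = build_link(
    "dag", "http://example.org", {"attributes": {"rel": "nofollow"}}
)
```

In this model the first call produces a plain link without any rel attribute and raises no error, which would explain how such a bug goes unnoticed.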

Update 2: http://drupal.org/node/258120

nofollow

They seem to be targeting pages with a high pagerank which don't use the "nofollow" attribute, in an attempt to increase their own pagerank.

I have the same problem on a few pages. They are often difficult to spot because they look normal, except for the homepage link.
I added a quick hack to my template.php file to add a "rel=nofollow" attribute to the homepage links. That should make the page worthless to them and hopefully stop new such comments.

You can set your default input format to do the same thing for comment bodies (it's a setting).

make sure the outgoing links

make sure the outgoing links of comments contain rel="nofollow" in the <a>-tag. this instructs all those googlebots not to follow that link, decreasing the value of comment-spamming to almost 0. guess this should become default in the drupal-module responsible for the comment-rendering functionality?

good luck!
frank

That is despicable!

I can't believe they're so openly encouraging spam at digitalpoint.com! That should be against the law...

Reading the thread you linked to makes me so mad, but "what can men do against such reckless hate..." err spam

A solution?

(this is not a spam post)

They call the sites they link to "dofollow" sites. This is because there is something called nofollow that indicates that search engines should not attribute pagerank to those links:

< a href="http://someuntrustedsite.com" rel = "nofollow" > the link < / a> (without all the spaces). If this doesn't happen on your blog, change your software such that rel="nofollow" is added to links that are posted by users.
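
As a rough sketch of what "change your software" could look like, the Python below rewrites user-submitted HTML so that every link carries rel="nofollow". The names NofollowRewriter and add_nofollow are made up for this example, it uses only the standard library's html.parser, and it assumes the input is a small HTML fragment such as a comment body. HTML comments and declarations are dropped, and any existing rel value is overwritten.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit an HTML fragment, forcing rel="nofollow" on every <a> tag."""

    def __init__(self):
        # Keep entities such as &amp; as they are instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _render(self, tag, attrs, closing=">"):
        if tag == "a":
            # Drop whatever rel the user supplied, then add our own.
            attrs = [(name, value) for name, value in attrs if name != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {name}" if value is None
            else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        return f"<{tag}{rendered}{closing}"

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def add_nofollow(html_text):
    """Return html_text with rel="nofollow" set on all links."""
    rewriter = NofollowRewriter()
    rewriter.feed(html_text)
    rewriter.close()
    return "".join(rewriter.parts)
```

Calling add_nofollow on a comment body before it is stored or displayed would be one place to hook this in. A real blog engine would normally combine it with proper HTML sanitising, which this sketch does not attempt.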

Don't lose hope

Fighting spam is a pain in the ass for all of us, but don't let it stop you from blogging. I recently had a similar problem with a massive increase in email spam - but some preventative measures - http://blog.aplpi.com/index.php/2008/04/22/backscatter-and-joe-jobs/ seem to have mostly gotten things back on track. I'm sure you'll come up with a similar solution for your blog (and I look forward to reading about it, we don't get enough traffic yet to warrant that level of spamming but I hope it's only a matter of time :)

Just when you thought your industry could sink no lower...

I have to admit this is a new one for me. I've been aware of the criticism of rel="nofollow", and this "Dofollow" movement seems to be attempting to paint themselves as being opposed to it on principle, but...

Then they set up lists of links to open blogs, encouraging each other to go forth and comment for no other purpose but to raise Page Rank.

How crappy does your site have to be in order for you to spend time and energy on what, when all is said and done, is just link farming? Practically every Drupal deployment I've done over the past few years has risen to the top few results for any reasonably relevant query with virtually no conscious effort towards SEO. (Admittedly these are mostly for locale-specific organisations - take the place name out of the query, and it's a bit trickier.)

Black Hat SEO is what you resort to when you're no bloody good at what you do, or when what you do is a scam.

Just try akismet

I suffered the same issues about one year ago. I installed akismet in my self-written blog software and I must say it really helps. Not only are spam comments ignored and put in quarantine, the number of spammers trying to spam my blog has dropped to almost zero.

It could be easier if I could

It could be easier if I could mail the content of comments to myself, that way I do not have to open the comment, read it.

I initially tried to use triggers/actions to have the contents of new comments emailed to me, but couldn't get it to work for reasons I can't recall. But what I did do - and what works quite well - is use Views to construct a view of recent comments, then subscribe to its feed with my feed reader. Now when I see a new comment come in to my feed reader which is relatively short and contentless ("Great post! That annoys me too!") and/or from someone I don't know, I'll check it out to see if it's spam. On the other hand, if it's from someone I do know and/or seems earnestly written, I know it's safe to just leave it alone.

Of course, this tip prob'ly won't do you much good if you don't regularly use a feed reader.