I just sent the following letter to
If anybody knows anybody who works there, please feel free to forward it along and/or
let me know, so I can.
Over the past weekend there has been an outbreak of blog comment spamming demonstrating that someone has taken the tactic to a new level of automation. Comment spamming is an attempt to game Google's PageRank algorithm by placing links to the spammer's site across many pages on many independent websites, thus raising Google's estimation of the importance of the target page. We believe Google and weblog owners share a common interest in neutralizing this tactic before it takes hold.
My own quite obscure blog received identical comments autoposted to twenty different articles, each containing the innocuous text "Hello Folks,nice site youre running!", signed with the name "Preteen" and the URL of a porn site. The effect of this was to create twenty links to this porn site scattered through my pages, each with the link text "Preteen".
I was far from the only one affected: many popular sites were hit, including Making Light, Talk Left, Crooked Timber, Samizdata, Winds of Change, and Electrolite. The effect on Google PageRanks is multiplied by the number of Movable Type blogs the spammer is able to find.
As large-scale as this was, it was clearly still a probing attack -- and yet, to see how bad the problem already is, look here, which as of this writing is a Movable Type entry archive page with no fewer than twenty-three spam comments already attached.
Many people are busy at work on defence strategies, including Jay Allen (MT-Blacklist), Yoz Grahame, and Shelley Powers. But it is also clear from the e-mail spam wars that builders of anti-spam walls and locks are at a real disadvantage against the attackers.
This is where Google and other search engines come in. The motivation for comment spammers is to try to game Google's rankings. Weblog owners and Google share a desire to defeat this -- take away the spammers' motivation, and we won't have to fight a war for control of our comment sections.
Our proposal is this: It would be trivial for blog maintainers to explicitly mark our comment sections, for example with HTML comments: <!-- SearchEngine: Begin Anonymous Comment --> . . . <!-- SearchEngine: End Anonymous Comment -->, or in any other fashion convenient to your parsing engine. These markers would delimit untrustworthy material that the page author has had no editorial control over; Google could thus unambiguously assign it no weight in PageRank.
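To make the proposal concrete, here is a minimal sketch of how a crawler-side parser might honor such markers. It is only an illustration, not a description of how Google or any other search engine actually works: the class name `TrustedLinkCollector`, the helper `split_links`, and the example URLs are all hypothetical, and the sketch assumes the weblog software strips or escapes the marker strings from submitted comment text so that a spammer cannot close the section early.

```python
from html.parser import HTMLParser

# Marker text taken from the proposal; everything else here is illustrative.
BEGIN = "SearchEngine: Begin Anonymous Comment"
END = "SearchEngine: End Anonymous Comment"


class TrustedLinkCollector(HTMLParser):
    """Collects href values, separating links found inside marked sections."""

    def __init__(self):
        super().__init__()
        self.in_untrusted = False
        self.trusted_links = []   # links the page author is assumed to vouch for
        self.ignored_links = []   # links inside a marked anonymous-comment section

    def handle_comment(self, data):
        # HTMLParser passes the text between "<!--" and "-->".
        marker = data.strip()
        if marker == BEGIN:
            self.in_untrusted = True
        elif marker == END:
            self.in_untrusted = False

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        if self.in_untrusted:
            self.ignored_links.append(href)
        else:
            self.trusted_links.append(href)


def split_links(page_html):
    """Return (trusted_links, ignored_links) for one page of HTML."""
    collector = TrustedLinkCollector()
    collector.feed(page_html)
    collector.close()
    return collector.trusted_links, collector.ignored_links


page = """
<p>Post body with <a href="http://example.org/cited">a link the author chose</a>.</p>
<!-- SearchEngine: Begin Anonymous Comment -->
<p>Hello Folks,nice site youre running!
Posted by <a href="http://spam.example/">Preteen</a></p>
<!-- SearchEngine: End Anonymous Comment -->
<p><a href="http://example.org/archive">Archives</a></p>
"""

trusted, ignored = split_links(page)
print(trusted)
print(ignored)
```

Only the links in the first list would be candidates for ranking weight; the spammer's link lands in the second list and could simply be discarded.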
We present this proposal in the belief that Google shares our interest in stopping comment spamming. Obviously, none of us understand the internals of how your engine works, and perhaps HTML comments are not the best way to label content for you. We simply hope this proposal will start a discussion about the best approach to the problem -- we are volunteering to do our part in whatever the right solution turns out to be.
Update (10/20): Just received the following response from Google:
Date: Mon, 20 Oct 2003 09:56:05 -0700
From: firstname.lastname@example.org
Subject: Re: [#4368400] A Proposal concerning Weblog Comment Spam
Hi Colin and James,
Thank you for your report of the situation and your proposal. We have passed it onto the appropriate member of our team for review. We really appreciate thoughtful feedback from our users, and we'll keep yours in mind as we work to improve Google.
Regards,
The Google Team