I just sent the following letter to suggestions@google.com. If anybody knows anybody who works there, please feel free to forward it along and/or let me know, so I can.
Over the past weekend there has been an outbreak of blog comment spamming demonstrating that someone has taken the tactic to a new level of automation. Comment spamming is an attempt to game Google's PageRank algorithm by placing links to the spammer's site across many pages on many independent websites, thus raising Google's estimation of the importance of the target page. We believe Google and weblog owners share a common interest in neutralizing this tactic before it takes hold.
My own, quite obscure blog received identical comments autoposted to twenty different articles, each containing the innocuous text "Hello Folks,nice site youre running!", signed with the name "Preteen" and the URL of a porn site. The effect of this was to create twenty links to this porn site scattered through my pages, each with the link text "Preteen".
I was far from the only one affected: many popular sites were hit, including Making Light, Talk Left, Crooked Timber, Samizdata, Winds of Change, and Electrolite. The effect on Google PageRanks is multiplied by the number of Movable Type blogs the spammer is able to find.
As large-scale as this was, it was clearly still a probing attack. Yet to see how bad the problem already is, look here: as of this writing, it is a Movable Type entry archive page with no fewer than twenty-three spam comments already attached.
Many people are busy at work on defence strategies, including Jay Allen (MT-Blacklist), Yoz Grahame, and Shelley Powers. But it is also clear from the e-mail spam wars that builders of anti-spam walls and locks are at a real disadvantage against the attackers.
This is where Google and other search engines come in. The motivation for comment spammers is to try to game Google's rankings. Weblog owners and Google share a desire to defeat this -- take away the spammers' motivation, and we won't have to fight a war for control of our comment sections.
Our proposal is this: It would be trivial for blog maintainers to explicitly mark our comment sections, for example with HTML comments: <!-- SearchEngine: Begin Anonymous Comment --> . . . <!-- SearchEngine: End Anonymous Comment -->, or in any other fashion convenient to your parsing engine. These markers would delimit untrustworthy material that the page author has had no editorial control over; Google could thus unambiguously assign it no weight in PageRank.
We present this proposal in the belief that Google shares our interest in stopping comment spamming. Obviously, none of us understand the internals of how your engine works, and perhaps HTML comments are not the best way to label content for you. We simply hope this proposal will start a discussion about the best approach to the problem -- we are volunteering to do our part in whatever the right solution turns out to be.
Sincerely,
Colin Roald
James D. Macdonald
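To make the proposal in the letter concrete, here is a minimal Python sketch of how an indexer might skip the marked regions when collecting links. It is only an illustration under stated assumptions: nothing here reflects how Google's crawler actually works, the marker spelling is simply the example from the letter, and the name `trusted_links` is made up for this sketch.

```python
import re
from html.parser import HTMLParser

# Marker strings copied from the letter's example; the exact spelling
# is illustrative, not an agreed standard.
BEGIN = "<!-- SearchEngine: Begin Anonymous Comment -->"
END = "<!-- SearchEngine: End Anonymous Comment -->"

# Non-greedy match so each Begin pairs with the nearest following End.
_REGION = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def trusted_links(page_html):
    """Return hrefs found outside the marked comment regions (hypothetical helper)."""
    stripped = _REGION.sub("", page_html)
    collector = _LinkCollector()
    collector.feed(stripped)
    collector.close()
    return collector.links
```

With a page that has one link before the markers, one spam link between them, and one link after, `trusted_links` would return only the first and last. One caveat with any scheme like this: the blog software would have to escape or strip the marker text inside submitted comments, or a spammer could type the End marker into a comment and break out of the untrusted region.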
Update (10/20): Just received the following response from Google:
Date: Mon, 20 Oct 2003 09:56:05 -0700
From: bizdev@google.com
Subject: Re: [#4368400] A Proposal concerning Weblog Comment Spam
Hi Colin and James,
Thank you for your report of the situation and your proposal. We have passed it onto the appropriate member of our team for review. We really appreciate thoughtful feedback from our users, and we'll keep yours in mind as we work to improve Google.
Regards,
The Google Team
It is very difficult to send just a "thank you" to Google. There are so many other things to contend with, but that is what I want to do. Google is the only place I can find the real news. Thank you!
Posted by: Jean Parker on January 12, 2004 07:35 PM
I think some of your include files are messed up... Maybe you are just doing some work on the site.
Posted by: Vertex on January 23, 2004 02:10 PM
Humm after I submitted my comment it is working fine... Odd.
Posted by: Vertex on January 23, 2004 02:13 PM