Archive: Fresh Spam for Everyone

Is your spouse dissatisfied with the size of your spam? A brand-new website has made several hundred thousand pieces of unsolicited commercial e-mail available for you to download today. Act now!

After a quiet online debut last week, the Spam Archive is making quick strides toward becoming the largest public library of junk e-mail on the Internet.

Paul Judge, director of research and development for CipherTrust, the e-mail security firm backing the project, says the site received roughly 5,000 forwarded messages a day during its first week.

He predicts the archive will amass a corpus of 10 million unsolicited commercial e-mails over the next year. The archive's FTP site will begin to make its spam available, 10,000 at a time, starting Dec. 4.

People have never been so excited to get junk e-mail.

"Its sheer size will make it an invaluable tool," said programming language designer Paul Graham, who first made an open call for such an undertaking in his widely circulated treatise on spam filtering, A Plan for Spam, published online in August.

Filter builder William Yerazunis applauds the undertaking. He says antispammers need a common source of fresh spam.

"I don't retain spam that's over a month old," he said. "Spam has the same shelf life as fresh food."

Yerazunis created CRM114, a remarkably accurate filter, using his own private junk mail stash. But he said the archive will advance filter research.

"You have to have repeatability" in producing and testing antispam software, he said. "It's absolutely necessary for good science to get done."

Although a bevy of newsgroups and individual archives have been gathering spam for years, experts say they are too small and disorganized to provide researchers with meaningful data.

On the other hand, the FTC maintains an enormous database of spam that sees 40,000 new e-mails every day.

But the commission's interest extends only to spam that could be useful in law enforcement investigations, and the archive is not available to the public.

Besides, said Graham, "the federal government is not the most useful place to send anything."

Not that the Spam Archive doesn't face challenges.

While a handful of experts and analysts have applauded the project, the reaction in chat rooms and on weblogs has been muted.

"There's absolutely no reason to believe that the spams collected here will be any 'better' a sample than those collected by opening a random Hotmail account," read one posting on Slashdot.

Others questioned CipherTrust's motives, obliquely suggesting that e-mails forwarded to the archive might be used diabolically to identify and badger antispammers.

CipherTrust's director of marketing Matt Anthony insists that the archive goes "above and beyond" the interests of the company, and guarantees that all contributions will be "anonymized" before they are released to the public.

Some were concerned that spammers could sound the copyright cry. Could the archive get dragged into court for illegally reprinting spammers' messages?

"My understanding is that the copyright for a piece of e-mail remains with the sender," said Yerazunis. "But I might be wrong."

Joyce Graff, a vice president at Gartner2G, maintained that "nobody can claim ownership and constrain you from publishing" spam that is sent to you.

What's more, she said, spammers aren't apt to come forward.

"The spammer doesn't want himself to be known," said Graff. "It's like someone saying, 'You copied my murder tactic.'"

"That's an issue that we definitely still need to have some conversations about with our legal counsel," admitted CipherTrust's Judge. "There are questions around that."

For now, the Spam Archive will continue collecting the Internet's cybergarbage to keep filter makers in fresh spam.

Graham, who is organizing the first-ever spam-filtering conference at MIT in January, remains optimistic.

"Interesting things will be happening with spam in the next year," he said. "It's not going to be a good time to be a spammer."
