Spam is flooding the Internet with many copies of the same message, in an attempt to force the message on people who would not otherwise choose to receive it. Most spam is commercial advertising, often for dubious products, get-rich-quick schemes, or quasi-legal services. Spam costs the sender very little to send — most of the costs are paid for by the recipient or the carriers rather than by the sender.
There are two main types of spam, and they have different effects on Internet users. Cancellable Usenet spam is a single message sent to 20 or more Usenet newsgroups. (Through long experience, Usenet users have found that any message posted to so many newsgroups is often not relevant to most or all of them.) Usenet spam is aimed at “lurkers”, people who read newsgroups but rarely or never post, and so rarely give their addresses away. Usenet spam robs users of the utility of the newsgroups by overwhelming them with a barrage of advertising or other irrelevant posts. Furthermore, Usenet spam subverts the ability of system administrators and owners to manage the topics they accept on their systems.
Email spam targets individual users with direct mail messages. Email spam lists are often created by scanning Usenet postings, stealing Internet mailing lists, or searching the Web for addresses. Email spam typically costs users money out of pocket to receive. Many people (anyone with measured phone service) read or receive their mail while the meter is running, so to speak, and spam costs them additional money. On top of that, it costs money for ISPs and online services to transmit spam, and these costs are passed on directly to subscribers.
One particularly nasty variant of email spam is sending spam to mailing lists (public or private email discussion forums). Because many mailing lists limit activity to their subscribers, spammers will use automated tools to subscribe to as many mailing lists as possible, so that they can grab the lists of addresses or use the mailing list as a direct target for their attacks.
Spam cocktail (or anti-spam cocktail):
A spam cocktail (or anti-spam cocktail) is the use of several different technologies in combination to successfully identify and minimize spam. The use of multiple mechanisms increases the accuracy of spam identification and reduces the number of false positives.
A spam cocktail puts each e-mail message through a series of tests, each of which provides a numeric score showing how likely the message is to be spam. The scores are combined and the message is assigned a probability rating. For example, it may be determined that a message has an 85% probability of being spam. E-mail administrators can create rules that govern how messages are handled based on their scores: messages with the highest scores may be deleted, those with medium scores may be quarantined, and those with lower scores may be delivered but marked with a spam warning.
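The score-based handling rules described above can be sketched in a few lines of Python. This is only an illustration: the function name `route_message`, the three threshold values, and the extra "deliver" outcome for very low scores are assumptions made for the example, not settings from any particular product.

```python
def route_message(spam_probability, delete_at=0.95, quarantine_at=0.70, warn_at=0.40):
    """Map a spam probability (0.0 to 1.0) to a handling action.

    The thresholds are hypothetical defaults; an administrator would
    tune them for their own mail flow.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0.0 and 1.0")
    if spam_probability >= delete_at:
        return "delete"                # highest scores: drop the message
    if spam_probability >= quarantine_at:
        return "quarantine"            # medium scores: hold for review
    if spam_probability >= warn_at:
        return "deliver_with_warning"  # lower scores: deliver, but flag it
    return "deliver"                   # assumed extra tier: deliver unmarked


if __name__ == "__main__":
    for p in (0.97, 0.85, 0.50, 0.10):
        print(p, route_message(p))
```

With these assumed thresholds, the 85% example from the text would be quarantined rather than deleted.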
A spam cocktail commonly includes several of the following identification methods, which may be weighted differently for message scoring:
- Machine learning: Implementing sophisticated computer algorithms that improve over time to analyze the subject line and contents of a message and predict the probability that it is spam based on past results. The Bayesian filter is a type of machine learning.
- Blacklisting: Subscribing to a blacklist or blackhole list of known spammers and blocking messages from those sources.
- Content filtering: Using programs that look for specific words or criteria in the subject line or body of a message.
- Spam signatures: Using programs that compare the patterns in new messages to patterns of known spam.
- Heuristics: Using heuristic programs that look for known sources, words or phrases, and transmission or content patterns.
- Reverse DNS lookup: Checking whether the IP address matches the domain name from which a message is coming.
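As a rough illustration of how differently weighted methods might be combined into one score, the sketch below runs two toy tests (a blacklist check and a word-based content filter) and takes their weighted average. Everything in it is hypothetical: the `Message` fields, the blacklist entry, the word list, the weights, and the choice of a weighted average as the combining rule. The other methods in the list could be added as further test functions in the same way.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender_ip: str
    subject: str
    body: str


# Hypothetical data for the example; a real deployment would subscribe
# to a maintained blacklist and use a far richer content filter.
BLACKLISTED_IPS = {"203.0.113.7"}
SUSPECT_WORDS = {"free", "winner", "guaranteed"}


def blacklist_test(msg):
    """Return 1.0 if the sender is on the blacklist, else 0.0."""
    return 1.0 if msg.sender_ip in BLACKLISTED_IPS else 0.0


def content_test(msg):
    """Return a score in 0.0 to 1.0 based on suspect words in subject and body."""
    words = (msg.subject + " " + msg.body).lower().split()
    hits = sum(1 for w in words if w.strip(".,!?:;") in SUSPECT_WORDS)
    return min(1.0, hits / 3)  # three or more hits counts as the maximum


def cocktail_score(msg, weighted_tests):
    """Combine (test, weight) pairs into a weighted average score."""
    total_weight = sum(weight for _, weight in weighted_tests)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(test(msg) * weight for test, weight in weighted_tests) / total_weight


if __name__ == "__main__":
    tests = [(blacklist_test, 2.0), (content_test, 1.0)]
    msg = Message("203.0.113.7", "Free money", "You are a guaranteed winner!")
    print(round(cocktail_score(msg, tests), 2))
```

A weighted average is only one possible combining rule; it is used here because it keeps the result in the same 0.0 to 1.0 range as the individual tests.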