Email Security
Annoyance Filter
The Annoyance Filter is a program which exploits the indelible signature of advertising to identify it before it ever reaches the eyes of the reader, with a very low likelihood of junk mail being confused with legitimate messages. It accomplishes this by scanning collections of an individual user's mail, the more the better, which have been manually sorted into piles of legitimate mail and junk. From these archives, Annoyance Filter computes statistics for the words which appear in the two collections of messages, determining for each the probability that its appearing in a message is indicative of junk mail. This is the training phase, and results in a dictionary of word probabilities.
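The training phase described above can be sketched in a few lines. This is only an illustration, not Annoyance Filter's actual code: the tokenizer, the function names `tokenize` and `train`, the frequency-ratio formula, and the clamping bounds of 0.01 and 0.99 are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split a message into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9'$-]+", text.lower())


def train(legit_messages, junk_messages):
    """Build a dictionary mapping each word to an estimated junk probability."""
    legit_counts = Counter(w for m in legit_messages for w in tokenize(m))
    junk_counts = Counter(w for m in junk_messages for w in tokenize(m))
    # Guard against an empty collection so the divisions below are safe.
    legit_total = max(sum(legit_counts.values()), 1)
    junk_total = max(sum(junk_counts.values()), 1)

    dictionary = {}
    for word in set(legit_counts) | set(junk_counts):
        legit_freq = legit_counts[word] / legit_total
        junk_freq = junk_counts[word] / junk_total
        # Share of the word's relative frequency that comes from junk mail.
        p = junk_freq / (legit_freq + junk_freq)
        # Clamp so that no single word is treated as absolute proof
        # (the bounds are an assumption for this sketch).
        dictionary[word] = min(0.99, max(0.01, p))
    return dictionary
```

Words seen only in junk end up at the upper bound, words seen only in legitimate mail at the lower bound, and words common to both fall somewhere in between. The larger the two sorted collections, the more reliable these estimates become.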
With this dictionary in hand, Annoyance Filter may now be used to classify incoming messages as they arrive. Incoming messages are parsed precisely like those used to train the program, and based on the words with the greatest probability of appearing predominantly in junk or non-junk, a probability for the message as a whole is computed. This probability is then tested against a threshold which, if exceeded, indicates with a high degree of confidence that the mail is junk. Each message is marked with its classification, and what happens from there on is up to you. Unix users of Procmail may easily direct mail that Annoyance Filter deems junk to a suitable destination. But there's nothing Unix- or Procmail-specific about Annoyance Filter. It can be built on any platform with a standard C++ compiler and integrated into any mail system which permits an external program to filter incoming mail. The details, of course, may be complicated, messy, and tedious, but the concept is straightforward.
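The classification step can be illustrated the same way. The sketch below assumes a dictionary of per-word junk probabilities strictly between 0 and 1, and combines the most significant ones with a naive Bayes style formula. The function name `classify`, the threshold of 0.9, the limit of 15 significant words, and the neutral value of 0.5 for unknown words are all hypothetical choices for the example, not values taken from Annoyance Filter.

```python
import math


def classify(words, dictionary, threshold=0.9, n_significant=15, unknown=0.5):
    """Return (junk probability, is_junk) for the words of one message.

    Assumes every probability in `dictionary` lies strictly between 0 and 1.
    """
    # Look up each distinct word; words never seen in training are neutral.
    probs = [dictionary.get(w, unknown) for w in set(words)]
    # Keep the words whose probabilities lie farthest from the neutral 0.5,
    # i.e. those that appear most predominantly in junk or in non-junk.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:n_significant]
    # Combine the selected probabilities (naive Bayes style, assumed here).
    junk_product = math.prod(top)
    legit_product = math.prod(1.0 - p for p in top)
    score = junk_product / (junk_product + legit_product)
    return score, score > threshold


# Example with a small hand-made dictionary (illustrative values only).
example_dictionary = {"free": 0.99, "money": 0.95, "meeting": 0.02, "tomorrow": 0.05}
score, is_junk = classify(["free", "money", "now"], example_dictionary)
```

Raising the threshold makes it less likely that legitimate mail is marked as junk, at the cost of letting more junk through. What is done with the resulting classification is left to the mail system.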