These classnotes are deprecated. As of 2005, I no longer teach the classes. The notes will remain online for legacy purposes.

UNIX03/Vipul's Razor

Vipul's Razor is a distributed, collaborative spam detection and filtering network. The primary focus of the system is to identify and disable a spam email before its injection and processing are complete.

Vipul's Razor (or Razor for short) is somewhat similar to DCC, except that it is more of a true P2P network than a series of servers and clients. Think Gnutella for spam checksums, and you start to get the idea.
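
To make the "Gnutella for spam checksums" picture concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Peer` class, the `body_checksum` helper, and the use of a plain SHA-1 over the message body are hypothetical and are not Razor's actual protocol or signature scheme.

```python
import hashlib
from collections import deque


def body_checksum(body):
    """Hash a message body after trimming trailing whitespace on each line.

    Illustrative only: a plain SHA-1 stands in for whatever signature
    the real network computes.
    """
    normalized = "\n".join(line.rstrip() for line in body.strip().splitlines())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class Peer:
    """Hypothetical node that stores reported checksums and links to neighbors."""

    def __init__(self, name):
        self.name = name
        self.reported = set()
        self.neighbors = []

    def connect(self, other):
        # Links are symmetric: each peer can query the other.
        self.neighbors.append(other)
        other.neighbors.append(self)

    def report(self, body):
        # Record the checksum of a message this peer considers spam.
        self.reported.add(body_checksum(body))

    def is_known_spam(self, body):
        """Breadth-first search of reachable peers for a matching checksum."""
        digest = body_checksum(body)
        seen = {id(self)}
        queue = deque([self])
        while queue:
            peer = queue.popleft()
            if digest in peer.reported:
                return True
            for neighbor in peer.neighbors:
                if id(neighbor) not in seen:
                    seen.add(id(neighbor))
                    queue.append(neighbor)
        return False


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.connect(bob)
bob.connect(carol)
carol.report("Buy cheap pills now!\n")
print(alice.is_known_spam("Buy cheap pills now!"))  # True
print(alice.is_known_spam("Lunch on Friday?"))      # False
```

Here alice never saw the spam herself; she learns about it only because carol reported it and bob links the two.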

Razor has recently undergone a massive and much-needed rewrite of its underlying protocol. This has resulted in a much more scalable network design and solved a number of annoyances that have plagued Razor since day one.

Being a P2P network has advantages and disadvantages:

  • Advantages:
    • You potentially have many more systems to compare checksums with.
    • You have a more decentralized network (i.e., no dependence on any real servers that could be shut down).
    • You can act as a "server" (okay, not really in a P2P network, but at least you can provide checksums as easily as anyone else).
  • Disadvantages:
    • The network can be much slower (peers could be on dial-up, ISDN, T3, DSL, whatever).
    • The network can split into separate "satellite" networks when the nodes connecting them shut down. Checksums from one network might then not be readily available to another. (By the same logic, it is theoretically possible for checksums or whole ranges of checksums to be lost forever if a substantial sub-network vanishes and never reconnects.)
    • Network traffic can increase.
    • Possible tainting by outside spammers (though this really can't be as big a problem as one would think).
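
The network-split disadvantage can be sketched in a few lines of Python. The node names, the `links` topology, and the `held` checksums below are all hypothetical; the point is only that taking one connecting node offline hides checksums held on the far side.

```python
from collections import deque


def reachable(links, start, offline=frozenset()):
    """Return the set of nodes reachable from start, skipping offline nodes."""
    if start in offline:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in seen and neighbor not in offline:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def visible_checksums(links, held, start, offline=frozenset()):
    """Collect every checksum held by a node that start can still reach."""
    found = set()
    for node in reachable(links, start, offline):
        found |= held.get(node, set())
    return found


# Hypothetical chain a - b - c - d; "c" is the only link between the halves.
links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
held = {"a": {"sum-1"}, "d": {"sum-2"}}

print(sorted(visible_checksums(links, held, "a")))                 # ['sum-1', 'sum-2']
print(sorted(visible_checksums(links, held, "a", offline={"c"})))  # ['sum-1']
```

Once "c" is offline, node "a" can no longer see `sum-2`, and if "d" never reconnects that checksum is effectively lost to this half of the network.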

Razor sounds like a very good idea (and, in fact, it is: so far, anything Razor tags as spam is guaranteed to be spam), but it is ultimately the poorest performer of everything we've seen thus far. The biggest reason is that it can take longer for a given checksum to circulate through the Razor network than it takes a spammer to send out zillions of copies of the associated spam. In other words, by the time you get the new checksum, it's probably too late.
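
The timing problem is simple arithmetic, sketched below in Python. The send rate, run size, and propagation delays are made-up numbers chosen only to illustrate the race; they are not measurements of Razor or of any real spam run.

```python
def copies_delivered_before_checksum(total_copies, send_rate_per_sec, delay_sec):
    """Copies a spammer gets out before the checksum reaches your filter."""
    sent_during_delay = send_rate_per_sec * delay_sec
    # The spammer cannot send more copies than the run contains.
    return min(total_copies, sent_during_delay)


# Hypothetical spam run: one million copies at 500 messages per second.
total = 1_000_000
rate = 500

for delay in (600, 3600):  # checksum arrives after 10 minutes or 1 hour
    missed = copies_delivered_before_checksum(total, rate, delay)
    print(delay, missed, missed / total)
```

Under these assumed numbers, a ten-minute propagation delay lets 30% of the run through, and an hour-long delay means the whole run is delivered before the checksum shows up.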

Another problem is that, for Razor to be effective, everyone using it must have some sort of spam trap set up to send uncaught spam to. Most people, even if they have set up such a trap, do not get into the habit of using it (because you can't simply forward spam to it, you must bounce it, and with most mailers that's not a simple thing to do).
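
One plausible reason for the bounce requirement, which these notes do not spell out, is that forwarding rewrites the message while bouncing resends it untouched. The Python sketch below assumes a plain SHA-1 over the body and hypothetical `forward` and `bounce` helpers; it is not Razor's real signature scheme or any particular mailer's behavior.

```python
import hashlib


def body_checksum(body):
    """Illustrative checksum: a plain SHA-1 over the message body."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


def forward(body):
    """Mimic a typical mail client's forward: add a banner and quote the text."""
    quoted = "\n".join("> " + line for line in body.splitlines())
    return "---- Forwarded message ----\n" + quoted


def bounce(body):
    """Mimic a bounce (redirect): the body is resent unchanged."""
    return body


spam = "Buy cheap pills now!\nClick here."
print(body_checksum(bounce(spam)) == body_checksum(spam))   # True
print(body_checksum(forward(spam)) == body_checksum(spam))  # False
```

A forwarded copy would produce a checksum that matches nothing other recipients received, so reporting it would not help anyone.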


Last edited June 7, 2003 12:36 am
(C) Copyright 2003 Samuel Hart
This work is licensed under a Creative Commons License.