Monday, November 23, 2009

Blacklists Compared

SkyHi @ Monday, November 23, 2009
Latest Report

* 14 November 2009

Background
There are two types of DNS blacklists. The older, more common type lists IPv4 addresses as DNS domains with the octets in reverse order. If a hypothetical DNS blacklist, blacklist.example.org, were to list the loopback address 127.0.0.1, the listing could be found by looking for a DNS "A" (address) record for 1.0.0.127.blacklist.example.org. "A" records exist for listed IP addresses, and do not exist for those that are unlisted.
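
As a rough illustration of the lookup described above, the following Python sketch builds the query name for an IPv4 address and tests whether an "A" record exists. The function names `ip_query_name` and `is_listed` are hypothetical, as is the injectable `resolve` parameter; it is not the code used for the survey.

```python
import ipaddress
import socket


def ip_query_name(ip: str, zone: str) -> str:
    """Return the DNS name to query for an IPv4 address in an IP-based list.

    The octets are reversed and the blacklist zone is appended.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + zone


def is_listed(query_name: str, resolve=socket.gethostbyname) -> bool:
    """Return True if an "A" record exists for query_name, else False.

    Assumption: any resolver failure counts as "not listed". A more careful
    tool would distinguish "no such name" from timeouts and server errors.
    """
    try:
        resolve(query_name)
        return True
    except socket.gaierror:
        return False


print(ip_query_name("127.0.0.1", "blacklist.example.org"))
# prints 1.0.0.127.blacklist.example.org
```

Passing the resolver as a parameter is only a convenience so the logic can be exercised without touching the network.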

The less common type of DNS blacklist lists domains by name. In this case, a listing of the domain example.com could be found in the hypothetical list by looking for a DNS "A" record for example.com.blacklist.example.org. Again, "A" records exist for listed domains and do not exist for those that are unlisted.
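
For a domain-based list the query name is simpler, since nothing is reversed. A minimal sketch, with the hypothetical helper name `domain_query_name` and an assumed normalisation step (lowercasing and dropping a trailing dot) that the text above does not mention:

```python
def domain_query_name(domain: str, zone: str) -> str:
    """Return the DNS name to query for a domain in a name-based list."""
    # Assumed normalisation: DNS names are case-insensitive, and a trailing
    # dot would otherwise produce an empty label in the combined name.
    return domain.rstrip(".").lower() + "." + zone


print(domain_query_name("example.com", "blacklist.example.org"))
# prints example.com.blacklist.example.org
```

As with the IP-based lists, the domain is considered listed if an "A" record exists for the resulting name.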

Methodology
Public DNS blacklists are compared here using a sampling method. Each week's sample is extracted from the sendmail logs generated by SDSC's primary mail server and covers a one-week period beginning and ending each Saturday at approximately 4:05 AM US/Pacific time.

The sample of IP addresses used in a comparison report is the entire set of IP addresses that are logged as having made SMTP connections to the mail server during the week. Reverse DNS lookups are done on each of the IP addresses to get a list of domain names, and those domain names are then looked up in each of the domain name blacklists. The sample of domains used in the last comparison report is the entire set of SMTP envelope sender domains that are logged as having been presented to the mail server during the week. To maintain SDSC e-mail privacy, the lists of specific IP addresses and domains are not available to the public.
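
The reverse DNS step might look something like the sketch below. It assumes the connecting IP addresses have already been extracted from the logs (the log parsing is not shown, since the log format is not described here), and the function name `reverse_names` and the injectable `reverse` parameter are hypothetical.

```python
import socket


def reverse_names(ips, reverse=socket.gethostbyaddr):
    """Map a collection of IP addresses to the set of names found via reverse DNS.

    Addresses without a reverse (PTR) record are skipped.
    """
    names = set()
    for ip in set(ips):  # de-duplicate so each address is looked up once
        try:
            hostname, _aliases, _addresses = reverse(ip)
        except (socket.herror, socket.gaierror):
            continue  # no reverse mapping for this address
        names.add(hostname.rstrip(".").lower())
    return names
```

The resulting set of names can then be checked against each domain name blacklist in the same way as the envelope sender domains.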

For each of the IP addresses and domains in the sample lists, lookups are done in each of the DNS blacklists. Completing the survey requires millions of DNS lookups, and to finish within a week the work is done using multiple parallel threads.
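
One way to spread such lookups over parallel threads is a thread pool, sketched below in Python. This is an assumption about how the work could be organised, not a description of the survey's actual tooling; the names `survey`, `lookup`, and `workers` are hypothetical, and `lookup(entry, zone)` stands for any function that returns True when the entry is listed in the given blacklist.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def survey(entries, zones, lookup, workers=32):
    """Check every entry against every blacklist zone in parallel.

    Returns a dict mapping each zone to the set of entries listed in it.
    """
    pairs = list(product(entries, zones))  # every (entry, zone) combination
    hits = {zone: set() for zone in zones}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input pairs
        results = pool.map(lambda pair: lookup(*pair), pairs)
        for (entry, zone), listed in zip(pairs, results):
            if listed:
                hits[zone].add(entry)
    return hits
```

Threads suit this workload because each lookup spends nearly all of its time waiting on the network rather than computing.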

Conclusions
The results of this survey are not necessarily helpful for choosing a blacklist for the purpose of blocking spam. Such an evaluation would require good data on which of the thousands of e-mail messages handled by SDSC every week are truly spam and which are not, but if it were easy to automatically tell the difference we would just block the spam and not worry about blacklists. Without that data there can be no objective measure of a blacklist's effectiveness for blocking spam.

This survey also does not attempt to measure the quality of blacklists in terms of either erroneous listings (false positives) or missing entries (false negatives) with respect to each blacklist's policy. To do so would require maintaining my own lists for comparison with the other blacklists, which would take far more effort than I am willing to expend.

That said, it is my subjective opinion that most of the blacklists surveyed here are at least somewhat useful for blocking spam. There is a tendency for those with the highest number of hits to list many places that send substantial quantities of non-spam, and at the other end of the scale some blacklists (such as those listing specific ISPs) are so narrowly focused that they are good for blocking a relatively small fraction of spam with a correspondingly small probability of false positives. To the best of my knowledge all of the blacklist operators make an honest effort to maintain their listings according to their published policies, but the blacklists that cover less than 1% of the survey space seem to be too ineffective to be worthwhile. Your mileage may vary.
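
The 1% figure above is a share of the sample: the number of sampled entries a blacklist lists, divided by the sample size. A small sketch of that arithmetic follows; the function names are hypothetical and the numbers in the example are made up purely for illustration, not taken from any report.

```python
def coverage_percent(hits: int, sample_size: int) -> float:
    """Percentage of the sample that a blacklist lists."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 100.0 * hits / sample_size


def below_threshold(hits: int, sample_size: int, threshold_percent: float = 1.0) -> bool:
    """True if the blacklist covers less than threshold_percent of the sample."""
    return coverage_percent(hits, sample_size) < threshold_percent


# Illustrative numbers only: 50 hits in a sample of 10,000 is 0.5% coverage.
print(coverage_percent(50, 10_000))   # prints 0.5
print(below_threshold(50, 10_000))    # prints True
```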

For the record, SDSC uses these blacklists:

* dsn.rfc-ignorant.org
* zen.spamhaus.org
* dul.dnsbl.sorbs.net
* bl.spamcop.net

to great effect with only a few mitigating whitelist entries. A local blacklist with several thousand entries and some custom sendmail rules round out our automated spam blocking efforts.