Distributed Checksum Clearinghouses Reputations

The current version of the DCC source is version 2.3.169, March 22, 2024.

Introduction

The Distributed Checksum Clearinghouses or DCC is an anti-spam content filter that is based on a distributed database that collects real time reports about mail from a global network of servers. Each report consists of several checksums. The most important checksums are of message bodies, the IP address of the SMTP client or mail sender, bits of envelope, and so forth. Each time a DCC client sends a report to a DCC server, the server records the report (modulo data reduction/compression) and answers with how many times it has heard of each checksum in the report. A DCC client can take that answer and decide "this message is bulk because the DCC network has heard of it more than X times."
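The client-side decision described above can be sketched in a few lines of Python. This is only an illustration, not code from the DCC source: the `is_bulk` helper, the checksum labels, and the threshold values are all assumptions, as is the rule that exceeding any one threshold is enough.

```python
from typing import Dict


def is_bulk(counts: Dict[str, int], thresholds: Dict[str, int]) -> bool:
    """Return True if any reported checksum count exceeds its local threshold.

    counts:     checksum label -> how many times the server has heard of it
    thresholds: checksum label -> local limit "X" (hypothetical values)
    """
    return any(counts.get(name, 0) > limit for name, limit in thresholds.items())


# Hypothetical server answer for one message and a hypothetical local policy.
answer = {"body": 57, "ip": 3}
policy = {"body": 20}
print(is_bulk(answer, policy))
```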

DCC Reputations

A DCC server also computes reputations by counting the total number of mail messages sent from particularly active IP addresses, and the number of bulk messages each sends. The percentage of bulk mail seen from an IP address is its DCC reputation. For example, a DCC Reputation of 50% means that 50% of the mail that the global DCC Reputation network has seen from an IP address has been bulk. Because the DCC network is not omniscient, the DCC Reputation of an IP address tends to understate the probability that the next message from an IP address will be bulk.
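The arithmetic behind a reputation is simply the bulk share of the messages counted for an address. A minimal Python sketch follows; the `reputation_percent` helper is hypothetical and is not taken from the DCC source.

```python
def reputation_percent(total_msgs: int, bulk_msgs: int) -> float:
    """Percentage of bulk mail among all mail seen from one IP address."""
    if total_msgs <= 0:
        raise ValueError("no messages seen, so no reputation can be computed")
    if not 0 <= bulk_msgs <= total_msgs:
        raise ValueError("bulk count must be between 0 and the total")
    return 100.0 * bulk_msgs / total_msgs


# 100 bulk messages out of 200 seen gives a reputation of 50%.
print(reputation_percent(200, 100))  # 50.0
```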

Reputations are flooded among DCC reputation servers along with DCC checksums. Thus DCC Reputations, like DCC counts, become more reliable as more systems participate.

DCC reputation servers detect bulk mail to compute reputations using counts of DCC body checksums reported to all DCC servers in the global network of DCC servers. Mail messages rejected because of a DCC reputation are reported a second time to the global network with counts of MANY. This ensures that the messages will be rejected if sent from some other IP address to any mail system using a DCC client.

Like any reputation system, DCC reputations can have false positives. They can react more quickly than manual DNS blacklists to the appearance of new "trojan proxies" and "cracked" PHP-Nuke sites. DCC Reputations are less effective than greylisting, but many sites are unable to use greylisting. It is profitable to use both greylisting and DCC Reputations.

Mechanisms

DCC clients add the string bulk rep to X-DCC headers of mail messages that are not bulk mail but that come from IP addresses with reputations worse than a locally configured threshold. There are two thresholds for DCC reputations. Unless they are set, bulk rep is not added by DCC clients and mail is not rejected, but DCC servers accumulate data. Rep-total is the minimum number of mail messages, good as well as bulk, that must have been seen to compute a DCC reputation. This threshold is needed to avoid computing a reputation based on only a few mail messages. Rep is the percentage of bulk mail that gives a DCC Reputation for sending bulk mail.
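How the two thresholds interact can be sketched as follows. This is an illustration under stated assumptions, not the logic of dccm or dccifd: the `wants_bulk_rep` helper is hypothetical, `rep=None` models the Rep threshold not being set, "worse than" is taken to mean "at or above", and the Rep-total value in the example is illustrative rather than the DCC default.

```python
from typing import Optional


def wants_bulk_rep(msg_is_bulk: bool, total_seen: int, bulk_seen: int,
                   rep_total: int, rep: Optional[float]) -> bool:
    """Decide whether a message would be marked "bulk rep" (hypothetical helper)."""
    if rep is None:
        return False  # Rep threshold not set: nothing is marked
    if msg_is_bulk:
        return False  # only mail that is not already bulk gets "bulk rep"
    if total_seen <= 0 or total_seen < rep_total:
        return False  # too few messages seen to trust a reputation
    # Assumption: a reputation at or above the Rep percentage counts as "worse".
    return 100.0 * bulk_seen / total_seen >= rep


# 300 bulk out of 1000 seen is 30%, which is worse than a 20% threshold.
print(wants_bulk_rep(False, 1000, 300, rep_total=100, rep=20.0))
```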

Because reputations can involve more false positives, dccm and dccifd do not reject mail unless allowed by DCC-reps-on in the client whitelist file. That setting is normally in per-user whiteclnt files, but can be in the system's global /var/dcc/whiteclnt file. The proof of concept CGI scripts in the source include support for turning DCC reputations on and off and setting the Rep threshold by individual end users for their own mail.

DCC Reputations can be turned off for clients using a given client-ID by adding "no-reps" to the line for that client-ID in /var/dcc/ids on all DCC servers used by clients with that ID.

MX Servers

DCC Reputations are about IP addresses, so it is important for the system to recognize an installation's MX servers, that is, the mail systems that receive mail from the Internet and forward it internally, and not blame them for spam. Their IP addresses must be listed in the global whiteclnt file with MX or MXDCC entries.

Configuring DCC Reputations

Because any sort of a bad reputation is not a guarantee of bad behavior, rejecting, discarding, or segregating mail from IP addresses with bad reputations (not just DCC Reputations) in "junk folders" can result in false positives. So DCC Reputations must be explicitly turned on for all mailboxes that should have DCC Reputation filtering on DCC clients, or mail systems using dccproc, dccifd, or dccm. To enable DCC Reputations:

Ensure that a recent version of the DCC software is installed on the DCC server and client systems.
Whitelist IP addresses by adding lines to the DCC Reputation client system's main /var/dcc/whiteclnt file similar to:

ok ip 127.0.0.1
ok ip 10.1.2.0/24
for IP addresses that are trusted to send no unsolicited bulk email
mx ip 10.2.3.0/24
mx ip 10.2.4.1
for the IP addresses of MX secondaries or other forwarders of email that do not use DCC or DCC Reputation filtering
mxdcc ip 10.4.5.0/24
for the IP addresses of MX secondaries and other forwarders of email that use DCC or DCC Reputation filtering
submit ip 10.6.0.0/16
for the IP addresses of mail systems that are not trusted to not send spam and that cannot handle SMTP 4yz temporary rejections such as for greylisting. Personal computers using web browsers for mail user agents (MUAs) are very common examples of such systems.
Enable reputation filtering by specifying DCC Reputation thresholds and turning them on.
The default value for the Rep-total threshold rarely needs to be changed. It should usually be left unset so that changes to the default in future versions take effect. The second threshold, Rep, should be set in a whiteclnt file used for the mailboxes that should have DCC Reputation filtering.
If all mailboxes on a mail system should use DCC Reputation filtering and have the same thresholds, then lines like the following should be added to the main /var/dcc/whiteclnt file:
```
option DCC-reps-on
option threshold rep,20%
```
If the system has whiteclnt files for individual mailboxes, suitable lines should be added to them. The settings for each mailbox are derived by first applying dccm -t or dccifd -t command line settings from the REP_ARGS value in the /var/dcc/dcc_conf file, then overriding those values with settings from the main /var/dcc/whiteclnt file, and finally applying any settings found in a per-user whiteclnt file.
To choose a reputation threshold, consider the Mail with DCC Reputations graphs. The narrow band of yellow, pink, blue and red shows that most IP addresses have DCC Reputations that are either <1% or >60%. It makes little sense to use DCC Reputation filtering with a threshold below about 10% or above 60%. 40% or 50% is a conservative threshold and 20% is somewhat aggressive.
Point clients to DCC Reputation servers
Ensure that the /var/dcc/map file on all of your DCC clients (systems running dccm or dccifd) points only to DCC Reputation servers and with valid client-IDs. The public DCC servers do not answer requests from DCC Reputation clients. Something like the following can be used to add to the map file:
```
    cdcc "add dcc.example.com  RTT-1000 32768  passwd1"
    cdcc "add com-dcc.rhyolite.com      XXXXX  passwd2"
```
where 32768 and passwd1 are a local client-ID and matching password in the /var/dcc/ids file on your DCC Reputation servers and where XXXXX and passwd2 are the client-ID and password for use with the backup DCC Reputation servers. passwd2 is usually the same as the PASSWD used with /var/dcc/libexec/updatedcc.
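The order in which per-mailbox settings are derived (command line values from REP_ARGS, then the main whiteclnt file, then a per-user whiteclnt file) amounts to layered overriding. The Python sketch below models that order with plain dictionaries. The `effective_settings` helper and the keys `reps_on` and `rep` are hypothetical stand-ins for illustration, not names used by the DCC software.

```python
from typing import Dict, Optional


def effective_settings(cmdline: Dict[str, object],
                       main_whiteclnt: Dict[str, object],
                       per_user: Optional[Dict[str, object]] = None) -> Dict[str, object]:
    """Merge settings so that later sources override earlier ones."""
    merged = dict(cmdline)          # lowest priority: -t values from REP_ARGS
    merged.update(main_whiteclnt)   # main /var/dcc/whiteclnt overrides those
    if per_user:
        merged.update(per_user)     # per-user whiteclnt has the final word
    return merged


# Hypothetical values: the main file turns reputations on,
# and one mailbox chooses a more aggressive threshold.
print(effective_settings({"rep": 50, "reps_on": False},
                         {"reps_on": True},
                         {"rep": 20}))
```

A mailbox with no per-user file simply ends up with the command line values as overridden by the main whiteclnt file.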

Query the DCC Reputation Database

Contact Vernon Schryver of Rhyolite Software, LLC at vjs@rhyolite.com or use the form

script $Date: 2019/09/27 13:17:20 $