Spam server details

We have a few domains that used to be used for free email, and have been closed down for quite some time (18 months in the case of altavista.com). They have been heavily abused by spammers. I was tasked with setting up two sendmail servers to catch the mail to postmaster and bounce the rest. Two alpha servers had been doing this job in the NY data center, but that facility closed down.

Here are the details of how mail is set up on mail8 and mail9.

DNS

There are many domains we own that are not used for email. For all of these, we want to set up DNS to point to our “spam servers” so that the mail for postmaster will be saved, and the rest of the mail can be bounced. Since there is a very large volume of mail, this function needed to be kept separate from our normal mail servers for corporate, business, customer care etc.

For example the domain altavista.com has this MX record:

altavista.com.          IN      MX      10 deadmail-vip.sv.av.com.
deadmail-vip.sv.av.com. IN      A       209.73.164.45

Some of these domains already had a zone file set up, so the MX and TXT information could be edited into their files using the normal process. In addition to the 8-10 domains that have their own zone files, there are also 900+ “generic” domains that just have NS and SOA, and A records at the base and at “www”. The MX and TXT information for these is all taken from the same file, generic.zone, so that only needed to be modified in one place.

The list of “generic” domains is read when we run “make” and converted into named.conf files for the primary and secondary DNS servers. Here is an example.

zone "alsavista.com" in { type master; file "av-ns/generic.zone"; };   # in primary file
zone "alsavista.com" in { type slave; file "secondary/alsavista.com"; masters { 209.73.164.34; }; };  # in secondary file
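The conversion step can be sketched in a few lines of shell. This is only an illustration, not the actual make rule: the function names and the one-domain-per-line input format are my assumptions, while the zone statements and the master IP are copied from the example above.

```shell
#!/bin/sh
# Sketch: turn a list of "generic" domains (one per line on stdin) into
# named.conf zone statements. Function names and input format are assumed.

MASTER_IP="209.73.164.34"

# Emit one primary-side zone statement per domain.
primary_zones() {
    while read -r domain; do
        # Skip blank lines and comment lines in the list.
        case "$domain" in ''|'#'*) continue ;; esac
        printf 'zone "%s" in { type master; file "av-ns/generic.zone"; };\n' "$domain"
    done
}

# Emit one secondary-side zone statement per domain.
secondary_zones() {
    while read -r domain; do
        case "$domain" in ''|'#'*) continue ;; esac
        printf 'zone "%s" in { type slave; file "secondary/%s"; masters { %s; }; };\n' \
            "$domain" "$domain" "$MASTER_IP"
    done
}
```

With a hypothetical list file, usage would look like `primary_zones < generic-domains.txt > named.conf.primary`.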

SPF

We also want the DNS TXT records to communicate to other mail administrators that the domain is not used for email and the messages claiming to be from the domain are forged. We use an SPF TXT record to do this, along with human-readable text to explain and inform.

altavista.com.          600     IN      TXT     "This domain sends no email"
altavista.com.          600     IN      TXT     "Null SPF is for tracking purposes only"
altavista.com.          600     IN      TXT     "All mail claiming to be from altavista.com is forged"
altavista.com.          600     IN      TXT     "v=spf1 +exists:CL.%{i}.FR.%{s}.HE.%{h}.null.spf.altavista.com -all"

In addition to informing SPF-aware servers that the mail should be bounced, the SPF record also provides a way to track who is trying to forge our domain name in email (provided the receiving server understands SPF). The zone “spf.altavista.com” has been directed to a single DNS server with logging turned on. (This is not immediately relevant to the “spam server” but provides more interesting data.)
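To see what ends up in the DNS log, it helps to expand the SPF macros by hand. The sketch below builds the name an SPF-aware receiver would look up for the “exists” mechanism, assuming an IPv4 client; per RFC 7208, %{i} is the client IP, %{s} the envelope sender and %{h} the HELO name. The function name and the sample values are made up.

```shell
#!/bin/sh
# Sketch: expand the macros in
#   exists:CL.%{i}.FR.%{s}.HE.%{h}.null.spf.altavista.com
# for one incoming message (IPv4 only; function name is hypothetical).
spf_exists_name() {
    client_ip=$1    # %{i}: IP address of the connecting client
    sender=$2       # %{s}: envelope sender, inserted as-is
    helo=$3         # %{h}: name given in HELO/EHLO
    printf 'CL.%s.FR.%s.HE.%s.null.spf.altavista.com\n' \
        "$client_ip" "$sender" "$helo"
}
```

For example, `spf_exists_name 192.0.2.10 bob@altavista.com mx.example.net` prints the query name that would appear in the log for that forged message.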

Load balancing/VIP

The VIP is set up with Big/IP (F5) and listens on port 25. Big/IP checks port 25 on each mail server and only sends traffic to the servers that are up.

If there are too many connections, the default is to send an alert via syslog saying “No nodes available for VIP.” In our case, we expect the server to get overloaded, so we allow no more than 1000 connections at once – 500 to each server. I defined another “server” at a lower priority so that any traffic over the 1000 simultaneous sessions allowed is directed to port 9999 of one of the servers – a quick and dirty way to give them “connection refused” with no errors generated. (Currently we are running at capacity 24×7 – spammers use up all 1000 simultaneous connections and the “connection refused” is given from 40% to 60% of the time. :)

Sendmail

For this project, Sendmail was chosen as the mailer program, because it was quick to set up and I was already familiar with it. It is not the best in terms of performance, but that was not a primary concern on this project. (If I were to create a similar server in the future I would probably recommend either Postfix or Qmail and perhaps run some performance tests with both.)

Sendmail was installed via RPM from a RedHat update. Additional components need adjusting: see Virtusertable, Aliases, Access, and MC/CF.

Virtusertable

The virtusertable needs to list all domains that will have their mail directed here. Since we already had a method of adding “generic” domains in a hurry, we used this list to add domains to the virtusertable as well. For each generic domain, two lines are added in virtusertable.appendix similar to:

postmaster@alsavista.com postmaster@av.com
@alsavista.com 554 This domain is not used for email
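Generating the appendix from the domain list might look roughly like this. It is a sketch only: the function name and the input format (one domain per line) are assumptions, and the two output lines simply mirror the example above.

```shell
#!/bin/sh
# Sketch: emit the two virtusertable lines for each generic domain read
# from stdin. Function name and input format are assumed.
virtuser_appendix() {
    while read -r domain; do
        # Skip blank lines and comment lines in the list.
        case "$domain" in ''|'#'*) continue ;; esac
        printf 'postmaster@%s postmaster@av.com\n' "$domain"
        printf '@%s 554 This domain is not used for email\n' "$domain"
    done
}
```

The output would be redirected to virtusertable.appendix, for example `virtuser_appendix < generic-domains.txt > virtusertable.appendix` (the list file name is hypothetical).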

The regular “virtusertable” is edited in the normal way and checked out/in by RCS. This is actually the same virtusertable we use for production/corporate mail. For the 8-10 domains that are not used for email, but are not on the generic DNS config, the appropriate lines are added to the regular virtusertable. If you want to save any other mail besides “postmaster” for these domains, this is the place to add those redirects.

Aliases

For our application, the aliases file does not need to be very large. An alias is needed only for an entry that expands to multiple addresses; an entry that redirects to a single address can be handled in virtusertable alone.

For this purpose we used the same config as the main corporate mail servers. (This leaves us the option of combining the functions of both servers later — the config and data files are the same, only the DNS information is changed to direct the mail to different servers.)

Access – Spfilter

spfilter was chosen as a quick, easy method of getting a local copy of several spam blocklists. The update runs twice a day from crontab on each mail machine. It uses rsync and/or http to get the appropriate sources, combine them into Sendmail access format, and remove/combine duplicates.

After fetching the updates, the output file is combined with our own access file (where we can identify local IP ranges that are permitted to relay, or add our own block entries) to create access.db.

spfilter is installed by downloading its Makefile and running “make all”:

# mkdir /usr/local/adm/spfilter/
# cd /usr/local/adm/spfilter/
# wget "http://spfilter.sourceforge.net/spfilter-0.59/Makefile"
# make all

We also need a couple of Perl modules to make it work right:

# perl -MCPAN -e shell
cpan> install XML::Simple
cpan> install XML::Parser
cpan> exit

Finally, I set up a quick shell script to call spfilter and update access.db:

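#!/bin/sh
# Fetch the block lists and rebuild access.db from them plus our own access file.
umask 077
cd /local/spfilter || exit 14
perl ./spfilter.pl -f sendmail SPAM_ALL,RELAY_ALL,DYNAMIC_ALL || exit 15
# Our own entries come first; build the map in outdir, then copy it into place.
cat /etc/mail/access outdir/SPFILTER.sendmail \
        | makemap hash outdir/SPFILTER.sendmail \
        && cp outdir/SPFILTER.sendmail.db /etc/mail/access.db
/sbin/service sendmail restart
/sbin/service sendmail-catchall restart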

After the script runs once successfully, it is scheduled to be run twice daily. Repeat the install process on the other server as well.

Alter the Makefile in /etc/mail so that if we modify access, the same build is repeated. Our changes to access will take place right away without losing the info from spfilter.

access.db : access /local/spfilter/outdir/SPFILTER.sendmail
        cat access /local/spfilter/outdir/SPFILTER.sendmail | makemap hash access
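Because the two files are simply concatenated, it can be useful to preview which entry wins when the same key appears in both. The sketch below assumes makemap's default behaviour as I understand it (keys are folded to lower case, and a repeated key is reported and not inserted unless -r is given), so the first file listed takes priority. The function name is made up.

```shell
#!/bin/sh
# Sketch: print the entries that would survive when several access-format
# files are concatenated, highest priority first. Assumes "first key wins"
# and case-insensitive keys, which is my reading of makemap's defaults.
merge_access() {
    cat "$@" | awk '
        /^[ \t]*(#|$)/ { next }         # skip comments and blank lines
        {
            key = tolower($1)
            if (!(key in seen)) {       # keep only the first occurrence
                seen[key] = 1
                print
            }
        }'
}
```

Running `merge_access /etc/mail/access /local/spfilter/outdir/SPFILTER.sendmail | less` shows the effective list without touching access.db.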

MC/CF

The normal RedHat mc file is probably sufficient, but I have added a couple things to optimize it for our purposes. Edit the mc file appropriately and run it through m4 to produce the cf file.

Some things we added:

dnl # This makes it easier to make the CF so you don't have to type this to m4 each time.
include(`../m4/cf.m4')

dnl # No message should have more than 4 recipients (since only postmaster is valid anyway)
define(`confMAX_RCPTS_PER_MESSAGE', `4')dnl
dnl # If they make more than four bad guesses, delay 1 sec for each RCPT command after 4
define(`confBAD_RCPT_THROTTLE', `4')dnl

dnl # Load average should never get up this high due to throttling at the load balancer. If it does, we want to refuse rather than queue-only
define(`confQUEUE_LA', `100')dnl
define(`confREFUSE_LA', `40')dnl

dnl # delay_checks can be used to white list certain cases, such as allowing mail to postmaster even if other mail is blocked,
dnl # or to get more information from the spammer before cutting him off
dnl # In this case we have it turned off because it improves performance to cut the connection ASAP.
dnl FEATURE(`delay_checks')

dnl # LOCAL_CONFIG defines variables we will use in local rule sets, and some other stuff.
LOCAL_CONFIG
# Process simple arithmetic
Karith arith
# Allow for storing values parsed in one place, to be
# retrieved in another
Kstorage macro
# Allow for extra lines to /var/log/maillog
Ksyslog syslog

dnl # LOCAL_RULESETS are for extra sendmail tweaks if needed. In our case we reject hosts
dnl # – with no PTR for their IP,
dnl # – bad PTR (meaning it doesn’t exist forward, or doesn’t resolve back to client IP)
dnl # – temp failure looking up PTR or verifying it (like specifying a non-existent DNS server)

LOCAL_RULESETS

dnl # Local_check_rcpt is most useful when delay_checks is on. This line logs all connections that get to the RCPT stage.
dnl # Commented because we have delay_checks off and don’t want to add extra stuff to the log file
SLocal_check_rcpt
#R$* $(syslog "check_rcpt: rdns="$" ip="$" forgedmailfrom="$$) $1

dnl # Local_check_relay is a good place to check for Reverse-IP because it happens early in the game.
SLocal_check_relay
# check_relay is passed $&{client_name} $| $&{client_addr}
# So, first, we turn it into parts
R$* $| $* $: $1 $| $2 $| $&{client_resolve} $| $| $&_
R$* $| $* $| $* $| $| $* @ $* [$*] $: $1 $| $2 $| $3 $| $4 $| $5 $(storage {client_ident} $4 $) $(storage {client_dnsname} $5 $)
# Now we have confirmed client name before first $|
# client address before 2nd $|
# OK FAIL TEMP or FORGED before 3rd $|
# The ident name (if any), before the 4th $|
# and trailing, the rDNS name, whether it’s FCrDNS or not
##R$* $| $* $| $* $| $* $| $* $: $(syslog "rdns info fcrdns=" $1 " ip=" $2 " status=" $3 " ident=" $4 " rdns=" $5 $) $1 $| $2 $| $3 $| $4 $| $5

# Now we can match on either Ident (if avail) or rdns status (OK FAIL TEMP or FORGED)
R$* $| $* $| $* $| CacheFlowServer $| $* $#error $@ 5.7.1 $: "554 PROXYIDENT Your ident " $4 "indicates proxy abuse"
R$* $| $* $| $* $| squid $| $* $#error $@ 5.7.1 $: "554 PROXYIDENT Your ident " $4 "indicates proxy abuse"
R$* $| $* $| TEMP $| $* $#error $@ 4.7.1 $: "454 TEMPrDNS Failure checking PTR record for " $1
R$* $| $* $| FAIL $| $* $#error $@ 5.7.1 $: "554 NOPTR No PTR record exists for " $1 " please fix reverse DNS"
R$* $| $* $| FORGED $| $* $| $* $#error $@ 5.7.1 $: "554 BADPTR PTR record for " $1 " must forward resolve back to same IP"

Extra “spam trap” config

We could have stopped here to get a perfectly acceptable relaying mail server. The extra configuration below allows me to “trap” the incoming mail that would normally get “User unknown”: instead of bouncing, we deliver the mail to procmail and spamassassin for further analysis. The remaining components are just for the “spam trap” part of things.

Virtusertable

Another sendmail process is running on port 2525. Currently it is set to start manually only – if the machine is rebooted and nobody starts the alternate sendmail on 2525 then the catchall will not get any traffic. For our purposes this is acceptable; if I am not taking care of the server personally, there will probably be no need to trap the spam.

A different virtusertable is needed: instead of returning the standard “User unknown” message, we accept the mail and deliver it to local user “admmail”.

postmaster@alsavista.com postmaster@av.com
@alsavista.com catch-all

And in aliases:
catch-all: admmail

I modified the MC file for the second sendmail process, so that it listens on 2525, and doesn’t listen on 25 or 587, and uses a different pid file. And, most important, it uses the new virtusertable:
define(`confPID_FILE', `/var/run/sendmail-catchall.pid')dnl
FEATURE(`no_default_msa')
DAEMON_OPTIONS(`Port=2525, Name=MTA-2525, M=E')
FEATURE(`virtusertable', `hash /etc/mail/virtusertable.catchall')

I also created an init script that uses the right pid file and config by modifying these lines:

prog="sendmail-catchall"
        echo -n $"Starting $prog: "
        daemon --check=$prog /usr/sbin/sendmail -bd -C /etc/mail/sendmail-catchall.cf
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/sendmail-catchall
...
        echo -n $"Shutting down $prog: "
        killproc sendmail-catchall

Spamassassin

I installed Spamassassin on each box via RPM, but this can also be done with CPAN.

I modified the Spamassassin start script (actually, I just created the data file that it already checks for):
/etc/sysconfig/spamassassin

SPAMDOPTIONS="-d --username=admmail --socketpath=/tmp/spamd-socket  "

spamc is used in procmail to open a connection to an already-running spamd. Spamd runs more processes by itself if needed, but in our case there will never be more than 10 messages processed at one time.

I also added the following to user admmail:
~admmail/.spamassassin/user_prefs
required_hits 5
rewrite_subject 0
report_header 0
use_terse_report 1
skip_rbl_checks 1
report_safe 1
use_bayes 0
bayes_auto_learn 0
use_auto_whitelist 0

Procmail

Local user “admmail” is added, and its home directory is symlinked to elsewhere, where there is space:
/udir/admmail -> /local/junkmail

.procmailrc is created in that directory. Our goal here is to take a small percentage of the incoming load (whatever comes to port 2525) and check it for spam, and log it.

MAILDIR=/local/junkmail/junk/
TODAY=`date +%Y-%m-%d`
LOGFILE=/local/junkmail/junk/procmail.log.$TODAY

:0
* Return-Path: <MAILER-DAEMON@
mailer-daemon-$TODAY

:0
* > 256000
/dev/null

SOURCE=`formail -x Received: -c | grep mail8 | head -1 | sed -e 's/.*from \([^(]*\) (\(.*\)).*by mail8.pv.sv.av.com.*/\2 HELO: \1/' `
SOURCEIP=`formail -x Received: -c | grep mail8 | head -1 | sed -e 's/.*from [^(]* (.*\[\(.*\)\].*).*by mail8.pv.sv.av.com.*/\1/' `

:0fw
| /usr/bin/spamc -t 90 -U /tmp/spamd-socket

:0
* X-Spam-Flag: YES
{
  LOG="SPAM FROM IP: $SOURCE"
  MKDIR=`mkdir -p spam/$TODAY`

  :0
  spam/$TODAY/ip.$SOURCEIP
}

:0
unknown-$TODAY

Create the junk directory with mkdir (or whatever directory MAILDIR is set to above). Log files and mailbox files will be created in that directory automatically.
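The Received: parsing is the fragile part of this recipe, so it is worth trying it outside procmail first. Below is a minimal sketch that applies the same style of sed expression to a single header value on stdin. The function names are made up, and it assumes the usual sendmail layout "from HELO (rdns [ip]) by mail8.pv.sv.av.com ...".

```shell
#!/bin/sh
# Sketch: extract the connecting IP, and an "rdns [ip] HELO: name" summary,
# from one Received: header value read on stdin. Function names are
# hypothetical; the expressions follow the ones used in .procmailrc.
received_ip() {
    sed -e 's/.*from [^(]* (.*\[\(.*\)].*).*by mail8\.pv\.sv\.av\.com.*/\1/'
}

received_source() {
    sed -e 's/.*from \([^(]*\) (\(.*\)).*by mail8\.pv\.sv\.av\.com.*/\2 HELO: \1/'
}
```

For an invented header such as `from mx.example.net (host1.example.net [192.0.2.10]) by mail8.pv.sv.av.com with ESMTP id h1B0abc`, piping it through received_ip should give just the address, and received_source the rDNS name, the address and the HELO name.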

Analysis scripts

A script called “breakdown.csh” shows the summary of messages sent to the spam trap.

Yesterday:
      4   Folder: /dev/null
   2896   Folder: mailer-daemon-2004-02-10
   2973   Folder: unknown-2004-02-10
   1726   Folder: spam/2004-02-10/
   7600

Today:
     42   Folder: mailer-daemon-2004-02-11
     19   Folder: spam/2004-02-11/ 
     37   Folder: unknown-2004-02-11
     98
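A summary like this can be produced from the procmail log with standard tools. The following is a rough sketch rather than the actual breakdown.csh: it assumes the usual procmail log lines of the form "Folder: name size", and it collapses the per-IP spam folders into one line per day. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: count deliveries per folder in a procmail log file ($1) and
# print a total. Per-IP spam folders (spam/DATE/ip.ADDR) are merged into
# spam/DATE/. Folder names containing spaces are not handled.
folder_breakdown() {
    awk '$1 == "Folder:" {
             folder = $2
             sub(/ip\..*$/, "", folder)
             print folder
         }' "$1" |
        LC_ALL=C sort | uniq -c |
        awk '{ printf "%7d   Folder: %s\n", $1, $2; total += $1 }
             END { printf "%7d\n", total }'
}
```

Usage would be something like `folder_breakdown /local/junkmail/junk/procmail.log.2004-02-10`.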

A script called “top-ips.csh” shows the sources of verified spam by source IP address.

mail8# top-ips.csh 
      2 rrcs-nys-24-97-21-194.biz.rr.com [24.97.21.194]
      2 mx022.wizardnation.com [69.1.234.152]
      2 dsl-0-198.sg-b.tiscali.no [82.164.0.198]
      2 ds-20.syndicatesales.biz [205.252.101.202]
      2 ds-10.syndicatesales.biz [205.252.102.101]
      1 zeus.royaume.com [207.253.118.5]
      1 pilot11.cl.msu.edu [35.9.5.31]

A script called “bl-stats” shows the last 10000 rejects (or whatever number you type) and summarizes the reject cause for them.

mail8# bl-lastN.csh 10000
Last 10134 rejects
   3630 ADSL/CABLE/DIALUP
   3318 NOPTR
    986 TEMPrDNS
    886 BADPTR
    607 SBL
    197 SPAM
    178 SORBS
     93 PROXY
     60 BULK
     50 Domain of sender address doesn't resolve
     42 BLOCK
     28 SPEWS
     23 We don't accept mail from spammers (local block list)
     20 PINK
      8 HOST
      3 Relaying denied
      3 Flonetwork
      1 spamsites.org
      1 BOGON

Most of these are block lists. NOPTR (3318), TEMPrDNS (986) and BADPTR (886) are reverse DNS results. A couple of entries are sendmail defaults, such as “Relaying denied” or “Domain of sender address doesn’t resolve”.
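For anyone who wants a similar report, here is a simplified sketch of the idea, not the script itself. It assumes sendmail's usual "reject=CODE [D.S.D] text" field in maillog and tallies only the first word of the reject text, so multi-word reasons are cut short. The function name is made up.

```shell
#!/bin/sh
# Sketch: summarise the first word of the reject reason for the last N
# reject lines in a maillog. $1 = log file, $2 = N (default 10000).
reject_summary() {
    grep 'reject=' "$1" | tail -n "${2:-10000}" |
        sed -e 's/.*reject=[0-9][0-9]* \([0-9]\.[0-9]\.[0-9] \)\{0,1\}//' \
            -e 's/ .*//' |
        LC_ALL=C sort | uniq -c | sort -rn
}
```

Called as `reject_summary /var/log/maillog 10000`, it prints one line per tag with its count, most frequent first.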

End of summary. Feedback welcome to gconnor@av.com.
