After building up a catalog of 951 good messages and 388 spam messages, Bayes is getting quite good at categorizing my mail.
The catalog comes from a set of 1339 messages that I classified by hand.
neko-base> foreach j (BAYES_{01,10,20,30,40,50,60,70,80,90})
foreach? echo -n $j
foreach? egrep $j mail/track-spam|wc -l
foreach? end
BAYES_01 2
BAYES_10 0
BAYES_20 2
BAYES_30 6
BAYES_40 0
BAYES_50 0
BAYES_60 22
BAYES_70 32
BAYES_80 31
BAYES_90 88
neko-base> foreach j ( BAYES_{01,10,20,30,40,50,60,70,80,90} )
foreach? echo -n $j ; egrep $j mail/track-good|wc -l
foreach? end
BAYES_01 64
BAYES_10 12
BAYES_20 11
BAYES_30 8
BAYES_40 0
BAYES_50 0
BAYES_60 0
BAYES_70 0
BAYES_80 0
BAYES_90 0
neko-base> grep -c BAYES mail/track-*
mail/track-good:101
mail/track-spam:183
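The loops above are tcsh. For anyone using a Bourne-style shell, here is a rough sketch of the same tally in POSIX sh. The function name bayes_counts is my own invention for illustration. It uses grep -c where the session used egrep piped to wc -l, which should give the same line counts.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a tracking file that mention
# each BAYES_NN rule. POSIX sh has no brace expansion, so the buckets
# are listed explicitly.
bayes_counts() {
    file=$1
    for n in 01 10 20 30 40 50 60 70 80 90; do
        # grep -c prints the number of matching lines (0 if none).
        printf 'BAYES_%s %s\n' "$n" "$(grep -c "BAYES_$n" "$file")"
    done
}

# Usage, with the file names from the session above:
#   bayes_counts mail/track-spam
#   bayes_counts mail/track-good
```

Like the original loop, this counts matching lines rather than messages, so a message whose headers mention a rule on two lines would be counted twice.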
Overall, here is the distribution of actual SpamAssassin scores. (SpamAssassin was not run on all 1339 messages. :)
neko-base> egrep "^X-Spam-Level:" mail/track-good | sort | uniq -dc ; egrep "^X-Spam-Level:" mail/track-good | wc -l
108 X-Spam-Level:
4 X-Spam-Level: *
6 X-Spam-Level: **
3 X-Spam-Level: ***
2 X-Spam-Level: ****
123
neko-base> egrep "^X-Spam-Level:" mail/track-spam | sort | uniq -dc ; egrep "^X-Spam-Level:" mail/track-spam | wc -l
3 X-Spam-Level:
3 X-Spam-Level: *
6 X-Spam-Level: **
9 X-Spam-Level: ***
6 X-Spam-Level: ****
13 X-Spam-Level: *****
5 X-Spam-Level: ******
17 X-Spam-Level: *******
7 X-Spam-Level: ********
16 X-Spam-Level: *********
8 X-Spam-Level: **********
7 X-Spam-Level: ***********
9 X-Spam-Level: ************
7 X-Spam-Level: *************
8 X-Spam-Level: **************
9 X-Spam-Level: ***************
8 X-Spam-Level: ****************
5 X-Spam-Level: *****************
7 X-Spam-Level: ******************
6 X-Spam-Level: *******************
3 X-Spam-Level: ********************
8 X-Spam-Level: **********************
2 X-Spam-Level: ***********************
2 X-Spam-Level: ************************
178
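One thing to keep in mind when reading these numbers: uniq -d prints only lines that occur more than once, so any level seen exactly once is dropped. That is presumably why the per-level counts for track-spam add up to 174 while wc -l reports 178. Below is a rough sketch of a tally that keeps every level and prints the number of stars instead of the stars themselves. The function name spam_level_histogram is hypothetical, and the sketch assumes the header looks exactly as in the session above.

```shell
#!/bin/sh
# Hypothetical helper: tally X-Spam-Level headers by star count,
# including levels that occur only once (which uniq -d would drop).
spam_level_histogram() {
    grep '^X-Spam-Level:' "$1" |
    awk '{ n = length($2); count[n]++ }   # $2 is the run of stars; empty means level 0
         END { for (n in count) print n, count[n] }' |
    sort -n
}

# Usage, with the file names from the session above:
#   spam_level_histogram mail/track-good
#   spam_level_histogram mail/track-spam
```

Each output line is a star count followed by the number of headers at that level, sorted by level.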