MailScanner + Bayes on SQL
Kai Schaetzl
2006-05-15 05:15:13 UTC
Is this at all possible with MailScanner? I followed the instructions on
the spamassassin wiki and migrated a Bayes database to MySQL. When I test
via spamassassin --lint I get this output:

[12153] dbg: bayes: using username: root
[12153] dbg: bayes: database connection established
[12153] dbg: bayes: found bayes db version 3
[12153] dbg: bayes: unable to initialize database for root user, aborting!

This doesn't happen with the system-wide setting of bayes_path; the user
doesn't matter in that case. How can I do this when using MailScanner?
Is it this configuration variable?

bayes_sql_override_username someusername
From the description I'm not 100% sure.
http://wiki.apache.org/spamassassin/BetterDocumentation/SqlReadme

Also, if that is the correct way to do it, what username do I use?


Kai
--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Dhawal Doshy
2006-05-15 12:39:50 UTC
Post by Kai Schaetzl
Is this at all possible with MailScanner? I followed the instructions on
the spamassassin wiki and migrated a Bayes database to MySQL. When I test
[12153] dbg: bayes: using username: root
[12153] dbg: bayes: database connection established
[12153] dbg: bayes: found bayes db version 3
[12153] dbg: bayes: unable to initialize database for root user, aborting!
This doesn't happen with the system-wide setting of bayes_path; the user
doesn't matter in that case. How can I do this when using MailScanner?
Is it this configuration variable?
bayes_sql_override_username someusername
precisely.. See,
http://wiki.mailscanner.info/doku.php?id=documentation:anti_spam:spamassassin:bayes:sql
Post by Kai Schaetzl
From the description I'm not 100% sure.
http://wiki.apache.org/spamassassin/BetterDocumentation/SqlReadme
Also, if that is the correct way to do it, what username do I use?
mysql> SELECT id, username, spam_count, ham_count, token_count FROM
bayes_vars;

- dhawal
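
For readers following along, the setup discussed above can be sketched as a SpamAssassin configuration fragment. This is a minimal sketch, not the exact contents of the MailScanner wiki page: the database name (sa_bayes), the MySQL account (sa_user), and the password are placeholders, and placing the lines in the SpamAssassin prefs file that MailScanner reads is an assumption about the local install. The value given to bayes_sql_override_username should match the username column returned by the bayes_vars query above ("root" in this thread).

```text
# Assumed placeholders: sa_bayes (database), sa_user / CHANGE_ME (MySQL login).

# Keep Bayes tokens in MySQL instead of DBM files under bayes_path.
bayes_store_module           Mail::SpamAssassin::BayesStore::MySQL

# DSN format: DBI:driver:database:hostname
bayes_sql_dsn                DBI:mysql:sa_bayes:localhost
bayes_sql_username           sa_user
bayes_sql_password           CHANGE_ME

# Always use this Bayes owner, whichever system user runs the scan,
# so every process reads and writes the same row in bayes_vars.
bayes_sql_override_username  root
```

Running the lint check from the first post again with debug output enabled should then show whether the database initializes for the overridden username.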
Kai Schaetzl
2006-05-16 04:31:34 UTC
Post by Dhawal Doshy
precisely.. See,
http://wiki.mailscanner.info/doku.php?id=documentation:anti_spam:spamassassin:bayes:sql
Ah, thanks, seems I read the wrong wiki :-)
Post by Dhawal Doshy
mysql> SELECT id, username, spam_count, ham_count, token_count FROM
bayes_vars;
Seems to be the one that's also proposed in the wiki: root.

I'm still waiting for the --restore to finish; I've got quite a few tokens. One caveat
I've already noticed is that storing it in MySQL takes much more space, maybe three times
as much as with dbm. The indexes take a lot.


Kai
--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Dhawal Doshy
2006-05-16 18:16:11 UTC
Post by Kai Schaetzl
Post by Dhawal Doshy
precisely.. See,
http://wiki.mailscanner.info/doku.php?id=documentation:anti_spam:spamassassin:bayes:sql
Ah, thanks, seems I read the wrong wiki :-)
Post by Dhawal Doshy
mysql> SELECT id, username, spam_count, ham_count, token_count FROM
bayes_vars;
Seems to be the one that's also proposed in the wiki: root.
I'm still waiting for the --restore to finish; I've got quite a few tokens. One caveat
I've already noticed is that storing it in MySQL takes much more space, maybe three times
as much as with dbm. The indexes take a lot.
Yes, but disk is cheap.. comparing MySQL (innodb) with DBM: scanning and
expiry are way faster, forgets are slower and learning is more or less
as fast/slow as for DBM.

See these for more details..
http://wiki.apache.org/spamassassin/BayesBenchmark
http://wiki.apache.org/spamassassin/BayesBenchmarkResults

Plus SQL will let you share Bayes across multiple front-end MX servers
and permission errors are a thing of the past..

- dhawal
Kai Schaetzl
2006-05-16 19:51:24 UTC
Post by Dhawal Doshy
Yes, but disk is cheap.. comparing MySQL (innodb) with DBM: scanning and
expiry are way faster, forgets are slower and learning is more or less
as fast/slow as for DBM.
Yeah, that's why I wanted to change.
Post by Dhawal Doshy
See these for more details..
http://wiki.apache.org/spamassassin/BayesBenchmark
http://wiki.apache.org/spamassassin/BayesBenchmarkResults
Plus SQL will let you share Bayes across multiple front-end MX servers
and permission errors are a thing of the past..
Sharing is only feasible for a few of my servers, but, yes, it's a bonus if
you need it.

It seems you don't need bayes_sql_override_username when you back up;
it's only needed when you read the data in again. I used the backup.txt I
had made on another machine (with Bayes on dbm) without
bayes_sql_override_username, then restored it on the machine with the
test setup, where bayes_sql_override_username is set. This took quite long,
since the machine isn't the fastest and there were around 2 million tokens.
Bayes works.

Kai
--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com