Yes Jerry, quite true.
stability... As said, this was done quite a few years ago.
place though... The dominant factor, vis-a-vis performance, is most likely
will be dealt with nicely by the SA results cache.
that gives me, when I get the time:-).
Post by Jerry Benton
"The reason is very simple. When you insert a row into MyISAM, it just
puts it into the server's memory and hopes that the server will flush it to
disk at some point in the future. Good luck if the server crashes.
When you insert a row into InnoDB it syncs the transaction durably to
disk, and that requires it to wait for the disk to spin. Do the math on
your system and see how long that takes.
You can improve this by relaxing innodb_flush_log_at_trx_commit or by
batching rows within a transaction instead of doing one transaction per
row."
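The batching advice in the quote can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL (an assumption, so the example runs without a server); the `maillog` table, its two columns and the batch size of 500 are hypothetical. It only shows the commit pattern, not real InnoDB timings.

```python
import sqlite3


def make_db():
    # In-memory SQLite database standing in for the real MySQL server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE maillog (id TEXT NOT NULL, size INTEGER)")
    return conn


def insert_per_row(conn, rows):
    """One transaction per row: every insert pays for its own commit."""
    commits = 0
    for row in rows:
        conn.execute("INSERT INTO maillog (id, size) VALUES (?, ?)", row)
        conn.commit()
        commits += 1
    return commits


def insert_batched(conn, rows, batch_size=500):
    """One transaction per batch: the commit cost is shared by many rows."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO maillog (id, size) VALUES (?, ?)", batch)
        conn.commit()
        commits += 1
    return commits


if __name__ == "__main__":
    rows = [(f"msg-{i}", i) for i in range(1200)]
    print("per-row commits:", insert_per_row(make_db(), rows))
    print("batched commits:", insert_batched(make_db(), rows))
```

With 1200 rows the first function commits 1200 times and the second 3 times. On InnoDB with the default flush setting, each of those commits is where the wait for the disk happens.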
In short, MyISAM is faster for inserts but InnoDB is more reliable. All of
that ACID compliance and transaction rollback comes with an overhead cost.
InnoDB also provides row-level locking instead of table-level locking like
MyISAM, and InnoDB can automatically recover from crashes. So, if you want
reliability over performance, go with InnoDB. If you want faster inserts
and quite often faster search results, go with MyISAM.
These are mail logs and not bank records. But I suppose the level of importance is relative.
-
Jerry Benton
www.mailborder.com
Caveat: You should partition the database by time. This is the Mailborder
cp_maillog, which is slightly different than MailWatch, but the bit near
the end is what you are looking for. You can adapt it for your table with
an alter statement.
CREATE TABLE IF NOT EXISTS `cp_maillog` (
`db_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`timestamp` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`id` varchar(30) NOT NULL,
`size` bigint(20) DEFAULT '0',
`from_address` varchar(255) DEFAULT NULL,
`from_domain` varchar(255) DEFAULT NULL,
`to_address` varchar(255) DEFAULT NULL,
`to_domain` varchar(255) DEFAULT NULL,
`subject` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`clientip` varchar(15) DEFAULT NULL,
`archive` varchar(100) DEFAULT NULL,
`isspam` tinyint(1) DEFAULT '0',
`ishighspam` tinyint(1) DEFAULT '0',
`issaspam` tinyint(1) DEFAULT '0',
`isrblspam` tinyint(1) DEFAULT '0',
`spamwhitelisted` tinyint(1) DEFAULT '0',
`spamblacklisted` tinyint(1) DEFAULT '0',
`sascore` decimal(7,2) DEFAULT '0.00',
`spamreport` text,
`virusinfected` tinyint(1) DEFAULT '0',
`nameinfected` tinyint(1) DEFAULT '0',
`sizeinfected` tinyint(1) DEFAULT '0',
`otherinfected` tinyint(1) DEFAULT '0',
`report` text,
`ismcp` tinyint(1) DEFAULT '0',
`ishighmcp` tinyint(1) DEFAULT '0',
`issamcp` tinyint(1) DEFAULT '0',
`mcpwhitelisted` tinyint(1) DEFAULT '0',
`mcpblacklisted` tinyint(1) DEFAULT '0',
`mcpsascore` decimal(7,2) DEFAULT '0.00',
`mcpreport` text,
`hostname` varchar(100) DEFAULT NULL,
`date` date NOT NULL DEFAULT '0000-00-00',
`time` time DEFAULT NULL,
`headers` text,
`quarantined` tinyint(1) DEFAULT '0',
`released` tinyint(1) DEFAULT '0',
`guid` varchar(40) NOT NULL,
PRIMARY KEY (`db_id`,`date`),
KEY `id` (`id`),
KEY `timestamp` (`timestamp`),
KEY `from_address` (`from_address`),
KEY `from_domain` (`from_domain`),
KEY `to_address` (`to_address`),
KEY `to_domain` (`to_domain`),
KEY `guid` (`guid`),
KEY `isspam` (`isspam`),
KEY `ishighspam` (`ishighspam`),
KEY `issaspam` (`issaspam`),
KEY `isrblspam` (`isrblspam`),
KEY `spamwhitelisted` (`spamwhitelisted`),
KEY `spamblacklisted` (`spamblacklisted`),
KEY `virusinfected` (`virusinfected`),
KEY `nameinfected` (`nameinfected`),
KEY `otherinfected` (`otherinfected`),
KEY `quarantined` (`quarantined`),
KEY `sizeinfected` (`sizeinfected`),
KEY `ismcp` (`ismcp`),
KEY `ishighmcp` (`ishighmcp`),
KEY `issamcp` (`issamcp`),
KEY `mcpwhitelisted` (`mcpwhitelisted`),
KEY `mcpblacklisted` (`mcpblacklisted`),
KEY `released` (`released`),
KEY `size` (`size`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PARTITION BY HASH (( YEAR(`date`) +
MONTH(`date`) )) PARTITIONS 70;
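To see where a given row lands under the PARTITION BY HASH clause above: for plain HASH partitioning MySQL uses the value of the expression modulo the number of partitions. The helper below is a small sketch of that arithmetic; `hash_partition` is a name made up for this illustration, not a MySQL function.

```python
from datetime import date


def hash_partition(d: date, partitions: int = 70) -> int:
    """Partition number for PARTITION BY HASH (YEAR(d) + MONTH(d))."""
    return (d.year + d.month) % partitions


if __name__ == "__main__":
    for d in (date(2014, 8, 5), date(2014, 12, 1), date(2015, 11, 1)):
        print(d.isoformat(), "-> partition", hash_partition(d))
```

For 5 August 2014 this gives (2014 + 8) mod 70 = 62. Because the expression is a plain sum, December 2014 and November 2015 both evaluate to 2026 and share partition 66, so a partition is not strictly one calendar month.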
-
Jerry Benton
www.mailborder.com
Based on Mailborder design and testing (its DB structure is very similar
to MailWatch's), MyISAM has better performance when you start
hitting millions of records.
-
Jerry Benton
www.mailborder.com
Does converting the MailWatch databases to InnoDB make a big difference in
MailWatch performance?
Just curious.
Phil
*From:* mailscanner-bounces at lists.mailscanner.info *On Behalf Of* Glenn Steen
*Sent:* 05 August 2014 14:51
*To:* MailScanner discussion
*Subject:* Re: MailScanner Deficiency: Multi-Ruleset Processing per Email Recipient
Can only agree with Martin and Alex, there is no way around either
splitting mails per recipient (very feasible), or some major rework of both
the MailScanner and MailWatch code (very infeasible).
But I also have to agree that the increase in hardware seems quite
excessive... I suppose you arrived at that figure by analysing the number
of recipients per mail (and frequency of multi-recipient emails)? Well, the
number isn't everything:-)
Provided you use the normal caching-dns-thingy and also use "Cache
SpamAssassin Results = yes", the actual processing time and resource use
will be minimized (not to mention that the normal batch-processing style of
MailScanner will ... help...:-).
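Why a results cache blunts the cost of splitting can be sketched as follows. This is an illustration of the general idea only, not MailScanner's actual cache: the `ResultCache` class, the SHA-256 key and the `toy_scan` function are all assumptions made for the example.

```python
import hashlib


def toy_scan(body: bytes) -> float:
    # Placeholder for the expensive content scan; not SpamAssassin's scoring.
    return 5.0 if b"viagra" in body.lower() else 0.0


class ResultCache:
    """Remember scan results keyed by a digest of the message body."""

    def __init__(self, scan):
        self._scan = scan
        self._results = {}
        self.scans_run = 0  # how many times the expensive scan really ran

    def check(self, body: bytes) -> float:
        key = hashlib.sha256(body).hexdigest()
        if key not in self._results:
            self._results[key] = self._scan(body)
            self.scans_run += 1
        return self._results[key]


if __name__ == "__main__":
    cache = ResultCache(toy_scan)
    body = b"Cheap VIAGRA here"
    # Three per-recipient copies of the same message after splitting.
    scores = [cache.check(body) for _ in range(3)]
    print(scores, "scans run:", cache.scans_run)
```

Three identical copies cost one real scan and two lookups, which is the reason splitting a mail per recipient need not multiply the scanning load by the recipient count.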
Introducing a "splitting MX" between the internet and your regular
MailScanner hosts should be rather simple, as well as adjusting which
Received: lines your MailScanner hosts should ignore (since they otherwise
will perceive all messages as originating from the "splitting MX" host)...
So why not try that, with the gear you have ATM, and see where that leads
you? Depending on what mailstore hosts you eventually deliver to, the
storage impact should be minimal or even non-existent, since even
MS Exchange has abandoned "single store" since ... way back... so every
recipient would eventually have their own copy in their own mailbox
anyway;-).
As Alex says, we know nothing about your actual mail volume, but my money
is on there being much less of a problem than you think, even if you do
have ... serious traffic... (more than a few thousand mails/hour). The
likeliest bottleneck is your MailWatch database so...
keep an eye on that one, make sure you run it as InnoDB etc.
Cheers!
--
-- Glenn
Might want to also consider having a more flexible approach as Alex had mentioned.
Will also help with some of the hardware requirements as you can also
reject non-valid recipients at MTA as well as splitting the emails up, so
the core MailScanner farm has less to do.
--
Martin Hepworth, CISSP
Oxford, UK
Hi All,
We at SYNAQ use and have used MailScanner for many years. As an Email
Hygiene provider, MailScanner has served us very well.
However, as we have grown (very rapidly in the past 6 months, to many more
customer domains) we have noticed some deficiencies in MailScanner.
Overview
The issue has arisen due to SYNAQ's ever growing client base and the fact
that we're provisioning more and more customers (and email domains) on our
hygiene platform, and that more than one of these customer
recipients/domains (and their applicable rulesets) are being addressed in
the same email.
Problem 1
1) abc.co.za and xyz.co.za are both provisioned on our platform.
2) abc.co.za has quarantining of SPAM configured, while xyz.co.za does not.
3) MailScanner accepts the message for processing but "chooses"
user at abc.co.za and abc.co.za as the Message's "to_address" and "to_domain".
4) MailScanner determines that the message is SPAM and because it has
5) However the rule for xyz.co.za is to store/quarantine spam. This does
not happen because of the actions above and data is also never logged via
MailWatch.
6) The example above is based on a very simple scenario, and as you are
aware this applies to many more complex rulesets (size, File Type etc)
across the system.
Problem 2
1) abc.co.za and xyz.co.za are both provisioned on our platform.
2) A third party emails both user at abc.co.za and user at xyz.co.za in a
single email message.
3) MailScanner accepts the message for processing but "chooses"
user at abc.co.za and abc.co.za as the Message's "to_address" and "to_domain".
4) When the message is processed, the MailWatch.pm script receives a
message object for SQL logging with data only for user at abc.co.za and
abc.co.za; xyz.co.za is never logged.
Finally, we have considered splitting incoming messages by recipient at an
MTA level to address this problem, but our calculations show that it would
require 3.5x more hardware to process this increased mail load. So for us a
MailScanner solution is ideal.
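The difference between the two behaviours can be sketched in a few lines. This is a toy model, not MailScanner or MailWatch code: the `Message` class and both functions are hypothetical, and picking the first recipient is an assumption standing in for whichever recipient MailScanner "chooses".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    sender: str
    recipients: List[str]
    body: str


def domain_of(address: str) -> str:
    return address.rsplit("@", 1)[1].lower()


def log_chosen_recipient(msg: Message) -> List[Dict[str, str]]:
    """Problem 2: one log row, carrying only the chosen recipient."""
    chosen = msg.recipients[0]  # assumption: the first recipient is chosen
    return [{"to_address": chosen, "to_domain": domain_of(chosen)}]


def split_by_recipient(msg: Message) -> List[Message]:
    """MTA-level splitting: one single-recipient copy per recipient."""
    return [Message(msg.sender, [rcpt], msg.body) for rcpt in msg.recipients]


if __name__ == "__main__":
    mail = Message("third@party.example",
                   ["user@abc.co.za", "user@xyz.co.za"], "hello")
    print(log_chosen_recipient(mail))
    for copy in split_by_recipient(mail):
        print(log_chosen_recipient(copy))
```

Unsplit, only abc.co.za appears in the log. After splitting, each copy is processed and logged under its own domain, so each domain's ruleset can apply, at the cost of handling one message per recipient.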
Based on the above, could you tell me if there is anything that can be
done from a MailScanner community point of view to help develop MailScanner
functionality to address these issues?
We'd be very happy to give a nice donation for a fix or patch.
Also if the community has any ideas on other ways we can remedy this
problem we welcome your feedback.
Thanks and regards,
Sam Gelbart
SYNAQ
--
MailScanner mailing list
mailscanner at lists.mailscanner.info
http://lists.mailscanner.info/mailman/listinfo/mailscanner
Before posting, read http://wiki.mailscanner.info/posting
Support MailScanner development - buy the book off the website!
--
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se