add ansible role journal-postfix (a log parser for Postfix) with playbook and doc
This commit is contained in:

parent 713372c850
commit e5a8025064

14 changed files with 3570 additions and 0 deletions
journal-postfix-doc/20191127_pyugat_talk.html (378 additions, Normal file; a rendered HTML copy of the Markdown talk below)
journal-postfix-doc/20191127_pyugat_talk.md (340 additions, Normal file)
							|  | @ -0,0 +1,340 @@ | |||
| # journal-postfix - A log parser for Postfix | ||||
| 
 | ||||
| Experiences from applying Python to the domain of bad old email. | ||||
| 
 | ||||
| ## Email ✉ | ||||
| 
 | ||||
|   * old technology (starting in the 1970s) | ||||
|   * [store-and-forward](https://en.wikipedia.org/wiki/Store_and_forward): sent != delivered to recipient | ||||
|   * non-delivery reasons: | ||||
|     * recipient over quota | ||||
|     * nonexistent destination | ||||
|     * malware | ||||
|     * spam | ||||
|     * server problem | ||||
|     * ... | ||||
|   * permanent / non-permanent failure ([DSN ~ 5.X.Y / 4.X.Y](https://www.iana.org/assignments/smtp-enhanced-status-codes/smtp-enhanced-status-codes.xhtml)) | ||||
|   * non-delivery modes | ||||
|     * immediate reject on SMTP level | ||||
|     * delayed [bounce messages](https://en.wikipedia.org/wiki/Bounce_message) sent by the [reporting MTA](https://upload.wikimedia.org/wikipedia/commons/a/a2/Bounce-DSN-MTA-names.png): queueing (e.g., ~5 days) before the delivery failure notification | ||||
|     * discarding | ||||
|   * read receipts | ||||
|   * [Wikipedia: email tracking](https://en.wikipedia.org/wiki/Email_tracking) | ||||
| 
 | ||||
| ## [SMTP](https://en.wikipedia.org/wiki/SMTP) | ||||
| 
 | ||||
| [SMTP session example](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol#SMTP_transport_example): | ||||
| the envelope sender and envelope recipient may differ from the From: and To: headers | ||||
| 
 | ||||
| Lists of error codes: | ||||
| 
 | ||||
|   * [SMTP and ESMTP](https://www.inmotionhosting.com/support/email/email-troubleshooting/smtp-and-esmtp-error-code-list) | ||||
|   * [SMTP](https://serversmtp.com/smtp-error/) | ||||
|   * [SMTP](https://info.webtoolhub.com/kb-a15-smtp-status-codes-smtp-error-codes-smtp-reply-codes.aspx) | ||||
| 
 | ||||
| Example of an error within a bounce message (Subject: Mail delivery failed: returning message to sender): | ||||
| 
 | ||||
|     SMTP error from remote server for TEXT command, host: smtpin.rzone.de (81.169.145.97) reason: 550 5.7.1 Refused by local policy. No SPAM please! | ||||
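| 
| The first digit of the reply code (together with the optional enhanced | ||||
| status code, RFC 3463) decides between permanent and transient failure. | ||||
| A minimal sketch of such a classification (the regex and the labels are | ||||
| illustrative assumptions, not part of journal-postfix): | ||||
| 
| ```python | ||||
| import re | ||||
| 
| # SMTP reply code plus optional enhanced status code (RFC 3463), | ||||
| # e.g. "550 5.7.1 Refused by local policy." | ||||
| STATUS_RE = re.compile(r'\b([245])\d\d(?:\s+[245]\.\d{1,3}\.\d{1,3})?') | ||||
| 
| KINDS = {'2': 'success', '4': 'transient failure', '5': 'permanent failure'} | ||||
| 
| def classify(reply: str) -> str: | ||||
|     """Classify an SMTP reply string by the first digit of its code.""" | ||||
|     m = STATUS_RE.search(reply) | ||||
|     return KINDS.get(m.group(1), 'unknown') if m else 'unknown' | ||||
| 
| print(classify('550 5.7.1 Refused by local policy. No SPAM please!')) | ||||
| # -> permanent failure | ||||
| print(classify('454 4.7.1 Relay access denied'))  # -> transient failure | ||||
| ``` | ||||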
| 
 | ||||
|   * email users continually ask about the fate of their emails (or of their correspondents' emails that should have arrived) | ||||
| 
 | ||||
| ## [Postfix](http://www.postfix.org) | ||||
| 
 | ||||
|   * popular [MTA](https://en.wikipedia.org/wiki/Message_transfer_agent) | ||||
|   * written in C | ||||
|   * logging to files / journald | ||||
|   * example log messages for a (non-)delivery + stats | ||||
| 
 | ||||
| ``` | ||||
| Nov 27 16:19:22 mail postfix/smtpd[18995]: connect from unknown[80.82.79.244] | ||||
| Nov 27 16:19:22 mail postfix/smtpd[18995]: NOQUEUE: reject: RCPT from unknown[80.82.79.244]: 454 4.7.1 <spameri@tiscali.it>: Relay access denied; from=<spameri@tiscali.it> to=<spameri@tiscali.it> proto=ESMTP helo=<WIN-G7CPHCGK247> | ||||
| Nov 27 16:19:22 mail postfix/smtpd[18995]: disconnect from unknown[80.82.79.244] ehlo=1 mail=1 rcpt=0/1 rset=1 quit=1 commands=4/5 | ||||
| 
 | ||||
| Nov 27 16:22:43 mail postfix/anvil[18997]: statistics: max connection rate 1/60s for (smtp:80.82.79.244) at Nov 27 16:19:22 | ||||
| Nov 27 16:22:43 mail postfix/anvil[18997]: statistics: max connection count 1 for (smtp:80.82.79.244) at Nov 27 16:19:22 | ||||
| Nov 27 16:22:43 mail postfix/anvil[18997]: statistics: max cache size 1 at Nov 27 16:19:22 | ||||
| 
 | ||||
| Nov 27 16:22:48 mail postfix/smtpd[18999]: connect from mail.cosmopool.net[2a01:4f8:160:20c1::10:107] | ||||
| Nov 27 16:22:49 mail postfix/smtpd[18999]: 47NQzY13DbzNWNQG: client=mail.cosmopool.net[2a01:4f8:160:20c1::10:107] | ||||
| Nov 27 16:22:49 mail postfix/cleanup[19003]: 47NQzY13DbzNWNQG: info: header Subject: Re: test from mail.cosmopool.net[2a01:4f8:160:20c1::10:107]; from=<ibu@cosmopool.net> to=<ibu@multiname.org> proto=ESMTP helo=<mail.cosmopool.net> | ||||
| Nov 27 16:22:49 mail postfix/cleanup[19003]: 47NQzY13DbzNWNQG: message-id=<d5154432-b984-d65a-30b3-38bde7e37af8@cosmopool.net> | ||||
| Nov 27 16:22:49 mail postfix/qmgr[29349]: 47NQzY13DbzNWNQG: from=<ibu@cosmopool.net>, size=1365, nrcpt=2 (queue active) | ||||
| Nov 27 16:22:49 mail postfix/smtpd[18999]: disconnect from mail.cosmopool.net[2a01:4f8:160:20c1::10:107] ehlo=1 mail=1 rcpt=2 data=1 quit=1 commands=6 | ||||
| Nov 27 16:22:50 mail postfix/lmtp[19005]: 47NQzY13DbzNWNQG: to=<ibu2@multiname.org>, relay=mail.multiname.org[private/dovecot-lmtp], delay=1.2, delays=0.56/0.01/0.01/0.63, dsn=2.0.0, status=sent (250 2.0.0 <ibu2@multiname.org> nV9iJ9mi3l0+SgAAZU03Dg Saved) | ||||
| Nov 27 16:22:50 mail postfix/lmtp[19005]: 47NQzY13DbzNWNQG: to=<ibu@multiname.org>, relay=mail.multiname.org[private/dovecot-lmtp], delay=1.2, delays=0.56/0.01/0.01/0.63, dsn=2.0.0, status=sent (250 2.0.0 <ibu@multiname.org> nV9iJ9mi3l0+SgAAZU03Dg:2 Saved) | ||||
| Nov 27 16:22:50 mail postfix/qmgr[29349]: 47NQzY13DbzNWNQG: removed | ||||
| ``` | ||||
| 
 | ||||
|   * [Postfix components involved](http://www.postfix.org/OVERVIEW.html): | ||||
|     * smtpd (port 25: smtp, port 587: submission) | ||||
|     * cleanup | ||||
|     * smtp/lmtp | ||||
|   * missing: a log parser (see the sketch below) | ||||
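| 
| A first cut at splitting one of these syslog-format lines into time, | ||||
| identifier, pid and message could look like this (illustrative only; | ||||
| the actual parser handles many more cases): | ||||
| 
| ```python | ||||
| import re | ||||
| from datetime import datetime | ||||
| 
| # e.g. "Nov 27 16:19:22 mail postfix/smtpd[18995]: connect from ..." | ||||
| LINE_RE = re.compile( | ||||
|     r'^(?P<time>\w{3} [ \d]\d \d\d:\d\d:\d\d) (?P<host>\S+) ' | ||||
|     r'(?P<identifier>\S+?)\[(?P<pid>\d+)\]: (?P<message>.*)$' | ||||
| ) | ||||
| 
| def parse_syslog_line(line: str, year: int): | ||||
|     """Split a syslog-format Postfix line into its parts, else None.""" | ||||
|     m = LINE_RE.match(line) | ||||
|     if m is None or not m.group('identifier').startswith('postfix/'): | ||||
|         return None | ||||
|     # syslog lines lack the year, so the caller must supply it | ||||
|     t = datetime.strptime(f"{year} {m.group('time')}", '%Y %b %d %H:%M:%S') | ||||
|     return {'t': t.timestamp(), 'identifier': m.group('identifier'), | ||||
|             'pid': int(m.group('pid')), 'message': m.group('message')} | ||||
| ``` | ||||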
| 
 | ||||
| ## Idea | ||||
| 
 | ||||
|   * follow log stream and write summarized delivery information to a database | ||||
|   * goal: spot delivery problems, collect delivery stats | ||||
|   * a GUI could then display the current delivery status to users | ||||
| 
 | ||||
| ## Why Python? | ||||
| 
 | ||||
|   * simple and fun language, clear and concise | ||||
|   * well suited for text processing | ||||
|   * libs available for systemd, PostgreSQL | ||||
|   * huge standard library (used here: datetime, re, argparse, select; yaml comes from PyYAML), see the sketch below | ||||
|   * speed sufficient? | ||||
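| 
| How the mentioned modules might fit together for CLI and config handling | ||||
| (a sketch; the config path is taken from the playbook below, the option | ||||
| names are assumptions): | ||||
| 
| ```python | ||||
| import argparse | ||||
| 
| import yaml  # PyYAML; not part of the standard library | ||||
| 
| def parse_args(): | ||||
|     parser = argparse.ArgumentParser(description='journal-postfix log parser') | ||||
|     parser.add_argument('--file', help='parse a logfile instead of the journal') | ||||
|     parser.add_argument('--year', type=int, | ||||
|                         help='year of the first logfile line') | ||||
|     return parser.parse_args() | ||||
| 
| def load_config(path='/etc/journal-postfix/main.yml'): | ||||
|     """Load database and Postfix settings from the YAML config file.""" | ||||
|     with open(path) as f: | ||||
|         return yaml.safe_load(f) | ||||
| ``` | ||||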
| 
 | ||||
| ## Development iterations | ||||
| 
 | ||||
|   * hmm, easy task, might take a few days | ||||
|   * PoC: reading and polling from journal works as expected | ||||
|   * used postfix logfiles in syslog format and wrote regexps matching them iteratively | ||||
|   * separated parsing messages from extracting delivery information | ||||
|   * created a delivery table | ||||
|   * hmm, this is very slow: processing the log messages of just a few days takes hours (on a server with little traffic) | ||||
|   * introduced polling timeout and SQL transactions handling several messages at once | ||||
|   * ... much faster | ||||
|   * looks fine, but wait... did I catch all syntax variants of Postfix log messages? | ||||
|   * looked into Postfix sources and almost got lost | ||||
|   * weeks of hard work identifying relevant log output directives | ||||
|   * completely rewrote parser to deal with the rich log msg syntax, e.g.:<br> | ||||
|     `def _strip_pattern(msg, res, pattern_name, pos='l', target_names=None)` | ||||
|   * oh, there are even more Postfix components... limit support to certain Postfix configurations, in particular virtual mailboxes (not local ones) | ||||
|   * mails may have multiple recipients... split delivery table into delivery_from and delivery_to | ||||
|   * decide which delivery information is relevant | ||||
|   * cleanup and polish (config mgmt, logging) | ||||
|   * write ansible role | ||||
| 
 | ||||
| ## Structure | ||||
| 
 | ||||
| ```blockdiag | ||||
| blockdiag { | ||||
|     default_fontsize = 20; | ||||
|     node_height = 80; | ||||
|     journal_since -> run_loop; | ||||
|     journal_follow -> run_loop; | ||||
|     logfile -> run_loop; | ||||
|     run_loop -> parse -> extract_delivery -> store; | ||||
|     store -> delivery_from; | ||||
|     store -> delivery_to; | ||||
|     store -> noqueue; | ||||
| 
 | ||||
|     group { label="input iterables"; journal_since; journal_follow; logfile; }; | ||||
|     group { label="output tables"; delivery_from; delivery_to; noqueue; }; | ||||
| } | ||||
| ``` | ||||
| 
 | ||||
| ## Iterables | ||||
| 
 | ||||
| ```python | ||||
| def iter_journal_messages_since(timestamp: Union[int, float]): | ||||
|     """ | ||||
|     Yield False and message details from the journal since *timestamp*. | ||||
| 
 | ||||
|     This is the loading phase (loading messages that already existed | ||||
|     when we start). | ||||
| 
 | ||||
|     Argument *timestamp* is a UNIX timestamp. | ||||
| 
 | ||||
|     Only journal entries for systemd unit UNITNAME with loglevel | ||||
|     INFO and above are retrieved. | ||||
|     """ | ||||
|     ... | ||||
| 
 | ||||
| def iter_journal_messages_follow(timestamp: Union[int, float]): | ||||
|     """ | ||||
|     Yield commit and message details from the journal through polling. | ||||
| 
 | ||||
|     This is the polling phase (after we have read pre-existing messages | ||||
|     in the loading phase). | ||||
| 
 | ||||
|     Argument *timestamp* is a UNIX timestamp. | ||||
| 
 | ||||
|     Only journal entries for systemd unit UNITNAME with loglevel | ||||
|     INFO and above are retrieved. | ||||
| 
 | ||||
|     *commit* (bool) tells whether it is time to store the delivery | ||||
|     information obtained from the messages yielded by us. | ||||
|     It is set to True if max_delay_before_commit has elapsed. | ||||
|     After this delay delivery information will be written; to be exact: | ||||
|     the delay may increase by up to one journal_poll_interval. | ||||
|     """ | ||||
|     ... | ||||
| 
 | ||||
| def iter_logfile_messages(filepath: str, year: int, | ||||
|                           commit_after_lines=max_messages_per_commit): | ||||
|     """ | ||||
|     Yield messages and a commit flag from a logfile. | ||||
| 
 | ||||
|     Loop through all lines of the file with given *filepath* and | ||||
|     extract the time and log message. If the log message starts | ||||
|     with 'postfix/', then extract the syslog_identifier, pid and | ||||
|     message text. | ||||
| 
 | ||||
|     Since syslog lines do not contain the year, the *year* to which | ||||
|     the first log line belongs must be given. | ||||
| 
 | ||||
|     Return a commit flag and a dict with these keys: | ||||
|         't': timestamp | ||||
|         'message': message text | ||||
|         'identifier': syslog identifier (e.g., 'postfix/smtpd') | ||||
|         'pid': process id | ||||
| 
 | ||||
|     The commit flag will be set to True for every | ||||
|     (commit_after_lines)-th filtered message and serves | ||||
|     as a signal to the caller to commit this chunk of data | ||||
|     to the database. | ||||
|     """ | ||||
|     ... | ||||
| ``` | ||||
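| 
| The journal iterables can be built on python3-systemd (see the README | ||||
| below). A minimal sketch of the loading phase, assuming the unit name | ||||
| from the playbook; the polling phase would additionally use | ||||
| `Reader.wait()` with journal_poll_interval as the timeout: | ||||
| 
| ```python | ||||
| from systemd import journal | ||||
| 
| UNITNAME = 'postfix@-.service'  # unit name taken from the playbook | ||||
| 
| def iter_journal_messages_since(timestamp): | ||||
|     """Sketch of the loading phase: replay existing journal entries.""" | ||||
|     reader = journal.Reader() | ||||
|     reader.log_level(journal.LOG_INFO)        # loglevel INFO and above | ||||
|     reader.add_match(_SYSTEMD_UNIT=UNITNAME) | ||||
|     reader.seek_realtime(timestamp) | ||||
|     for entry in reader: | ||||
|         yield False, { | ||||
|             't': entry['__REALTIME_TIMESTAMP'].timestamp(), | ||||
|             'message': entry.get('MESSAGE', ''), | ||||
|             'identifier': entry.get('SYSLOG_IDENTIFIER'), | ||||
|             'pid': entry.get('_PID'), | ||||
|         } | ||||
| ``` | ||||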
| 
 | ||||
| ## Running loops | ||||
| 
 | ||||
| ```python | ||||
| def run(dsn, verp_marker=False, filepath=None, year=None, debug=[]): | ||||
|     """ | ||||
|     Determine loop(s) and run them within a database context. | ||||
|     """ | ||||
|     init(verp_marker=verp_marker) | ||||
|     with psycopg2.connect(dsn) as conn: | ||||
|         with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as curs: | ||||
|             if filepath: | ||||
|                 run_loop(iter_logfile_messages(filepath, year), curs, debug=debug) | ||||
|             else: | ||||
|                 begin_timestamp = get_latest_timestamp(curs) | ||||
|                 run_loop(iter_journal_messages_since(begin_timestamp), curs, debug=debug) | ||||
|                 begin_timestamp = get_latest_timestamp(curs) | ||||
|                 run_loop(iter_journal_messages_follow(begin_timestamp), curs, debug=debug) | ||||
| 
 | ||||
| def run_loop(iterable, curs, debug=[]): | ||||
|     """ | ||||
|     Loop over log messages obtained from *iterable*. | ||||
| 
 | ||||
|     Parse the message, extract delivery information from it and store | ||||
|     that delivery information. | ||||
| 
 | ||||
|     For performance reasons delivery items are collected in a cache | ||||
|     before writing them (i.e., committing a database transaction). | ||||
|     """ | ||||
|     cache = [] | ||||
|     msg_count = max_messages_per_commit | ||||
|     for commit, msg_details in iterable: | ||||
|         ... | ||||
| ``` | ||||
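| 
| The elided body of `run_loop` plausibly continues like this (a sketch | ||||
| reconstructed from the docstrings; it reuses the module-level names | ||||
| shown elsewhere in this talk): | ||||
| 
| ```python | ||||
|     for commit, msg_details in iterable: | ||||
|         parsed = parse(msg_details) | ||||
|         errors, delivery = extract_delivery(msg_details, parsed) | ||||
|         if delivery and not errors: | ||||
|             cache.append(delivery) | ||||
|             msg_count -= 1 | ||||
|         # write when the iterable signals a commit (timeout elapsed) | ||||
|         # or when enough messages have accumulated | ||||
|         if commit or msg_count <= 0: | ||||
|             store_deliveries(curs, cache, debug=debug) | ||||
|             curs.connection.commit() | ||||
|             cache = [] | ||||
|             msg_count = max_messages_per_commit | ||||
| ``` | ||||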
| 
 | ||||
| ## Parsing | ||||
| 
 | ||||
| Parse what you can (but only the msg_info part of Postfix log messages, and only for the relevant components). | ||||
| 
 | ||||
| ```python | ||||
| def parse(msg_details, debug=False): | ||||
|     """ | ||||
|     Parse a log message returning a dict. | ||||
| 
 | ||||
|     *msg_details* is assumed to be a dict with these keys: | ||||
| 
 | ||||
|       * 'identifier' (syslog identifier), | ||||
|       * 'pid' (process id), | ||||
|       * 'message' (message text) | ||||
| 
 | ||||
|     The syslog identifier and process id are copied to the resulting dict. | ||||
|     """ | ||||
|     ... | ||||
| 
 | ||||
| def _parse_branch(comp, msg, res): | ||||
|     """ | ||||
|     Parse a log message string *msg*, adding results to dict *res*. | ||||
| 
 | ||||
|     Depending on the component *comp* we branch to functions | ||||
|     named _parse_{comp}. | ||||
| 
 | ||||
|     Add parsing results to dict *res*. Always add key 'action'. | ||||
|     Try to parse every syntactical element. | ||||
|     Note: We parse what we can. Assessment of parsing results relevant | ||||
|     for delivery is done in :func:`extract_delivery`. | ||||
|     """ | ||||
|     ... | ||||
| ``` | ||||
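| 
| The branching itself can be a simple name lookup; a sketch (assuming | ||||
| the `_parse_{comp}` functions live in the same module): | ||||
| 
| ```python | ||||
| def _parse_branch(comp, msg, res): | ||||
|     """Dispatch to _parse_smtpd, _parse_qmgr, ... by component name.""" | ||||
|     func = globals().get('_parse_' + comp) | ||||
|     if func is None: | ||||
|         res['action'] = 'ignore'   # a component we do not handle | ||||
|         return | ||||
|     func(msg, res) | ||||
| ``` | ||||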
| 
 | ||||
| ## Extracting | ||||
| 
 | ||||
| Extract what is relevant. | ||||
| 
 | ||||
| ```python | ||||
| def extract_delivery(msg_details, parsed): | ||||
|     """ | ||||
|     Compute delivery information from parsing results. | ||||
| 
 | ||||
|     Basically this means that we map the parsed fields to | ||||
|     a type ('from' or 'to') and to the database | ||||
|     fields for table 'delivery_from' or 'delivery_to'. | ||||
| 
 | ||||
|     We branch to functions _extract_{comp} where comp is the | ||||
|     name of a Postfix component. | ||||
| 
 | ||||
|     Return a list of error strings and a dict with the | ||||
|     extracted information. Keys with None values are removed | ||||
|     from the dict. | ||||
|     """ | ||||
|     ... | ||||
| ``` | ||||
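| 
| As an illustration, an smtp/lmtp line like the ones above might map onto | ||||
| a delivery_to item roughly as follows (the field and key names are | ||||
| assumptions based on the log excerpt): | ||||
| 
| ```python | ||||
| def _extract_smtp(parsed): | ||||
|     """Map parsed smtp/lmtp fields to a delivery_to item (sketch).""" | ||||
|     delivery = { | ||||
|         'queue_id': parsed.get('queue_id'), | ||||
|         'recipient': parsed.get('to'), | ||||
|         'relay': parsed.get('relay'), | ||||
|         'dsn': parsed.get('dsn'), | ||||
|         'status': parsed.get('status'),  # e.g. 'sent', 'bounced' | ||||
|         'delay': parsed.get('delay'), | ||||
|     } | ||||
|     # keys with None values are removed, as the docstring says | ||||
|     return 'to', {k: v for k, v in delivery.items() if v is not None} | ||||
| ``` | ||||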
| 
 | ||||
| ## Regular expressions | ||||
| 
 | ||||
|   * see sources | ||||
| 
 | ||||
|   * [Stackoverflow: How to validate an email address](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression) [FSM](https://i.stack.imgur.com/YI6KR.png) | ||||
| 
 | ||||
| ### BTW: [email.utils.parseaddr](https://docs.python.org/3/library/email.utils.html#email.utils.parseaddr) | ||||
| 
 | ||||
| ```python | ||||
| >>> from email.utils import parseaddr | ||||
| >>> parseaddr('Ghost <"hello@nowhere"@pyug.at>') | ||||
| ('Ghost', '"hello@nowhere"@pyug.at') | ||||
| >>> print(parseaddr('"more\"fun\"\\"hello\\"@nowhere"@pyug.at')[1]) | ||||
| "more"fun"\"hello\"@nowhere"@pyug.at | ||||
| >>> print(parseaddr('""@pyug.at')[1]) | ||||
| ""@pyug.at | ||||
| ``` | ||||
| 
 | ||||
| ## Storing | ||||
| 
 | ||||
| ```python | ||||
| def store_deliveries(cursor, cache, debug=[]): | ||||
|     """ | ||||
|     Store cached delivery information into the database. | ||||
| 
 | ||||
|     Find queue_ids in *cache* and group delivery items by | ||||
|     them, but separately for delivery types 'from' and 'to'. | ||||
|     In addition, collect delivery items with queue_id is None. | ||||
| 
 | ||||
|     After grouping we merge all items within a group into a | ||||
|     single item. This lets us combine several SQL queries into | ||||
|     a single one, which improves performance significantly. | ||||
| 
 | ||||
|     Then store the merged items and the deliveries with | ||||
|     queue_id is None. | ||||
|     """ | ||||
|     ... | ||||
| ``` | ||||
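| 
| The grouping step might be sketched like this (the keys 'type' and | ||||
| 'queue_id' are assumptions): | ||||
| 
| ```python | ||||
| from collections import defaultdict | ||||
| 
| def group_deliveries(cache): | ||||
|     """Group cached items by (type, queue_id); collect queue-less ones.""" | ||||
|     grouped = defaultdict(dict) | ||||
|     noqueue = [] | ||||
|     for item in cache: | ||||
|         if item.get('queue_id') is None: | ||||
|             noqueue.append(item) | ||||
|         else: | ||||
|             # merge all items of one queue_id into a single row | ||||
|             grouped[(item['type'], item['queue_id'])].update(item) | ||||
|     return grouped, noqueue | ||||
| ``` | ||||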
| 
 | ||||
| 
 | ||||
| Database schema: 3 tables: | ||||
| 
 | ||||
|   * delivery_from: smtpd, milters, qmgr | ||||
|   * delivery_to: smtp, virtual, bounce, error | ||||
|   * noqueue: rejected by smtpd before even getting a queue_id | ||||
| 
 | ||||
| Table noqueue contains all the spam; for it we use plain SQL INSERT only (no ON CONFLICT ... UPDATE), which is faster. | ||||
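| 
| Such insert-only batches map nicely onto psycopg2's execute_values; a | ||||
| sketch with assumed column names: | ||||
| 
| ```python | ||||
| from psycopg2.extras import execute_values | ||||
| 
| def store_noqueue(curs, items): | ||||
|     """Batched plain INSERT - rejects are written once, never updated.""" | ||||
|     rows = [(i['t'], i.get('client'), i.get('sender'), | ||||
|              i.get('recipient'), i.get('reason')) for i in items] | ||||
|     execute_values( | ||||
|         curs, | ||||
|         'INSERT INTO noqueue (t, client, sender, recipient, reason)' | ||||
|         ' VALUES %s', | ||||
|         rows, | ||||
|     ) | ||||
| ``` | ||||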
| 
 | ||||
| ## Demo | ||||
| 
 | ||||
|     ... | ||||
| 
 | ||||
| ## Questions / Suggestions | ||||
| 
 | ||||
|   * Could speed be improved by using prepared statements? | ||||
|   * Will old data be deleted (as required by GDPR)? | ||||
| 
 | ||||
| Both were implemented after the talk. | ||||
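| 
| For the GDPR point, retention can be a periodic DELETE; a sketch with | ||||
| an assumed timestamp column `t` and retention period: | ||||
| 
| ```python | ||||
| def delete_old_deliveries(curs, days=180): | ||||
|     """Drop delivery data older than the retention period (sketch).""" | ||||
|     for table in ('delivery_from', 'delivery_to', 'noqueue'): | ||||
|         curs.execute( | ||||
|             f"DELETE FROM {table} WHERE t < now() - %s * interval '1 day'", | ||||
|             (days,), | ||||
|         ) | ||||
| ``` | ||||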
journal-postfix.yml (34 additions, Normal file)
							|  | @ -0,0 +1,34 @@ | |||
| # Deploy journal-postfix | ||||
| 
 | ||||
| # This will install a service that writes mail delivery information | ||||
| # obtained from systemd-journal (unit postfix@-.service) to a | ||||
| # PostgreSQL database. | ||||
| # | ||||
| # You can configure the database connection parameters (and optionally | ||||
| # a verp_marker) as host vars like this: | ||||
| # | ||||
| # mailserver: | ||||
| #   postgresql: | ||||
| #     host: 127.0.0.1 | ||||
| #     port: 5432 | ||||
| #     dbname: mailserver | ||||
| #     username: mailserver | ||||
| #     password: !vault | | ||||
| #         $ANSIBLE_VAULT;1.1;AES256 | ||||
| #         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | ||||
| #         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | ||||
| #         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | ||||
| #         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | ||||
| #         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | ||||
| #   postfix: | ||||
| #     verp_marker: rstxyz | ||||
| # | ||||
| # If you do not set these variables, then you must edit | ||||
| # /etc/journal-postfix/main.yml on the destination hosts and run | ||||
| # systemctl start journal-postfix manually. | ||||
| 
 | ||||
| - name: install journal-postfix | ||||
|   user: root | ||||
|   hosts: mail | ||||
|   roles: | ||||
|     - journal-postfix | ||||
journal-postfix/files/journal-postfix.service (17 additions, Normal file)
							|  | @ -0,0 +1,17 @@ | |||
| # this file is part of ansible role journal-postfix | ||||
| 
 | ||||
| [Unit] | ||||
| Description=Extract Postfix message delivery information from systemd journal messages\ | ||||
| and store it in a PostgreSQL database. Configuration is in /etc/journal-postfix/main.yml | ||||
| After=multi-user.target | ||||
| 
 | ||||
| [Service] | ||||
| Type=simple | ||||
| ExecStart=/srv/journal-postfix/run.py | ||||
| User=journal-postfix | ||||
| WorkingDirectory=/srv/journal-postfix/ | ||||
| Restart=on-failure | ||||
| RestartPreventExitStatus=97 | ||||
| 
 | ||||
| [Install] | ||||
| WantedBy=multi-user.target | ||||
							
								
								
									
journal-postfix/files/srv/README.md (new file, 85 lines)
@@ -0,0 +1,85 @@
| Parse postfix entries in systemd journal and collect delivery information. | ||||
| 
 | ||||
| The information on mail deliveries is written to tables in a PostgreSQL | ||||
| database. The database can then be queried by a UI showing delivery status | ||||
| to end users. The UI is not part of this package. | ||||
| 
 | ||||
| This software is tailor-made for Debian buster with systemd as its init system. | ||||
| It is meant to run on the same system on which Postfix is running, | ||||
| or on a system receiving the log stream of a Postfix instance in its | ||||
| systemd journal. | ||||
| 
 | ||||
| Prerequisites / Postfix configuration: | ||||
| 
 | ||||
|   - Access to a PostgreSQL database. | ||||
|   - Postfix: Only virtual mailboxes are supported. | ||||
|   - Postfix: You can use short or long queue_ids (see | ||||
|     http://www.postfix.org/postconf.5.html#enable_long_queue_ids), | ||||
|     but since the uniqueness of short queue_ids is very limited, | ||||
|     usage of long queue_ids is *strongly recommended*. | ||||
| 
 | ||||
| Installation: | ||||
| 
 | ||||
|   - apt install python3-psycopg2 python3-systemd python3-yaml | ||||
|   - Edit /etc/journal-postfix/main.yml | ||||
|   - Output is written to the journal (unit journal-postfix). READ IT! | ||||
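|     (e.g., journalctl -u journal-postfix --follow) | ||||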
| 
 | ||||
| Side effects (database): | ||||
| 
 | ||||
|   - The configured database user will create the tables | ||||
|     - delivery_from | ||||
|     - delivery_to | ||||
|     - noqueue | ||||
|     in the configured database, if they do not yet exist. | ||||
|     These tables will be filled with results from parsing the journal. | ||||
|     Table noqueue contains deliveries rejected by smtpd before they | ||||
|     got a queue_id. Deliveries with queue_id are in tables delivery_from | ||||
|     and delivery_to, which are separate, because an email can have only | ||||
|     one sender, but more than one recipient. Entries in both tables are | ||||
|     related through the queue_id and the approximate date; note that | ||||
|     short queue_ids are not unique for a delivery transaction, so | ||||
|     consider changing your Postfix configuration to long queue_ids. | ||||
|   - Log output is written to journald, unit journal-postfix. | ||||
| 
 | ||||
| Configuration: | ||||
| 
 | ||||
|   - Edit the config file in YAML format located at | ||||
|     /etc/journal-postfix/main.yml | ||||
| 
 | ||||
| Limitations: | ||||
| 
 | ||||
|   - The log output of Postfix may contain messages not primarily relevant | ||||
|     for delivery, namely messages of levels panic, fatal, error, warning. | ||||
|     They are discarded. | ||||
|   - The Postfix server must be configured to use virtual mailboxes; | ||||
|     deliveries to local mailboxes are ignored. | ||||
|   - Parsing is specific to a Postfix version and only version 3.4.5 | ||||
|     (the version in Debian buster) is supported; it is intended to support | ||||
|     Postfix versions in future stable Debian releases. | ||||
|   - This script does not support concurrency; we assume that there is only | ||||
|     one process writing to the database tables. Thus clustered Postfix | ||||
|     setups are not supported. | ||||
| 
 | ||||
| Options: | ||||
| 
 | ||||
|   - If you use dovecot as lmtpd, you will also get dovecot_ids upon | ||||
|     successful delivery. | ||||
|   - If you have configured Postfix to store VERP-ids of outgoing mails | ||||
|     in table 'mail_from' in the same database, then bounce emails can | ||||
|     be associated with original emails. The VERP-ids must have a certain | ||||
|     format. | ||||
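|     For example, with verp_marker 'rstxyz' an envelope sender could be | ||||
|     newsletter+rstxyz-12345@example.org (hypothetical address; the id | ||||
|     12345 would reference a row in table 'mail_from'). | ||||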
|   - The subject of emails will be extracted from log messages starting | ||||
|     with "info: header Subject:". To enable these messages, activate | ||||
|     header_checks in Postfix's main.cf ( | ||||
|         header_checks = regexp:/etc/postfix/header_checks | ||||
|     ) and put this line into /etc/postfix/header_checks: | ||||
|         /^Subject:/ INFO | ||||
|   - You can also import log messages from a log file in syslog format: | ||||
|     Run this script directly from command line with options --file | ||||
|     (the path to the file to be parsed) and --year (the year of the | ||||
|     first message in this log file). | ||||
|     Note: For the name of the month to be recognized correctly, the | ||||
|     script must be run under the same locale as the one in which the | ||||
|     logfile was written. | ||||
|     Attention: When running from the command line, log output will | ||||
|     not be sent to unit journal-postfix; use this command instead: | ||||
|     journalctl --follow SYSLOG_IDENTIFIER=python3 | ||||
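|     Example invocation (illustrative path; assumes the logfile's month | ||||
|     names match the current locale): | ||||
|         ./run.py --file /var/log/mail.log.1 --year 2019 | ||||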
							
								
								
									
journal-postfix/files/srv/parser.py (new file, 1514 lines)
(File diff suppressed because it is too large)

journal-postfix/files/srv/run.py (new executable file, 212 lines)
@@ -0,0 +1,212 @@
| #!/usr/bin/env python3 | ||||
| 
 | ||||
| """ | ||||
| Main script to be run as a systemd unit or manually. | ||||
| """ | ||||
| 
 | ||||
| import argparse | ||||
| import datetime | ||||
| import sys | ||||
| from pprint import pprint | ||||
| from typing import Iterable, List, Optional, Tuple | ||||
| import psycopg2 | ||||
| import psycopg2.extras | ||||
| from systemd import journal | ||||
| import settings | ||||
| from parser import init_parser, parse_entry, extract_delivery | ||||
| from sources import ( | ||||
|     iter_journal_messages_since, | ||||
|     iter_journal_messages_follow, | ||||
|     iter_logfile_messages, | ||||
| ) | ||||
| from storage import ( | ||||
|     init_db, | ||||
|     init_session, | ||||
|     get_latest_timestamp, | ||||
|     delete_old_deliveries, | ||||
|     store_delivery_items, | ||||
| ) | ||||
| 
 | ||||
| 
 | ||||
| exit_code_without_restart = 97 | ||||
| 
 | ||||
| 
 | ||||
| def run( | ||||
|     dsn: str, | ||||
|     verp_marker: Optional[str] = None, | ||||
|     filepath: Optional[str] = None, | ||||
|     year: Optional[int] = None, | ||||
|     debug: List[str] = [], | ||||
| ) -> None: | ||||
|     """ | ||||
|     Determine loop(s) and run them within a database context. | ||||
|     """ | ||||
|     init_parser(verp_marker=verp_marker) | ||||
|     with psycopg2.connect(dsn) as conn: | ||||
|         with conn.cursor( | ||||
|             cursor_factory=psycopg2.extras.RealDictCursor | ||||
|         ) as curs: | ||||
|             init_session(curs) | ||||
|             if filepath and year: | ||||
|                 run_loop( | ||||
|                     iter_logfile_messages(filepath, year), curs, debug=debug | ||||
|                 ) | ||||
|             else: | ||||
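|                 # loading phase: read journal entries since the latest DB timestamp | ||||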
|                 begin_timestamp = get_latest_timestamp(curs) | ||||
|                 run_loop( | ||||
|                     iter_journal_messages_since(begin_timestamp), | ||||
|                     curs, | ||||
|                     debug=debug, | ||||
|                 ) | ||||
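|                 # polling phase: follow the journal for new entries | ||||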
|                 begin_timestamp = get_latest_timestamp(curs) | ||||
|                 run_loop( | ||||
|                     iter_journal_messages_follow(begin_timestamp), | ||||
|                     curs, | ||||
|                     debug=debug, | ||||
|                 ) | ||||
| 
 | ||||
| 
 | ||||
| def run_loop( | ||||
|     iterable: Iterable[Tuple[bool, Optional[dict]]], | ||||
|     curs: psycopg2.extras.RealDictCursor, | ||||
|     debug: List[str] = [] | ||||
| ) -> None: | ||||
|     """ | ||||
|     Loop over log entries obtained from *iterable*. | ||||
| 
 | ||||
|     Parse the message, extract delivery information from it and store | ||||
|     that delivery information. | ||||
| 
 | ||||
|     For performance reasons delivery items are collected in a cache | ||||
|     before writing them (i.e., committing a database transaction). | ||||
|     """ | ||||
|     cache = [] | ||||
|     msg_count = settings.max_messages_per_commit | ||||
|     last_delete = None | ||||
|     for commit, msg_details in iterable: | ||||
|         parsed_entry = None | ||||
|         if msg_details: | ||||
|             parsed_entry = parse_entry(msg_details) | ||||
|             if 'all' in debug or ( | ||||
|                 parsed_entry and parsed_entry.get('comp') in debug | ||||
|             ): | ||||
|                 print('_' * 80) | ||||
|                 print('MSG_DETAILS:', msg_details) | ||||
|                 print('PARSED_ENTRY:', parsed_entry) | ||||
|             if parsed_entry: | ||||
|                 errors, delivery = extract_delivery(msg_details, parsed_entry) | ||||
|                 if not errors and delivery: | ||||
|                     if 'all' in debug or parsed_entry.get('comp') in debug: | ||||
|                         print('DELIVERY:') | ||||
|                         pprint(delivery) | ||||
|                     # it may happen that a delivery of type 'from' has | ||||
|                     # a recipient; in this case add a second delivery | ||||
|                     # of type 'to' to the cache, but only for deliveries | ||||
|                     # with queue_id | ||||
|                     if ( | ||||
|                         delivery['type'] == 'from' | ||||
|                         and 'recipient' in delivery | ||||
|                         and delivery.get('queue_id') | ||||
|                     ): | ||||
|                         delivery2 = delivery.copy() | ||||
|                         delivery2['type'] = 'to' | ||||
|                         cache.append(delivery2) | ||||
|                         del delivery['recipient'] | ||||
|                     cache.append(delivery) | ||||
|                     msg_count -= 1 | ||||
|                     if msg_count == 0: | ||||
|                         commit = True | ||||
|                 elif errors: | ||||
|                     msg = ( | ||||
|                         f'Extracting delivery from parsed entry failed: ' | ||||
|                         f'errors={errors}; msg_details={msg_details}; ' | ||||
|                         f'parsed_entry={parsed_entry}' | ||||
|                     ) | ||||
|                     journal.send(msg, PRIORITY=journal.LOG_CRIT) | ||||
|                     if 'all' in debug or parsed_entry.get('comp') in debug: | ||||
|                         print('EXTRACTION ERRORS:', errors) | ||||
|         if commit: | ||||
|             if 'all' in debug: | ||||
|                 print('.' * 40, 'committing') | ||||
|             # store cache, clear cache, reset message counter | ||||
|             store_delivery_items(curs, cache, debug=debug) | ||||
|             cache = [] | ||||
|             msg_count = settings.max_messages_per_commit | ||||
|         now = datetime.datetime.utcnow() | ||||
|         if last_delete is None or last_delete < now - settings.delete_interval: | ||||
|             delete_old_deliveries(curs) | ||||
|             last_delete = now | ||||
|             if 'all' in debug: | ||||
|                 print('.' * 40, 'deleting old deliveries') | ||||
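|     # for-else: iterable exhausted (logfile import); store the remaining cache | ||||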
|     else: | ||||
|         store_delivery_items(curs, cache, debug=debug) | ||||
| 
 | ||||
| 
 | ||||
| def main() -> None: | ||||
|     parser = argparse.ArgumentParser() | ||||
|     parser.add_argument( | ||||
|         '--debug', | ||||
|         help='Comma-separated list of components to be debugged; ' | ||||
|         'valid component names are the Postfix components ' | ||||
|         'plus "sql" plus "all".', | ||||
|     ) | ||||
|     parser.add_argument( | ||||
|         '--file', | ||||
|         help='File path of a Postfix logfile in syslog ' | ||||
|         'format to be parsed instead of the journal', | ||||
|     ) | ||||
|     parser.add_argument( | ||||
|         '--year', | ||||
|         help='If --file is given, we need to know ' | ||||
|         'the year of the first line in the logfile', | ||||
|     ) | ||||
|     args = parser.parse_args() | ||||
| 
 | ||||
|     config = settings.get_config() | ||||
|     if config: | ||||
|         # check if startup is enabled or fail | ||||
|         msg = None | ||||
|         if 'startup' not in config: | ||||
|             msg = 'Parameter "startup" is not configured.' | ||||
|         elif not config['startup']: | ||||
|             msg = 'Startup is not enabled in the config file.' | ||||
|         if msg: | ||||
|             journal.send(msg, PRIORITY=journal.LOG_CRIT) | ||||
|             sys.exit(exit_code_without_restart) | ||||
|         # check more params and call run | ||||
|         try: | ||||
|             verp_marker = config['postfix']['verp_marker'] | ||||
|         except Exception: | ||||
|             verp_marker = None | ||||
|         debug: List[str] = [] | ||||
|         if args.debug: | ||||
|             debug = args.debug.split(',') | ||||
|         filepath = None | ||||
|         year = None | ||||
|         if args.file: | ||||
|             filepath = args.file | ||||
|             if not args.year: | ||||
|                 print( | ||||
|                     'If --file is given, we need to know the year' | ||||
|                     ' of the first line in the logfile. Please use --year.' | ||||
|                 ) | ||||
|                 sys.exit(1) | ||||
|             else: | ||||
|                 year = int(args.year) | ||||
|         dsn = init_db(config) | ||||
|         if dsn: | ||||
|             run( | ||||
|                 dsn, | ||||
|                 verp_marker=verp_marker, | ||||
|                 filepath=filepath, | ||||
|                 year=year, | ||||
|                 debug=debug, | ||||
|             ) | ||||
|     else: | ||||
|         print('Config invalid, see journal.') | ||||
|         sys.exit(exit_code_without_restart) | ||||
| 
 | ||||
| 
 | ||||
| if __name__ == '__main__': | ||||
|     main() | ||||
							
								
								
									
journal-postfix/files/srv/settings.py (new executable file, 125 lines)
@@ -0,0 +1,125 @@
| #!/usr/bin/env python3 | ||||
| 
 | ||||
| """ | ||||
| Settings for journal-postfix. | ||||
| """ | ||||
| 
 | ||||
| import os | ||||
| import datetime | ||||
| from typing import Union, Optional | ||||
| from systemd import journal | ||||
| from yaml import safe_load | ||||
| 
 | ||||
| 
 | ||||
| main_config_file: str = '/etc/journal-postfix/main.yml' | ||||
| """ | ||||
| Filepath to the main config file. | ||||
| 
 | ||||
| Can be overridden by environment variable JOURNAL_POSTFIX_MAIN_CONF. | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| systemd_unitname: str = 'postfix@-.service' | ||||
| """ | ||||
| Name of the systemd unit running the postfix service. | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| journal_poll_interval: Union[float, int] = 10.0 | ||||
| """ | ||||
| Poll timeout in seconds for fetching messages from the journal. | ||||
| 
 | ||||
| Will be overridden if set in the main config. | ||||
| 
 | ||||
| If the poll times out, it is checked whether the last commit | ||||
| lies more than max_delay_before_commit seconds in the past; | ||||
| if so, the current database transaction will be committed. | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| max_delay_before_commit: datetime.timedelta = datetime.timedelta(seconds=30) | ||||
| """ | ||||
| How much time may pass before committing a database transaction? | ||||
| 
 | ||||
| Will be overridden if set in the main config. | ||||
| 
 | ||||
| (The actual maximal delay can be one journal_poll_interval in addition.) | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| max_messages_per_commit: int = 1000 | ||||
| """ | ||||
| How many messages to cache at most before committing a database transaction? | ||||
| 
 | ||||
| Will be overridden if set in the main config. | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| delete_deliveries_after_days: int = 0 | ||||
| """ | ||||
| After how many days shall deliveries be deleted from the database? | ||||
| 
| A value of 0 means that data are never deleted. | ||||
| 
| Will be overridden if set in the main config. | ||||
| """ | ||||
| 
| 
| delete_interval: datetime.timedelta = datetime.timedelta(seconds=3600) | ||||
| """ | ||||
| Time interval between two checks for deletion of old deliveries. | ||||
| 
| Will be overridden if set in the main config. This module-level default | ||||
| matches the value in the shipped config template; without it, run.py | ||||
| would fail if the main config did not set delete_interval. | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| def get_config() -> Optional[dict]: | ||||
|     """ | ||||
|     Load config from the main config and return it. | ||||
| 
 | ||||
|     The default main config file path (global main_config_file) | ||||
|     can be overridden with environment variable | ||||
|     JOURNAL_POSTFIX_MAIN_CONF. | ||||
|     """ | ||||
|     try: | ||||
|         filename = os.environ['JOURNAL_POSTFIX_MAIN_CONF'] | ||||
|         global main_config_file | ||||
|         main_config_file = filename | ||||
|     except Exception: | ||||
|         filename = main_config_file | ||||
|     try: | ||||
|         with open(filename, 'r') as config_file: | ||||
|             config_raw = config_file.read() | ||||
|     except Exception: | ||||
|         msg = f'ERROR: cannot read config file {filename}' | ||||
|         journal.send(msg, PRIORITY=journal.LOG_CRIT) | ||||
|         return None | ||||
|     try: | ||||
|         config = safe_load(config_raw) | ||||
|     except Exception as err: | ||||
|         msg = f'ERROR: invalid yaml syntax in {filename}: {err}' | ||||
|         journal.send(msg, PRIORITY=journal.LOG_CRIT) | ||||
|         return None | ||||
|     # override some global variables | ||||
|     _global_value_from_config(config.get('postfix') or {}, 'systemd_unitname', str) | ||||
|     _global_value_from_config(config, 'journal_poll_interval', float) | ||||
|     _global_value_from_config(config, 'max_delay_before_commit', 'seconds') | ||||
|     _global_value_from_config(config, 'max_messages_per_commit', int) | ||||
|     _global_value_from_config(config, 'delete_deliveries_after_days', int) | ||||
|     _global_value_from_config(config, 'delete_interval', 'seconds') | ||||
|     return config | ||||
| 
 | ||||
| 
 | ||||
| def _global_value_from_config( | ||||
|     config, name: str, type_: Union[type, str] | ||||
| ) -> None: | ||||
|     """ | ||||
|     Set a global variable to the value obtained from *config*. | ||||
| 
 | ||||
|     Also cast to *type_*. | ||||
|     """ | ||||
|     value = None | ||||
|     try: | ||||
|         value = config.get(name) | ||||
|         if value is None: | ||||
|             return  # parameter not configured; keep the module default | ||||
|         if type_ == 'seconds': | ||||
|             value = datetime.timedelta(seconds=float(value)) | ||||
|         else: | ||||
|             value = type_(value)  # type: ignore | ||||
|         globals()[name] = value | ||||
|     except Exception: | ||||
|         msg = f'ERROR: configured value of {name} is invalid.' | ||||
|         journal.send(msg, PRIORITY=journal.LOG_ERR) | ||||
| 
 | ||||
| 
 | ||||
| if __name__ == '__main__': | ||||
|     print(get_config()) | ||||
							
								
								
									
journal-postfix/files/srv/setup.cfg (new file, 5 lines)
@@ -0,0 +1,5 @@
| [pycodestyle] | ||||
| max-line-length = 200 | ||||
| 
 | ||||
| [mypy] | ||||
| ignore_missing_imports = True | ||||
							
								
								
									
journal-postfix/files/srv/sources.py (new executable file, 178 lines)
@@ -0,0 +1,178 @@
| #!/usr/bin/env python3 | ||||
| 
 | ||||
| """ | ||||
| Data sources. | ||||
| 
 | ||||
| Note: python-systemd journal docs are at | ||||
| https://www.freedesktop.org/software/systemd/python-systemd/journal.html | ||||
| """ | ||||
| 
 | ||||
| import datetime | ||||
| import select | ||||
| from typing import Iterable, Optional, Tuple, Union | ||||
| from systemd import journal | ||||
| import settings | ||||
| 
 | ||||
| 
 | ||||
| def iter_journal_messages_since( | ||||
|     timestamp: Union[int, float] | ||||
| ) -> Iterable[Tuple[bool, dict]]: | ||||
|     """ | ||||
|     Yield False and message details from the journal since *timestamp*. | ||||
| 
 | ||||
|     This is the loading phase (loading messages that already existed | ||||
|     when we start). | ||||
| 
 | ||||
|     Argument *timestamp* is a UNIX timestamp. | ||||
| 
 | ||||
|     Only journal entries for systemd unit settings.systemd_unitname with | ||||
|     loglevel INFO and above are retrieved. | ||||
|     """ | ||||
|     timestamp = float(timestamp) | ||||
|     sdj = journal.Reader() | ||||
|     sdj.log_level(journal.LOG_INFO) | ||||
|     sdj.add_match(_SYSTEMD_UNIT=settings.systemd_unitname) | ||||
|     sdj.seek_realtime(timestamp) | ||||
|     for entry in sdj: | ||||
|         yield False, _get_msg_details(entry) | ||||
| 
 | ||||
| 
 | ||||
| def iter_journal_messages_follow( | ||||
|     timestamp: Union[int, float] | ||||
| ) -> Iterable[Tuple[bool, Optional[dict]]]: | ||||
|     """ | ||||
|     Yield commit and message details from the journal through polling. | ||||
| 
 | ||||
|     This is the polling phase (after we have read pre-existing messages | ||||
|     in the loading phase). | ||||
| 
 | ||||
|     Argument *timestamp* is a UNIX timestamp. | ||||
| 
 | ||||
|     Only journal entries for systemd unit settings.systemd_unitname with | ||||
|     loglevel INFO and above are retrieved. | ||||
| 
 | ||||
|     *commit* (bool) tells whether it is time to store the delivery | ||||
|     information obtained from the messages yielded by us. | ||||
|     It is set to True if settings.max_delay_before_commit has elapsed. | ||||
|     After this delay delivery information will be written; to be exact: | ||||
|     the delay may increase by up to one settings.journal_poll_interval. | ||||
|     """ | ||||
|     sdj = journal.Reader() | ||||
|     sdj.log_level(journal.LOG_INFO) | ||||
|     sdj.add_match(_SYSTEMD_UNIT=settings.systemd_unitname) | ||||
|     sdj.seek_realtime(timestamp) | ||||
|     p = select.poll() | ||||
|     p.register(sdj, sdj.get_events()) | ||||
|     last_commit = datetime.datetime.utcnow() | ||||
|     interval_ms = settings.journal_poll_interval * 1000 | ||||
|     while True: | ||||
|         p.poll(interval_ms) | ||||
|         commit = False | ||||
|         now = datetime.datetime.utcnow() | ||||
|         if last_commit + settings.max_delay_before_commit < now: | ||||
|             commit = True | ||||
|             last_commit = now | ||||
|         if sdj.process() == journal.APPEND: | ||||
|             for entry in sdj: | ||||
|                 yield commit, _get_msg_details(entry) | ||||
|         elif commit: | ||||
|             yield commit, None | ||||
| 
 | ||||
| 
 | ||||
| def iter_logfile_messages( | ||||
|     filepath: str, | ||||
|     year: int, | ||||
|     commit_after_lines=settings.max_messages_per_commit, | ||||
| ) -> Iterable[Tuple[bool, dict]]: | ||||
|     """ | ||||
|     Yield a commit flag and a message dict from a logfile. | ||||
| 
 | ||||
|     Loop through all lines of the file with given *filepath* and | ||||
|     extract the time and log message. If the log message starts | ||||
|     with 'postfix/', then extract the syslog_identifier, pid and | ||||
|     message text. | ||||
| 
 | ||||
|     Since syslog lines do not contain the year, the *year* to which | ||||
|     the first log line belongs must be given. | ||||
| 
 | ||||
|     Yield a commit flag and a dict with these keys: | ||||
|         't': timestamp | ||||
|         'message': message text | ||||
|         'identifier': syslog identifier (e.g., 'postfix/smtpd') | ||||
|         'pid': process id | ||||
| 
 | ||||
|     The commit flag will be set to True for every | ||||
|     (commit_after_lines)-th filtered message and serves | ||||
|     as a signal to the caller to commit this chunk of data | ||||
|     to the database. | ||||
|     """ | ||||
|     dt = None | ||||
|     with open(filepath, 'r') as fh: | ||||
|         cnt = 0 | ||||
|         while True: | ||||
|             line = fh.readline() | ||||
|             if not line: | ||||
|                 break | ||||
| 
 | ||||
|             # get datetime | ||||
|             timestamp = line[:15] | ||||
|             dt_prev = dt | ||||
|             dt = _parse_logfile_timestamp(timestamp, year) | ||||
|             if dt is None: | ||||
|                 continue  # discard log message with invalid timestamp | ||||
| 
 | ||||
|             # if we transgress a year boundary, then increment the year | ||||
|             if dt_prev and dt + datetime.timedelta(days=1) < dt_prev: | ||||
|                 year += 1 | ||||
|                 dt = _parse_logfile_timestamp(timestamp, year) | ||||
| 
 | ||||
|             # filter postfix messages | ||||
|             msg = line[21:].strip() | ||||
|             if 'postfix/' in msg: | ||||
|                 cnt += 1 | ||||
|                 syslog_identifier, msg_ = msg.split('[', 1) | ||||
|                 pid, msg__ = msg_.split(']', 1) | ||||
|                 message = msg__[2:] | ||||
|                 commit = cnt % commit_after_lines == 0 | ||||
|                 yield commit, { | ||||
|                     't': dt, | ||||
|                     'message': message, | ||||
|                     'identifier': syslog_identifier, | ||||
|                     'pid': pid, | ||||
|                 } | ||||
| 
 | ||||
| 
 | ||||
| def _get_msg_details(journal_entry: dict) -> dict: | ||||
|     """ | ||||
|     Return information extracted from a journal entry object as a dict. | ||||
|     """ | ||||
|     return { | ||||
|         't': journal_entry['__REALTIME_TIMESTAMP'], | ||||
|         'message': journal_entry['MESSAGE'], | ||||
|         'identifier': journal_entry.get('SYSLOG_IDENTIFIER'), | ||||
|         'pid': journal_entry.get('SYSLOG_PID'), | ||||
|     } | ||||
| 
 | ||||
| 
 | ||||
| def _parse_logfile_timestamp( | ||||
|     timestamp: Optional[str], | ||||
|     year: int | ||||
| ) -> Optional[datetime.datetime]: | ||||
|     """ | ||||
|     Parse a given syslog *timestamp* and return a datetime. | ||||
| 
 | ||||
|     Since the timestamp does not contain the year, it is an | ||||
|     extra argument. | ||||
| 
 | ||||
|     Note: Successful parsing of the month's name depends on | ||||
|     the locale under which this script runs. | ||||
|     """ | ||||
|     if timestamp is None: | ||||
|         return None | ||||
|     try: | ||||
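|         # syslog pads single-digit days with a double space ('Jan  5'); normalize | ||||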
|         timestamp = timestamp.replace('  ', ' ') | ||||
|         t1 = datetime.datetime.strptime(timestamp, '%b %d %H:%M:%S') | ||||
|         t2 = t1.replace(year=year) | ||||
|         return t2 | ||||
|     except Exception: | ||||
|         return None | ||||
							
								
								
									
journal-postfix/files/srv/storage.py (new executable file, 337 lines)
@@ -0,0 +1,337 @@
| #!/usr/bin/env python3 | ||||
| 
 | ||||
| """ | ||||
| Storage to PostgreSQL. | ||||
| """ | ||||
| 
 | ||||
| import datetime | ||||
| import json | ||||
| import time | ||||
| from collections import defaultdict | ||||
| from typing import Any, Dict, Iterable, List, Optional, Tuple, Union | ||||
| import psycopg2 | ||||
| import psycopg2.extras | ||||
| from systemd import journal | ||||
| import settings | ||||
| from storage_setup import ( | ||||
|     get_create_table_stmts, | ||||
|     get_sql_prepared_statement, | ||||
|     get_sql_execute_prepared_statement, | ||||
|     table_fields, | ||||
| ) | ||||
| 
 | ||||
| 
 | ||||
| def get_latest_timestamp(curs: psycopg2.extras.RealDictCursor) -> float: | ||||
|     """ | ||||
|     Fetch the latest timestamp from the database. | ||||
| 
 | ||||
|     Return the latest timestamp of a message transfer from the database. | ||||
|     If there are no records yet, return 0. | ||||
|     """ | ||||
|     last = 0 | ||||
|     curs.execute( | ||||
|         "SELECT greatest(max(t_i), max(t_f)) AS last FROM delivery_from" | ||||
|     ) | ||||
|     last1 = curs.fetchone()['last'] | ||||
|     if last1: | ||||
|         last = max( | ||||
|             last, (last1 - datetime.datetime(1970, 1, 1)).total_seconds() | ||||
|         ) | ||||
|     curs.execute( | ||||
|         "SELECT greatest(max(t_i), max(t_f)) AS last FROM delivery_to" | ||||
|     ) | ||||
|     last2 = curs.fetchone()['last'] | ||||
|     if last2: | ||||
|         last = max( | ||||
|             last, (last2 - datetime.datetime(1970, 1, 1)).total_seconds() | ||||
|         ) | ||||
|     return last | ||||
| 
 | ||||
| 
 | ||||
| def delete_old_deliveries(curs: psycopg2.extras.RealDictCursor) -> None: | ||||
|     """ | ||||
|     Delete deliveries older than the configured number of days. | ||||
| 
 | ||||
|     See config param *delete_deliveries_after_days*. | ||||
|     """ | ||||
|     max_days = settings.delete_deliveries_after_days | ||||
|     if max_days: | ||||
|         now = datetime.datetime.utcnow() | ||||
|         dt = datetime.timedelta(days=max_days) | ||||
|         t0 = now - dt | ||||
|         curs.execute("DELETE FROM delivery_from WHERE t_i < %s", (t0,)) | ||||
|         curs.execute("DELETE FROM delivery_to WHERE t_i < %s", (t0,)) | ||||
|         curs.execute("DELETE FROM noqueue WHERE t < %s", (t0,)) | ||||
| 
 | ||||
| 
 | ||||
| def store_delivery_items( | ||||
|     cursor, | ||||
|     cache: List[dict], | ||||
|     debug: List[str] = [] | ||||
| ) -> None: | ||||
|     """ | ||||
|     Store cached delivery items into the database. | ||||
| 
 | ||||
|     Find queue_ids in *cache* and group delivery items by | ||||
|     them, but separately for delivery types 'from' and 'to'. | ||||
|     In addition, collect delivery items with queue_id is None. | ||||
| 
 | ||||
|     After grouping, we merge all items within a group into a | ||||
|     single item. So we can combine several SQL queries into | ||||
|     a single one, which improves performance significantly. | ||||
| 
 | ||||
|     Then store the merged items and the deliveries with | ||||
|     queue_id is None. | ||||
|     """ | ||||
|     if 'all' in debug or 'sql' in debug: | ||||
|         print(f'Storing {len(cache)} messages.') | ||||
|     if not cache: | ||||
|         return | ||||
|     from_items, to_items, noqueue_items = _group_delivery_items(cache) | ||||
|     deliveries_from = _merge_delivery_items(from_items, item_type='from') | ||||
|     deliveries_to = _merge_delivery_items(to_items, item_type='to') | ||||
|     _store_deliveries(cursor, 'delivery_from', deliveries_from, debug=debug) | ||||
|     _store_deliveries(cursor, 'delivery_to', deliveries_to, debug=debug) | ||||
|     _store_deliveries(cursor, 'noqueue', noqueue_items, debug=debug) | ||||
| 
 | ||||
| 
 | ||||
| FromItems = Dict[str, List[dict]] | ||||
| 
 | ||||
| 
 | ||||
| ToItems = Dict[Tuple[str, Optional[str]], List[dict]] | ||||
| 
 | ||||
| 
 | ||||
| NoqueueItems = Dict[int, dict] | ||||
| 
 | ||||
| 
 | ||||
| def _group_delivery_items( | ||||
|     cache: List[dict] | ||||
| ) -> Tuple[FromItems, ToItems, NoqueueItems]: | ||||
|     """ | ||||
|     Group delivery items by type and queue_id. | ||||
| 
 | ||||
|     Return items of type 'from', of type 'to' and items without | ||||
|     queue_id. | ||||
|     """ | ||||
|     delivery_from_items: FromItems = defaultdict(list) | ||||
|     delivery_to_items: ToItems = defaultdict(list) | ||||
|     noqueue_items: NoqueueItems = {} | ||||
|     noqueue_i = 1 | ||||
|     for item in cache: | ||||
|         if item.get('queue_id'): | ||||
|             queue_id = item['queue_id'] | ||||
|             if item.get('type') == 'from': | ||||
|                 delivery_from_items[queue_id].append(item) | ||||
|             else: | ||||
|                 recipient = item.get('recipient') | ||||
|                 delivery_to_items[(queue_id, recipient)].append(item) | ||||
|         else: | ||||
|             noqueue_items[noqueue_i] = item | ||||
|             noqueue_i += 1 | ||||
|     return delivery_from_items, delivery_to_items, noqueue_items | ||||
| 
 | ||||
| 
 | ||||
| def _merge_delivery_items( | ||||
|     delivery_items: Union[FromItems, ToItems], | ||||
|     item_type: str = 'from', | ||||
| ) -> Dict[Union[str, Tuple[str, Optional[str]]], dict]: | ||||
|     """ | ||||
|     Compute deliveries by combining multiple delivery items. | ||||
| 
 | ||||
|     Take lists of delivery items for each queue_id (in case | ||||
|     of item_type=='from') or for (queue_id, recipient)-pairs | ||||
|     (in case of item_type='to'). | ||||
|     Each delivery item is a dict obtained from one log message. | ||||
|     The dicts are consecutively updated (merged), except for the | ||||
|     raw log messages (texts) which are collected into a list. | ||||
|     The fields of the resulting delivery are filtered according | ||||
|     to the target table. | ||||
|     Returned is a dict mapping queue_ids (in case | ||||
|     of item_type=='from') or (queue_id, recipient)-pairs | ||||
|     (in case of item_type='to') to deliveries. | ||||
|     """ | ||||
|     deliveries = {} | ||||
|     for group, items in delivery_items.items(): | ||||
|         delivery = {} | ||||
|         messages = [] | ||||
|         for item in items: | ||||
|             message = item.pop('message') | ||||
|             identifier = item.pop('identifier') | ||||
|             pid = item.pop('pid') | ||||
|             messages.append(f'{identifier}[{pid}]: {message}') | ||||
|             delivery.update(item) | ||||
|         delivery['messages'] = messages | ||||
|         deliveries[group] = delivery | ||||
|     return deliveries | ||||
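| 
| # Hypothetical sketch (made-up values) of how two 'from' items for one | ||||
| # queue_id are merged: 'message', 'identifier' and 'pid' are folded into | ||||
| # the 'messages' list; all remaining fields are update()d into one dict. | ||||
| # | ||||
| #   items = {'4DfGxY1hTqz6': [ | ||||
| #       {'type': 'from', 'queue_id': '4DfGxY1hTqz6', | ||||
| #        'sender': 'alice@example.org', | ||||
| #        'message': 'client=mail.example.org[192.0.2.1]', | ||||
| #        'identifier': 'postfix/smtpd', 'pid': '11'}, | ||||
| #       {'type': 'from', 'queue_id': '4DfGxY1hTqz6', 'size': 1234, | ||||
| #        'message': 'from=<alice@example.org>, size=1234, nrcpt=1', | ||||
| #        'identifier': 'postfix/qmgr', 'pid': '12'}, | ||||
| #   ]} | ||||
| #   _merge_delivery_items(items)['4DfGxY1hTqz6'] == { | ||||
| #       'type': 'from', 'queue_id': '4DfGxY1hTqz6', | ||||
| #       'sender': 'alice@example.org', 'size': 1234, | ||||
| #       'messages': [ | ||||
| #           'postfix/smtpd[11]: client=mail.example.org[192.0.2.1]', | ||||
| #           'postfix/qmgr[12]: from=<alice@example.org>, size=1234, nrcpt=1', | ||||
| #       ]} | ||||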
| 
 | ||||
| 
 | ||||
| def _store_deliveries( | ||||
|     cursor: psycopg2.extras.RealDictCursor, | ||||
|     table_name: str, | ||||
|     deliveries: Dict[Any, dict], | ||||
|     debug: List[str] = [], | ||||
| ) -> None: | ||||
|     """ | ||||
|     Store grouped and merged delivery items. | ||||
|     """ | ||||
|     if not deliveries: | ||||
|         return | ||||
|     n = len(deliveries.values()) | ||||
|     t0 = time.time() | ||||
|     cursor.execute('BEGIN') | ||||
|     _store_deliveries_batch(cursor, table_name, deliveries.values()) | ||||
|     cursor.execute('COMMIT') | ||||
|     t1 = time.time() | ||||
|     if 'all' in debug or 'sql' in debug: | ||||
|         milliseconds = (t1 - t0) * 1000 | ||||
|         print( | ||||
|             '*' * 10, | ||||
|             f'SQL transaction time {table_name}: ' | ||||
|             f'{milliseconds:.2f} ms ({n} deliveries)', | ||||
|         ) | ||||
| 
 | ||||
| 
 | ||||
| def _store_deliveries_batch( | ||||
|     cursor: psycopg2.extras.RealDictCursor, | ||||
|     table_name: str, | ||||
|     deliveries: Iterable[dict] | ||||
| ) -> None: | ||||
|     """ | ||||
|     Store *deliveries* (i.e., grouped and merged delivery items). | ||||
| 
 | ||||
|     We use a prepared statement and execute_batch() from | ||||
|     psycopg2.extras to improve performance. | ||||
|     """ | ||||
|     rows = [] | ||||
|     for delivery in deliveries: | ||||
|         # get values for all fields of the table | ||||
|         field_values: List[Any] = [] | ||||
|         t = delivery.get('t') | ||||
|         delivery['t_i'] = t | ||||
|         delivery['t_f'] = t | ||||
|         for field in table_fields[table_name]: | ||||
|             if field in delivery: | ||||
|                 if field == 'messages': | ||||
|                     field_values.append(json.dumps(delivery[field])) | ||||
|                 else: | ||||
|                     field_values.append(delivery[field]) | ||||
|             else: | ||||
|                 field_values.append(None) | ||||
|         rows.append(field_values) | ||||
|     sql = get_sql_execute_prepared_statement(table_name) | ||||
|     try: | ||||
|         psycopg2.extras.execute_batch(cursor, sql, rows) | ||||
|     except Exception as err: | ||||
|         msg = f'SQL statement failed ({err}): "{sql}" -- the values were: {rows}' | ||||
|         journal.send(msg, PRIORITY=journal.LOG_ERR) | ||||
| 
 | ||||
| 
 | ||||
| def init_db(config: dict) -> Optional[str]: | ||||
|     """ | ||||
|     Initialize database; if ok return DSN, else None. | ||||
| 
 | ||||
|     Try to get parameters for database access, | ||||
|     check existence of tables and possibly create them. | ||||
|     """ | ||||
|     dsn = _get_dsn(config) | ||||
|     if dsn: | ||||
|         ok = _create_tables(dsn) | ||||
|         if not ok: | ||||
|             return None | ||||
|     return dsn | ||||
| 
 | ||||
| 
 | ||||
| def _get_dsn(config: dict) -> Optional[str]: | ||||
|     """ | ||||
|     Return the DSN (data source name) from the *config*. | ||||
|     """ | ||||
|     try: | ||||
|         postgresql_config = config['postgresql'] | ||||
|         hostname = postgresql_config['hostname'] | ||||
|         port = postgresql_config['port'] | ||||
|         database = postgresql_config['database'] | ||||
|         username = postgresql_config['username'] | ||||
|         password = postgresql_config['password'] | ||||
|     except Exception: | ||||
|         msg = f"""ERROR: invalid config in {settings.main_config_file} | ||||
| The config file must contain a section like this: | ||||
| 
 | ||||
| postgresql: | ||||
|     hostname: <HOSTNAME_OR_IP> | ||||
|     port: <PORT> | ||||
|     database: <DATABASE_NAME> | ||||
|     username: <USERNAME> | ||||
|     password: <PASSWORD> | ||||
| """ | ||||
|         journal.send(msg, PRIORITY=journal.LOG_CRIT) | ||||
|         return None | ||||
|     dsn = f'host={hostname} port={port} dbname={database} '\ | ||||
|           f'user={username} password={password}' | ||||
|     return dsn | ||||
| 
 | ||||
| 
 | ||||
| def _create_tables(dsn: str) -> bool: | ||||
|     """ | ||||
|     Check existence of tables and possibly create them, returning success. | ||||
|     """ | ||||
|     try: | ||||
|         with psycopg2.connect(dsn) as conn: | ||||
|             with conn.cursor() as curs: | ||||
|                 for table_name, sql_stmts in get_create_table_stmts().items(): | ||||
|                     ok = _create_table(curs, table_name, sql_stmts) | ||||
|                     if not ok: | ||||
|                         return False | ||||
|     except Exception: | ||||
|         journal.send( | ||||
|             f'ERROR: cannot connect to database, check params' | ||||
|             f' in {settings.main_config_file}', | ||||
|             PRIORITY=journal.LOG_CRIT, | ||||
|         ) | ||||
|         return False | ||||
|     return True | ||||
| 
 | ||||
| 
 | ||||
| def _create_table( | ||||
|     cursor: psycopg2.extensions.cursor, | ||||
|     table_name: str, | ||||
|     sql_stmts: List[str] | ||||
| ) -> bool: | ||||
|     """ | ||||
|     Try to create a table if it does not exist and return whether it exists. | ||||
| 
 | ||||
|     If creation failed, emit an error to the journal. | ||||
|     """ | ||||
|     cursor.execute("SELECT EXISTS(SELECT * FROM " | ||||
|                    "information_schema.tables WHERE table_name=%s)", | ||||
|                    (table_name,)) | ||||
|     table_exists = cursor.fetchone()[0] | ||||
|     if not table_exists: | ||||
|         for sql_stmt in sql_stmts: | ||||
|             try: | ||||
|                 cursor.execute(sql_stmt) | ||||
|             except Exception: | ||||
|                 journal.send( | ||||
|                     'ERROR: database user needs privilege to create tables.\n' | ||||
|                     'Alternatively, you can create the table manually like' | ||||
|                     ' this:\n\n' | ||||
|                     + '\n'.join([sql + ';' for sql in sql_stmts]), | ||||
|                     PRIORITY=journal.LOG_CRIT, | ||||
|                 ) | ||||
|                 return False | ||||
|     return True | ||||
| 
 | ||||
| 
 | ||||
| def init_session(cursor: psycopg2.extras.RealDictCursor) -> None: | ||||
|     """ | ||||
|     Init a database session. | ||||
| 
 | ||||
|     Define prepared statements. | ||||
|     """ | ||||
|     stmt = get_sql_prepared_statement('delivery_from') | ||||
|     cursor.execute(stmt) | ||||
|     stmt = get_sql_prepared_statement('delivery_to') | ||||
|     cursor.execute(stmt) | ||||
|     stmt = get_sql_prepared_statement('noqueue') | ||||
|     cursor.execute(stmt) | ||||
							
								
								
									
journal-postfix/files/srv/storage_setup.py (new executable file, 210 lines)
@@ -0,0 +1,210 @@
| #!/usr/bin/env python3 | ||||
| 
 | ||||
| """ | ||||
| Database table definitions and prepared statements. | ||||
| 
 | ||||
| Note: (short) postfix queue IDs are not unique: | ||||
| http://postfix.1071664.n5.nabble.com/Queue-ID-gets-reused-Not-unique-td25387.html | ||||
| """ | ||||
| 
 | ||||
| from typing import Dict, List | ||||
| 
 | ||||
| 
 | ||||
| _table_def_delivery_from = [ | ||||
|     [ | ||||
|         dict(name='t_i', dtype='TIMESTAMP'), | ||||
|         dict(name='t_f', dtype='TIMESTAMP'), | ||||
|         dict(name='queue_id', dtype='VARCHAR(16)', null=False, extra='UNIQUE'), | ||||
|         dict(name='host', dtype='VARCHAR(200)'), | ||||
|         dict(name='ip', dtype='VARCHAR(50)'), | ||||
|         dict(name='sasl_username', dtype='VARCHAR(300)'), | ||||
|         dict(name='orig_queue_id', dtype='VARCHAR(16)'), | ||||
|         dict(name='status', dtype='VARCHAR(10)'), | ||||
|         dict(name='accepted', dtype='BOOL', null=False, default='TRUE'), | ||||
|         dict(name='done', dtype='BOOL', null=False, default='FALSE'), | ||||
|         dict(name='sender', dtype='VARCHAR(300)'), | ||||
|         dict(name='message_id', dtype='VARCHAR(1000)'), | ||||
|         dict(name='resent_message_id', dtype='VARCHAR(1000)'), | ||||
|         dict(name='subject', dtype='VARCHAR(1000)'), | ||||
|         dict(name='phase', dtype='VARCHAR(15)'), | ||||
|         dict(name='error', dtype='VARCHAR(1000)'), | ||||
|         dict(name='size', dtype='INT'), | ||||
|         dict(name='nrcpt', dtype='INT'), | ||||
|         dict(name='verp_id', dtype='INT'), | ||||
|         dict(name='messages', dtype='JSONB', null=False, default="'{}'::JSONB"), | ||||
|     ], | ||||
|     "CREATE INDEX delivery_from__queue_id ON delivery_from (queue_id)", | ||||
|     "CREATE INDEX delivery_from__t_i ON delivery_from (t_i)", | ||||
|     "CREATE INDEX delivery_from__t_f ON delivery_from (t_f)", | ||||
|     "CREATE INDEX delivery_from__sender ON delivery_from (sender)", | ||||
|     "CREATE INDEX delivery_from__message_id ON delivery_from (message_id)", | ||||
| ] | ||||
| 
 | ||||
| 
 | ||||
| _table_def_delivery_to = [ | ||||
|     [ | ||||
|         dict(name='t_i', dtype='TIMESTAMP'), | ||||
|         dict(name='t_f', dtype='TIMESTAMP'), | ||||
|         dict(name='queue_id', dtype='VARCHAR(16)', null=False), | ||||
|         dict(name='recipient', dtype='VARCHAR(300)'), | ||||
|         dict(name='orig_recipient', dtype='VARCHAR(300)'), | ||||
|         dict(name='host', dtype='VARCHAR(200)'), | ||||
|         dict(name='ip', dtype='VARCHAR(50)'), | ||||
|         dict(name='port', dtype='VARCHAR(10)'), | ||||
|         dict(name='relay', dtype='VARCHAR(10)'), | ||||
|         dict(name='delay', dtype='VARCHAR(200)'), | ||||
|         dict(name='delays', dtype='VARCHAR(200)'), | ||||
|         dict(name='dsn', dtype='VARCHAR(10)'), | ||||
|         dict(name='status', dtype='VARCHAR(10)'), | ||||
|         dict(name='status_text', dtype='VARCHAR(1000)'), | ||||
|         dict(name='messages', dtype='JSONB', null=False, default="'{}'::JSONB"), | ||||
|     ], | ||||
|     "ALTER TABLE delivery_to ADD CONSTRAINT" | ||||
|     " delivery_to__queue_id_recipient UNIQUE(queue_id, recipient)", | ||||
|     "CREATE INDEX delivery_to__queue_id ON delivery_to (queue_id)", | ||||
|     "CREATE INDEX delivery_to__recipient ON delivery_to (recipient)", | ||||
|     "CREATE INDEX delivery_to__t_i ON delivery_to (t_i)", | ||||
|     "CREATE INDEX delivery_to__t_f ON delivery_to (t_f)", | ||||
| ] | ||||
| 
 | ||||
| 
 | ||||
| _table_def_noqueue = [ | ||||
|     [ | ||||
|         dict(name='t', dtype='TIMESTAMP'), | ||||
|         dict(name='host', dtype='VARCHAR(200)'), | ||||
|         dict(name='ip', dtype='VARCHAR(50)'), | ||||
|         dict(name='sender', dtype='VARCHAR(300)'), | ||||
|         dict(name='recipient', dtype='VARCHAR(300)'), | ||||
|         dict(name='sasl_username', dtype='VARCHAR(300)'), | ||||
|         dict(name='status', dtype='VARCHAR(10)'), | ||||
|         dict(name='phase', dtype='VARCHAR(15)'), | ||||
|         dict(name='error', dtype='VARCHAR(1000)'), | ||||
|         dict(name='message', dtype='TEXT'), | ||||
|     ], | ||||
|     "CREATE INDEX noqueue__t ON noqueue (t)", | ||||
|     "CREATE INDEX noqueue__sender ON noqueue (sender)", | ||||
|     "CREATE INDEX noqueue__recipient ON noqueue (recipient)", | ||||
| ] | ||||
| 
 | ||||
| 
 | ||||
| _tables: Dict[str, list] = { | ||||
|     'delivery_from': _table_def_delivery_from, | ||||
|     'delivery_to': _table_def_delivery_to, | ||||
|     'noqueue': _table_def_noqueue, | ||||
| } | ||||
| 
 | ||||
| 
 | ||||
| _prepared_statements = { | ||||
|     'delivery_from': | ||||
|         "PREPARE delivery_from_insert ({}) AS " | ||||
|         "INSERT INTO delivery_from ({}) VALUES ({}) " | ||||
|         "ON CONFLICT (queue_id) DO UPDATE SET {}", | ||||
|     'delivery_to': | ||||
|         "PREPARE delivery_to_insert ({}) AS " | ||||
|         "INSERT INTO delivery_to ({}) VALUES ({}) " | ||||
|         "ON CONFLICT (queue_id, recipient) DO UPDATE SET {}", | ||||
|     'noqueue': | ||||
|         "PREPARE noqueue_insert ({}) AS " | ||||
|         "INSERT INTO noqueue ({}) VALUES ({}){}", | ||||
| } | ||||
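| 
| # For illustration, get_sql_prepared_statement('noqueue') produces a | ||||
| # statement of this shape (columns/types abridged; VARCHARs become TEXT): | ||||
| # | ||||
| #   PREPARE noqueue_insert (TIMESTAMP, TEXT, ..., TEXT) AS | ||||
| #   INSERT INTO noqueue (t, host, ..., message) VALUES ($1, $2, ..., $10) | ||||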
| 
 | ||||
| 
 | ||||
| table_fields: Dict[str, List[str]] = {} | ||||
| """ | ||||
| Lists of field names for tables, populated by get_create_table_stmts(). | ||||
| """ | ||||
| 
 | ||||
| 
 | ||||
| def get_sql_prepared_statement(table_name: str) -> str: | ||||
|     """ | ||||
|     Return SQL defining a prepared statement for inserting into a table. | ||||
| 
 | ||||
|     Table 'noqueue' is handled differently, because it does not have | ||||
|     an UPDATE clause. | ||||
|     """ | ||||
|     col_names = [] | ||||
|     col_types = [] | ||||
|     col_args = [] | ||||
|     col_upds = [] | ||||
|     col_i = 0 | ||||
|     for field in _tables[table_name][0]: | ||||
|         # column type | ||||
|         col_type = field['dtype'] | ||||
|         if field['dtype'].lower().startswith('varchar'): | ||||
|             col_type = 'TEXT' | ||||
|         col_types.append(col_type) | ||||
|         # column args | ||||
|         col_i += 1 | ||||
|         col_arg = '$' + str(col_i) | ||||
|         # column name | ||||
|         col_name = field['name'] | ||||
|         col_names.append(col_name) | ||||
|         if 'default' in field: | ||||
|             default = field['default'] | ||||
|             col_args.append(f'COALESCE({col_arg},{default})') | ||||
|         else: | ||||
|             col_args.append(col_arg) | ||||
|         # column update | ||||
|         col_upd = f'{col_name}=COALESCE({col_arg},{table_name}.{col_name})' | ||||
|         if col_name != 't_i': | ||||
|             if col_name == 'messages': | ||||
|                 col_upd = f'{col_name}={table_name}.{col_name}||{col_arg}' | ||||
|             if table_name != 'noqueue': | ||||
|                 col_upds.append(col_upd) | ||||
|     stmt = _prepared_statements[table_name].format( | ||||
|         ','.join(col_types), | ||||
|         ','.join(col_names), | ||||
|         ','.join(col_args), | ||||
|         ','.join(col_upds), | ||||
|     ) | ||||
|     return stmt | ||||
| 
 | ||||
| 
 | ||||
| def get_sql_execute_prepared_statement(table_name: str) -> str: | ||||
|     """ | ||||
|     Return SQL for executing the given table's prepared statement. | ||||
| 
 | ||||
|     The result is based on global variable _tables. | ||||
|     """ | ||||
|     fields = _tables[table_name][0] | ||||
|     return "EXECUTE {}_insert ({})"\ | ||||
|         .format(table_name, ','.join(['%s' for i in range(len(fields))])) | ||||
| 
 | ||||
| 
 | ||||
| def get_create_table_stmts() -> Dict[str, List[str]]: | ||||
|     """ | ||||
|     Return a dict mapping table names to SQL statements creating the tables. | ||||
| 
 | ||||
|     Also populate global variable table_fields. | ||||
|     """ | ||||
|     res = {} | ||||
|     for table_name, table_def in _tables.items(): | ||||
|         stmts = table_def.copy() | ||||
|         stmts[0] = _get_sql_create_stmt(table_name, table_def[0]) | ||||
|         res[table_name] = stmts | ||||
|         field_names = [x['name'] for x in table_def[0]] | ||||
|         global table_fields | ||||
|         table_fields[table_name] = field_names | ||||
|     return res | ||||
| 
 | ||||
| 
 | ||||
| def _get_sql_create_stmt(table_name: str, fields: List[dict]) -> str: | ||||
|     """ | ||||
|     Return the 'CREATE TABLE' SQL statement for a table. | ||||
| 
 | ||||
|     Factor in NULL, DEFAULT and extra DDL text. | ||||
|     """ | ||||
|     sql = f"CREATE TABLE {table_name} (\n    id BIGSERIAL," | ||||
|     col_defs = [] | ||||
|     for field in fields: | ||||
|         col_def = f"    {field['name']} {field['dtype']}" | ||||
|         if 'null' in field and field['null'] is False: | ||||
|             col_def += " NOT NULL" | ||||
|         if 'default' in field: | ||||
|             col_def += f" DEFAULT {field['default']}" | ||||
|         if 'extra' in field: | ||||
|             col_def += f" {field['extra']}" | ||||
|         col_defs.append(col_def) | ||||
|     sql += '\n' + ',\n'.join(col_defs) | ||||
|     sql += '\n)' | ||||
|     return sql | ||||
							
								
								
									
journal-postfix/tasks/main.yml (new file, 90 lines)
@@ -0,0 +1,90 @@
| - name: user journal-postfix | ||||
|   user: | ||||
|     name: journal-postfix | ||||
|     group: systemd-journal | ||||
|     state: present | ||||
|     system: yes | ||||
|     uid: 420 | ||||
|     create_home: no | ||||
|     home: /srv/journal-postfix | ||||
|     password: '!' | ||||
|     password_lock: yes | ||||
|     comment: created by ansible role journal-postfix | ||||
| 
 | ||||
| - name: directories /srv/journal-postfix, /etc/journal-postfix | ||||
|   file: | ||||
|     path: "{{ item }}" | ||||
|     state: directory | ||||
|     owner: journal-postfix | ||||
|     group: systemd-journal | ||||
|     mode: 0755 | ||||
|   loop: | ||||
|     - /srv/journal-postfix | ||||
|     - /etc/journal-postfix | ||||
| 
 | ||||
| - name: install dependencies | ||||
|   apt: | ||||
|     name: python3-psycopg2,python3-systemd,python3-yaml | ||||
|     state: present | ||||
|     update_cache: yes | ||||
|     install_recommends: no | ||||
| 
 | ||||
| - name: files in /srv/journal-postfix | ||||
|   copy: | ||||
|     src: "srv/{{ item }}" | ||||
|     dest: "/srv/journal-postfix/{{ item }}" | ||||
|     owner: journal-postfix | ||||
|     group: systemd-journal | ||||
|     mode: 0644 | ||||
|     force: yes | ||||
|   loop: | ||||
|     - run.py | ||||
|     - settings.py | ||||
|     - sources.py | ||||
|     - parser.py | ||||
|     - storage.py | ||||
|     - storage_setup.py | ||||
|     - README.md | ||||
|     - setup.cfg | ||||
| 
 | ||||
| - name: make some files executable | ||||
|   file: | ||||
|     path: "{{ item }}" | ||||
|     mode: 0755 | ||||
|   loop: | ||||
|     - /srv/journal-postfix/run.py | ||||
|     - /srv/journal-postfix/settings.py | ||||
| 
 | ||||
| - name: determine whether to start up | ||||
|   set_fact: | ||||
|     startup: >- | ||||
|       {{ mailserver.postgresql.host is defined and | ||||
|          mailserver.postgresql.port is defined and | ||||
|          mailserver.postgresql.dbname is defined and | ||||
|          mailserver.postgresql.username is defined and | ||||
|          mailserver.postgresql.password is defined }} | ||||
| 
 | ||||
| - name: file /etc/journal-postfix/main.yml | ||||
|   template: | ||||
|     src: main.yml | ||||
|     dest: /etc/journal-postfix/main.yml | ||||
|     owner: journal-postfix | ||||
|     group: systemd-journal | ||||
|     mode: 0600 | ||||
|     force: no | ||||
| 
 | ||||
| - name: file journal-postfix.service | ||||
|   copy: | ||||
|     src: journal-postfix.service | ||||
|     dest: /etc/systemd/system/journal-postfix.service | ||||
|     owner: root | ||||
|     group: root | ||||
|     mode: 0644 | ||||
|     force: yes | ||||
| 
 | ||||
| - name: enable systemd unit journal-postfix.service | ||||
|   systemd: | ||||
|     enabled: yes | ||||
|     daemon_reload: yes | ||||
|     name: journal-postfix.service | ||||
| 
 | ||||
| - name: restart systemd unit journal-postfix.service | ||||
|   systemd: | ||||
|     state: restarted | ||||
|     name: journal-postfix.service | ||||
|   when: startup | ||||
							
								
								
									
journal-postfix/templates/main.yml (new file, 45 lines)
@@ -0,0 +1,45 @@
| # Configuration for journal-postfix, see /srv/journal-postfix | ||||
| 
 | ||||
| # To enable startup of systemd unit journal-postfix set this to yes: | ||||
| startup: {{ 'yes' if startup else 'no' }} | ||||
| 
 | ||||
| # PostgreSQL database connection parameters | ||||
| postgresql: | ||||
|     hostname: {{ mailserver.postgresql.host | default('127.0.0.1') }} | ||||
|     port: {{ mailserver.postgresql.port | default('5432') }} | ||||
|     database: {{ mailserver.postgresql.dbname | default('mailserver') }} | ||||
|     username: {{ mailserver.postgresql.username | default('mailserver') }} | ||||
|     password: {{ mailserver.postgresql.password | default('*************') }} | ||||
| 
 | ||||
| # Postfix parameters | ||||
| postfix: | ||||
|     # Systemd unit name of the Postfix unit. Only one unit is supported. | ||||
|     systemd_unitname: postfix@-.service | ||||
| 
 | ||||
|     # If you have configured Postfix to rewrite envelope sender | ||||
|     # addresses of outgoing mails so that it includes a VERP | ||||
|     # (Variable Envelope Return Path) of the form | ||||
|     # {local_part}+{verp_marker}-{id}@{domain}, where id is an | ||||
|     # integer, then set the verp_marker here: | ||||
|     verp_marker: {{ mailserver.postfix.verp_marker | default('') }} | ||||
| 
 | ||||
| # Poll timeout in seconds for fetching messages from the journal. | ||||
| journal_poll_interval: 10.0 | ||||
| 
 | ||||
| # How much time may pass before committing a database transaction? | ||||
| # (The actual maximal delay can be one journal_poll_interval in addition.) | ||||
| max_delay_before_commit: 60.0 | ||||
| 
 | ||||
| # How many messages to cache at most before committing a database transaction? | ||||
| max_messages_per_commit: 10000 | ||||
| 
 | ||||
| # Delete delivery records older than this number of days. | ||||
| # A value of 0 means that data are never deleted. | ||||
| # Note: A delivery may span a substantial time interval during which it | ||||
| # is active; here the age of a delivery is determined by its start time. | ||||
| delete_deliveries_after_days: 30 | ||||
| 
 | ||||
| # The time interval in seconds after which a deletion of old | ||||
| # delivery records is triggered. (Will not be smaller than | ||||
| # max_delay_before_commit + journal_poll_interval.) | ||||
| delete_interval: 3600 | ||||