On-line statistics of answered emails of support@ and billing@ accounts of Switzernet

 

Emin Gabrielyan

Switzernet

2009-03-09

 

 

This project aims at an online web page displaying the monthly statistics of answered and unanswered support emails. The email numbers are retrieved via an IMAP connection. A limited IMAP protocol exchange is implemented for login, status, list, examine, search, and logout client commands. The numbers of answered and pending emails of billing@ and support@ accounts are updated on hourly basis:

[show]

 

The exchange is implemented within a Perl script. The script and its report files are stored on mirror web sites. The monitoring server downloads and runs the code periodically. The code changes must be registered only in the ‘code’ repository folder on web servers. The monitoring server downloads the updated version of the script from the code repository before running a new report.

[open]

 

As a permanent code, the monitoring server has only a small bash script [sh.txt] responsible for the downloading and running the Perl code. This bash script is entered in the crontab. For running the reports on another monitoring server, only installation of the short bash script is required.

 

openser3:~/folders/090309-imap-stats# crontab -l

# m h  dom mon dow   command

...

7,27,47 * * * * /root/folders/090309-imap-stats/report.sh > /dev/null

openser3:~/folders/090309-imap-stats#

 

Once the short bash script downloads and unzips the script, the control is passed to the Perl code. It is then the responsibility of the Perl script to connect to IMAP servers, collect the statistics, generate the report, and upload the reports back to the web servers. We present here the version of the Perl code [a41.pl.txt] corresponding to the date of documentation. You should refer to the code repository [open] for the last version of the Perl script. The code repository will contain also updates of the server side downloading script (if any).

 

Description of the Perl script

 

Here we describe in details the code of the Perl script as of the creation date of this document [a41.pl.txt]. For more recent versions check the code repository [open].

 

Do not attempt to understand the IMAP commands from the Perl code. Instead refer to [rfc2060] or to its updated version [rfc3501]. All pages references are however based only on the old version of RFC 2060.

 

Perl code

Description

#!/usr/bin/perl

#Copyright (c) 2009 by Emin Gabrielyan, Switzernet

$ver="a41.pl.txt";

 

use IO::Socket;

use Net::FTP;

 

$show=0;

$log=3;

$empx=0.5;

$emmi=20;

If parameter ‘$show’ is set to 1, the script will print on its standard output all data exchanged via the IMAP connection.

 

Parameter ‘$log’ is an additional logging level (from 0 to 3).

$support_pwd_file=<pwd/support*.txt> or $support_pwd_file=<~/files/090309-imap-support*.txt> or die "no pwd";

$billing_pwd_file=<pwd/billing*.txt> or $billing_pwd_file=<~/files/090309-imap-billing*.txt> or die "no pwd";

$unappel_pwd_file=<pwd/unappel*.txt> or $unappel_pwd_file=<~/files/070930-unappel-ftplogin*.txt> or die "no pwd";

$switzer_pwd_file=<pwd/switzernet*.txt> or $switzer_pwd_file=<~/files/070704-switzernet-ftplogin*.txt> or die "no pwd";

We attempt to localize the files containing the password first in a local “pwd” folder.

 

If files are not found in a local “pwd” folder, we attempt to find them in “~/files” folder.

 

We use Perl’s filename globbing [more]. The globbing will return an empty string for non-existent file only if we use a wildcard character (even if the precise filename is known). Therefore do not remove the wildcard character ‘*’ from the filenames.

$unappel_ftp_dir="/htdocs/public/090309-support-answers";

$switzer_ftp_dir="/public/090309-support-answers";

 

open $f,$support_pwd_file or die "pwd fails";

$support_pwd=<$f>; $support_pwd=~s/[\r\n]*//g;

close $f;

 

open $f,$billing_pwd_file or die "pwd fails";

$billing_pwd=<$f>; $billing_pwd=~s/[\r\n]*//g;

close $f;

 

open $f,$unappel_pwd_file or die "pwd fails";

$unappel_pwd=<$f>; $unappel_pwd=~s/[\r\n]*//g;

close $f;

 

open $f,$switzer_pwd_file or die "pwd fails";

$switzer_pwd=<$f>; $switzer_pwd=~s/[\r\n]*//g;

close $f;

We read the passwords from files into corresponding variables.

 

Note that if files contain multiple lines, only first lines will be read.

sub account_stat{

  $m1=$period[0]*12+$period[1]-1;

  $m2=$period[2]*12+$period[3]-1;

  @mmm=qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

  for(my $m=$m1;$m<=$m2;$m++)

  {

    $months[$m-$m1]="since 1-".$mmm[$m%12]."-".int($m/12)." before 1-".$mmm[($m+1)%12]."-".int(($m+1)/12)

  }

We start the definition of the subroutine responsible for the IMAP connection.

 

This is the most important subroutine of the script.

 

In this subroutine we connect to the IMAP server, login and invoke the ‘list’ commands, select the retrieved folders one by one, invoke in each folder several search commands according to our search criteria, collect the results, and store them into an associative array for further processing.

 

This subroutine is called from the ‘chart’ subroutine. Global variables ‘@period’, ‘$account’, and ‘$pass’ must be defined before calling the ‘account_stat’ subroutine.

 

‘@period’ is an array representing the months, and the following is an example of its value ‘(2009,1,2009,3)’.

 

The array ‘@months’ is loaded with strings representing the concerned months according to the format of arguments of IMAP ‘search’ command [rfc2060 pp.39-40].

  $timeout=90;

  $base="search undeleted not from \@switzernet.com";

 

  $c=IO::Socket::INET->new(Proto=>"tcp",PeerAddr=>"switzernet.com",PeerPort=>143);

  die "cannot connect" unless($c);

  

  local $SIG{ALRM} = sub { die "Server timeout" };

 

  alarm($timeout);

‘$timeout’ is the time in seconds the client side shall wait for the expected reply from the server.

 

Note that we always expect the ‘<tag> OK’ server results for all commands of this application.

 

In our implementation, the ‘<tag> BAD’ and ‘<tag> NO’ server results are ignored, and if such result occurs, we “realize” the lack of ‘<tag> OK’ only upon the expiration of the timeout delay. See more [rfc2060 p.7].

 

Such a situation may occur in case of wrong login password, but the script will be aware of failure only upon the timeout.

 

The variable ‘$base’ stores the common part of criteria applied to all search commands of this application [rfc2060 p.38].

 

Variable ‘$c’ contains the input/output communication handler with the IMAP port of the server. See more on Perl TCP sockets [go].

 

Timeout always causes end of the script. The timer is launched by ‘alarm’ function. Note that by invoking a new ‘alarm’ we always reset the previously running but not yet expired timer.

  while(<$c>) {print if($show); last if(/^\* OK/)}

We expect here the first untagged OK response of the server (i.e. the ‘* OK’ line) confirming the successful establishment of the IMAP connection.

 

You can run a simple shell script for a demonstration of a short IMAP dialog consisting of a login, list, and a logout [sh.txt].

 

A better way for implementing such a simple network dialog is to use the ‘expect’ tool [more]. Check a corresponding demo script using the ‘expect’ tool [expect.sh.txt].

 

Both script assumes that ‘../pwd/support-pass.txt’ file contains the password.

  local $i=1000;

Here is our tag counter. The tag counter is ‘global’ during the entire duration of a connection session. We can have several dialogs (short exchanges of commands), but the tag counter must keep increasing until the end of the IMAP connection session.

  sub dialog{

    foreach $cmd (@seq)

    {

      $i++;

      $send=$i." ".$cmd."\n";

      $expect=$i." OK";

      print $send if($show);

      print $c $send;

      while(<$c>){

        print if($show);

        $messages=$1 if(/^\* STATUS \"[^\"]*\" \(MESSAGES ([\d]+)\)/i);

        $search{$cmd}+=split/ +/,$1 if(/^\* SEARCH(?: +(\d|\d[\d ]*\d))?[^\d]*$/);

        push(@list,$1) if(/^\* LIST \([^)]*\) \"[^\"]*\" (\"[^\"]*\")[\r\n]*$/);

        last if(/$expect/)

      }

    }

  }

‘dialog’ is our core subroutine. It assumes a global array ‘@seq’ which must contain the sequence of commands to be sent to the server.

 

For each command we increment the tag counter, we prepare the tagged transmission string and we prepare the last tagged result string expected from the server as a confirmation of the (successful) end of the command.

 

We invoke the transmission string and we scan the input from the server until the last successful accomplishment result string is received.

 

During reception, we collect (a) the result of the status command (number of messages in mailbox), (b) the results of the search command (number of messages matching the search criteria), and the result of the list command.

 

For the output of the status command, we simply store the number of messages into the ‘$messages’ global variable for its further use by the rest of the script.

 

For the output of the search command we compute the number of messages (see the split Perl function and usage of an array in scalar mode). Then we add this number to the associative array indexed by all search commands being invoked during the session.

 

So all searches (across different folders) for January will be summed up under a singly January search, all searches for February will be summed up under a single February search, etc.

 

For the output of the list commands, we retrieve the IMAP mailbox full names (IMAP folder names) and we store them it in the ‘@list’ array for its further use by the rest of the script (e.g. for individual ‘examine’ commands).

  @seq=("noop", "login $account\@switzernet.com $pass", "noop", "status inbox (messages)", "noop");

  &dialog;

  print "$account has $messages unclassified messages\n" if ($log>=1);

Here is the first use of the ‘&dialog’ subroutine. We login, we invoke the ‘status inbox’ command, and we display (if the log level is appropriate) the number of messages in inbox.

 

Note that here we count also the messages marked as deleted. The status command therefore must be replaced by an appropriate ‘search’ command. Such a change is noted our current to-do list.

  for($class=1;$class<=3;$class++)

  {

    printf "Class %d folders..\n",$class if($log>=1);

    alarm($timeout);

    @list=();

    %search=();

    @seq=("noop", "list inbox. 0$class%.*", "noop");

    &dialog;

Here is the loop for scanning the folders of the 1st, 2nd, and 3rd classes.

 

For billing and support accounts we defined three folder classes for three different groups of questions. The questions are manually distributed across the classes by our support team. This Perl application monitors the movement of the emails.

 

For the billing account the following class folders are used ‘01 Explain’, ‘02 Cash’ and ‘03 Account’.

 

For the support account the class folders are ‘01 Reception’, ’02 Phone’, and ’03 Service’.

 

We prepare a sequence of commands for retrieving the list of all subfolders of each class.

 

We set a timeout for the retrieval of the list of subfolders and their further processing. The next timeout is set only for processing of the next class at the next iteration. Therefore the delay counter is running until the end of processing of all sub-folders of a class.

 

The list array (storing the names of subfolders) is emptied. The search associative array (storing the search statistics) is emptied as well.

 

In the ‘list’ IMAP command, the character ‘*’ is a wildcard and matches zero or more characters. The character ‘%’ is similar to ‘*’, but it does not match a hierarch delimiter ‘.’. We therefore limit ourselves by only subfolders of the class folder, excluding the class folder itself. See more [rfc2060 p.31].

 

Since for both accounts the classes are prefixed by ‘01’, ‘02’, and ‘03’, the ‘list’ command is universal for both accounts.

 

An IMAP communication exchange up to this point can be carried out by a much simpler shell script [sh.txt] or a similar script using an ‘expect’ tool [expect.sh.txt].

    @folders=@list;

    foreach(@folders)

    {

      print $_."\n" if($log>=2);

      @seq=("examine $_");

      foreach $mo (@months)

      {

        push(@seq,"$base $mo $_") foreach("answered","unanswered")

      }

      &dialog;

    }

The previous dialog collects in the array ‘@list’ the names of all subfolders of the current class.

 

We now loop across these subfolders. We need to copy the ‘@list’ array into ‘@folders’. We shall not loop directly over ‘@list’ array because it can be modified by new dialogs in course of the loop (a proper choice of variable scopes and parsing methods shall free us from such discussions).

 

For each subfolder we create a new dialog sequence. We start with an examine command for selecting the IMAP folder in the read-only mode [rfc2060 p.24].

 

We then add a sequence of search commands. For each month of the input period, we prepare two commands, retrieving (a) the answered and (b) the pending emails of customers (not from @switzernet.com).

 

Upon the execution of each ‘&dialog’ subroutine the numbers of found messages of every subfolder are summed up with the numbers of other folders. Summation is made by identical search criteria (based on the month and the answer status) and not by subfolder.

    if($log>=3){printf "%5d %s\n",$search{$_},$_ foreach(sort { $a cmp $b } keys %search)}

The search associative array of the current class is displayed (if the log level is appropriate).

 

See more on sort operation of Perl [go].

 

    for(my $m=$m1;$m<=$m2;$m++)

    {

      foreach("answered","unanswered")

      {

        /^(.).*/;

        $stat{"$m $class $1"}=$search{"$base $months[$m-$m1] $_"}

      }

    }

The final output statistics array is formed here. The array has three indexes. First one is the month of the period. The second index is the class (from 1 to 3), and the third index is the answer status (‘a’ for answered or ‘u’ for unanswered).

  }

This is the end of the loop for processing the class folders.

  @seq=("noop", "logout");

  &dialog;

  close $c;

}

End of the subroutine ‘account_stat’. We logout from the IMAP server and close the TCP connection. All data is in the ‘%stat’ associative array.

($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time);

$year+=1900;

$mon+=1;

$today=sprintf "%4d-%02d-%02d",$year,$mon,$mday;

$now=sprintf "%s %02d:%02d",$today,$hour,$min;

$datestamp=sprintf "%02d%02d%02d",$year%100,$mon,$mday;

$timestamp=sprintf "%s.%02d%02d%02d",$datestamp,$hour,$min,$sec;

$hh=sprintf "%02d",$hour;

print "started on $timestamp by $ver\n" if($log>=1);

Generating all sorts of time and date stamps.

sub mkreport{

  open $f,">out.htm" or die "cannot create report file";

  print $f <<"NNNN";

<html>

<head>

<link rel="stylesheet" type="text/css" href="styles.css" />

<title>

NNNN

  print $f "Emails of support\@ and billing\@ switezrnet.com as of $now, statistics of answered and waiting messages\r\n";

  print $f <<"NNNN";

</title>

</head>

<body>

NNNN

The ‘mkreport’ subroutine is responsible for creating the HTML file containing the statistics.

 

This subroutine creates the HTML file (with its header and footer) and then for each account (billing and support) it invokes the ‘chart’ subroutine.

 

The ‘chart’ subroutine in its turn is responsible for retrieving the data from the IMAP server by calling the ‘account_stat’, and for creating a monthly bar histogram in the output html file, using the HTML tables.

  sub chart{

    &account_stat;

    print $f "\r\n<h1>$account\@ as of $today</h1>\r\n<p>\r\n";

Here is the ‘chart’ subroutine, which first of all invokes a connection with the server and a retrieval of statistics.

 

Statistics are in the ‘%stat’ associative array.

 

With a heading 1 style we label in the output html file the account name (billing or support) and the date.

    for(my $m=$m1;$m<=$m2;$m++)

    {

      my $mo=sprintf "%s'%02d",$mmm[$m%12],int($m/12)%100;

      print "$account $mo ..\n";

      print $f "\r\n";

      print $f "<table class=bar cellspacing=0><tr align=center>\r\n";

      print $f "<td class=label width=50>$mo</td>\r\n";

For each month an HTML table, visually representing a horizontal histogram bar is to be generated.

 

The histogram bar starts with the month label in the “MMM’YY” format.

 

See the styles giving histogram shapes to HTML tables [htm], [htm.txt], [css.txt].

      my %count;

‘%count’ is an associative array for storing the total number of answered (index by ‘a’) and unanswered (index by ‘u’) emails of the month.

 

Declaration with ‘my’ makes a scratch new counter for each month’s iteration.

      $n=$messages;

      printf $f "<td class=0 width=%.1f>%s</td>\r\n",$n*$empx,($n>=$emmi?$n:"") if($m==$m2);

If we are dealing with the last month of the iteration, we start with an extra bar representing also the total number of fresh, not yet classified, emails of inbox.

 

See the style ‘td.0’ in the CSS file [htm], [htm.txt], [css.txt].

      for($class=1;$class<=3;$class++)

      {

        foreach $ans ("a","u")

        {

          my $n=$stat{"$m $class $ans"};

          printf $f "<td class=$class$ans width=%.1f>%s</td>\r\n",$n*$empx,($n>=$emmi?$n:"");

          $count{$ans}+=$n;

        }

      }

Now for each class (from 1 to 3) we create two horizontal bars with the numbers of answered and unanswered emails (of customers) belonging to the given month.

 

In total, for the 3 classes, we have 6 horizontal histogram pieces.

 

See the style ‘td.1a’, ‘td.1u’, ‘td.2a’, ‘td.2u’, ‘td.3a’, and ‘td.3u’ in the CSS file [htm], [htm.txt], [css.txt].

      printf $f "<td class=label width=30>%d (%d)</td>\r\n",

        $count{"a"}+$count{"u"}+($m==$m2?$messages:0),$count{"u"}+($m==$m2?$messages:0);

      print $f "</tr></table><br>\r\n";

The summarizing labels of the month are printed here (at the right end of the histogram).

    }

    print $f "</p>\r\n";

End of the loop of months (i.e. of horizontal bars).

  }

End of the ‘chart’ subroutine

  @period=(2009,1,$year,$mon);

  $account="billing";

  $pass=$billing_pwd;

  &chart;

 

  @period=(2008,11,$year,$mon);

  $account="support";

  $pass=$support_pwd;

  &chart;

Here we are in the main ‘mkreport’ subroutine and we are calling the ‘&chart’ subroutine twice: for creating the bar histograms of the billing and support accounts respectively.

  print $f <<"NNNN";

<h1>Meaning of fields</h1>

<p>

<table class=bar cellspacing=0>

<tr align=center>

<td class=label width=50>Month</td>

<td class=0 width=30>not Moved</td>

<td class=1a width=80>01 Explain / 01 Reception Answered</td>

<td class=1u width=30>01 Waiting</td>

<td class=2a width=80>02 Cash / 02 Phone Answered</td>

<td class=2u width=30>02 Waiting</td>

<td class=3a width=80>03 Account / 03 Service Answered</td>

<td class=3u width=30>03 Waiting</td>

<td class=label width=30>Total (waiting)</td>

</tr>

</table><br>

</p>

[<a href=../last>last</a>] [<a href=../hour>hours</a>] [<a href=../hist>past</a>]<br>

NNNN

  print $f "Created on $now by $ver<br>\r\n";

  print $f <<"NNNN";

</body>

</html>

NNNN

 

  close $f

}

End of the ‘mkreport’ subroutine.

 

Adding the legend, the time stamp, and closing the output html file.

 

See an example [htm] or the running version of the report file [last].

&mkreport;

Call to the ‘mkreport’ subroutine. Generating the output HTML file (by invoking internally the connections to IMAP severs).

sub upload{

  $ftp = Net::FTP->new($ftp, Debug => 1) or die "Cannot connect to ftp";

 

  $ftp->login($login,$pass) or die "Cannot login ", $ftp->message;

 

  $ftp->cwd($dir."/hist") or die "Cannot change working directory ", $ftp->message;

  $ftp->put("out.htm","$datestamp-stats.htm");

  $ftp->put("styles.css");

 

  $ftp->cwd($dir."/last") or die "Cannot change working directory ", $ftp->message;

  $ftp->put("out.htm","index.htm");

  $ftp->put("styles.css");

 

  $ftp->cwd($dir."/hour") or die "Cannot change working directory ", $ftp->message;

  $ftp->put("out.htm","$hh\h.htm");

  $ftp->put("styles.css");

 

  $ftp->quit();

}

We define the ‘upload’ subroutine, which is responsible for uploading the html files to web servers via ftp. For more references see ftp examples with Perl [go].

 

In each FTP session, we upload three copies of the same file. The copy in the ‘last’ folder of the web server always overwrites the current index file. The copy of the ‘hist’ folder creates or overwrites (if already existing) the daily file. The copy in the ‘hour’ folder overwrites the file of day’s hour (‘01h.htm’, ‘02h.htm’  ... ‘24h.htm’). There is no date prefix, therefore only a trace of last 24 hours is available.

$ftp="unappel.ch";

$login="unappel";

$pass=$unappel_pwd;

$dir=$unappel_ftp_dir;

&upload;

 

$ftp="switzernet.com";

$login="switzernet.com";

$pass=$switzer_pwd;

$dir=$switzer_ftp_dir;

&upload;

Uploading the files to our two mirror web servers.

 

[1] [2]

 

Further work and the current ‘To Do’ list

 

Do not use ‘status’ IMAP command [rfc2060 p33] for counting the emails of inbox, but rely rather on ‘search’ command in order to avoid counting the deleted but not yet expunged messages. Such a situation shall occur when the user classifies the messages by moving them from inbox to subfolders of classes. The inbox will show the original number of messages until the inbox is expunged or the close command is invoked (given that the inbox is selected with the ‘select’ command and not with read-only ‘examine’).

 

Create a navigation frame for showing the pages of last 24 hours.

 

Do not count in inboxes the internal messages with ‘To Do’ blue labels and flagged by an asterisk.

 

Clean the code with respect to the use of local and global variables. This version of the code is quite messy with too many global variables. See more on scope of Perl variables [go].

 

Declare with ‘my’ the ‘$send’ and ‘$expect’ variables of the ‘&dialog’ subroutine.

 

Empty the ‘stat’ associative array at the beginning of the ‘account_stat’ subroutine.

 

Leave thanks to the author of the ftp upload demo [go] with a link to this project.

 

References

 

RFC 2060, all references to page numbers in this document are given with respect to this version of RFC:

http://james.apache.org/server/rfclist/imap4/rfc2060.txt

 

RFC 3501, “INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1”, obsoletes RFC 2060:

http://www.faqs.org/rfcs/rfc3501.html

 

A demonstration of IMAP with telnet – but read the RFC first:

http://bobpeers.com/technical/telnet_imap.php

 

Spawn, send, and expect – a tool ‘talking’ with other interactive programs or across a network – see a demo script [expect.sh.txt]:

http://www.osix.net/modules/article/?id=30

 

Network sockets in Perl ‘TCP Clients with IO::Socket’:

http://www.rocketaware.com/perl/perlipc/TCP_Clients_with_IO_Socket.htm

 

Sorting with Perl

http://www.perlfect.com/articles/sorting.shtml

 

Perl file input output

http://www.troubleshooters.com/codecorn/littperl/perlfile.htm

 

Perl miscellaneous

http://www.perlmonks.org/?node_id=565863

http://www.perl.com/pub/a/2001/01/begperl6.html#strict%20vars

http://www.gnulamp.com/perlloop.html

 

Perl filename Globbing:

http://affy.blogspot.com/p5be/ch09.htm#Globbing

 

Perl variable scopes (difference between “my” and “local”)

http://www.geekinterview.com/question_details/16991

 

HTML table and table element styles:

http://www.somacon.com/p141.php

http://www.w3schools.com/html/html_tables.asp

http://www.w3.org/TR/CSS2/tables.html

http://www.w3schools.com/css/pr_dim_line-height.asp

 

FTP uploading with Perl:

http://www.reviewmylife.co.uk/blog/2008/02/18/ftp-uploading-a-directory-of-files-using-perl/

 

Files

 

A simple shell demo of a telnet connection to an IMAP server [sh.txt]

A simple demo with an IMAP connection using the ‘expect’ tool [sh.txt]

The Perl script of answered email statistics presented in this document [a41.pl.txt]

The shell script running on the monitoring server responsible for a periodic download and execution of the last version of the Perl code [sh.txt]

A sample of the HTML output file [htm] and its code [htm.txt]

The CSS file with styles [css], [css.txt]

A log-file of used references [log.txt]

 

*   *   *

Copyright © 2009, Switzernet