Amazon S3 Static WebSites and AWStats

I believe that http://www.s3stat.com/ is a very good service for handling this. There are circumstances, however, in which doing things yourself is the only option.

I was amazed to see that there is not much info about this on the web at the moment, so I decided to write up my experience.

Reasons for this approach:
- what's out there on AWStats and AWS is pretty outdated (log file naming, format, etc.).
- no one mentions s3cmd, which is IMO a very good tool for talking to AWS from the command line.

Download the log files:

#
# you will need to set up your credentials in an s3.config file;
# the example here fetches all logs for Dec/2012
#
$ s3cmd -c s3.config get s3://{bucket}/log/access_log-2012-12*

Note: for 1 GB (14k files) of data, this took about 10 hours… :-(

Handle Log files 

As of this writing, the log files are named as follows:

access_log-YYYY-MM-DD-HH-MM-SS-XXXXXXXXXXXXXX
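
Since the date is part of the name, a small helper can tell which day a file belongs to. This is only a minimal sketch, assuming the names follow the pattern above; log_day is a made-up helper name and the sample file name is invented for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the YYYY-MM-DD part of an S3 log file name.
# Assumes names look like access_log-YYYY-MM-DD-HH-MM-SS-XXXXXXXXXXXXXX.
log_day() {
    local name
    name=$(basename "$1")        # drop any leading directory
    name=${name#access_log-}     # strip the fixed prefix
    printf '%s\n' "${name:0:10}" # keep the first 10 characters: YYYY-MM-DD
}

# Invented file name, for illustration only
log_day "/tmp/s3-logs/raw-logs/access_log-2012-12-01-00-15-32-ABCDEF01234567"
```

The same substring could be used to group files per day instead of looping over fixed day numbers.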

The first step is to merge them into a single file:

$ logresolvemerge.pl access_log-* > access_log.log

or into daily files, for which you could write a script:

prefix="access_log-2012-12"
path=/tmp/s3-logs/raw-logs
dest=/tmp/s3-logs/fixed-logs

for d in {01..31}; do
    check_prefix="$path/$prefix-$d"
    # merge only when at least one log file exists for this day
    if ls "$check_prefix"* >/dev/null 2>&1; then
        logresolvemerge.pl "$check_prefix"* > "$dest/$prefix-$d.log"
    fi
done

This should output logs in the format access_log-2012-12-01.log, and so on.

Note: check where logresolvemerge.pl is installed. It is usually at /usr/share/awstats/tools/logresolvemerge.pl
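
If you are not sure where the script lives, something like the following can look through a few candidate directories. It is only a sketch: find_merge_tool is a made-up helper name, and /usr/local/awstats/tools is my assumption for a source install, so adjust the list to your system:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first logresolvemerge.pl found in the
# directories given as arguments; return non-zero if none has it.
find_merge_tool() {
    local dir
    for dir in "$@"; do
        if [ -f "$dir/logresolvemerge.pl" ]; then
            printf '%s\n' "$dir/logresolvemerge.pl"
            return 0
        fi
    done
    return 1
}

# The second directory is an assumption, not taken from any particular distro
tool=$(find_merge_tool /usr/share/awstats/tools /usr/local/awstats/tools) \
    || echo "logresolvemerge.pl not found in the usual places" >&2
echo "${tool:-}"
```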
 

Configure AWStats

You will need to create a config file, following the examples already available (in this example: my.awstats.config). Change the following details:

# use this to process the single merged file
LogFile="/tmp/s3-logs/fixed-logs/access_log.log"

# or to process a daily file:
LogFile="/tmp/s3-logs/fixed-logs/access_log-%YYYY-0-%MM-0-%DD-0.log"

LogFormat = "%other %extra1 %time1 %host %logname %other %extra2 %url %methodurl %code %other %bytesd %other %other %other %refererquot %uaquot %other"

#
# and any other settings you need…
#
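
For reference, the LogFormat line can be read as a token-by-token mapping onto the fields of an S3 access log record. The sketch below pairs them up. The field names are my reading of the Amazon log format documentation, and the pairing itself is an assumption worth checking against one of your own log lines (newer logs may carry extra trailing fields):

```shell
#!/usr/bin/env bash
# AWStats LogFormat tokens, in the order used in the config above
tokens=(%other %extra1 %time1 %host %logname %other %extra2 %url %methodurl
        %code %other %bytesd %other %other %other %refererquot %uaquot %other)

# S3 access log fields assumed to line up with those tokens, one for one
fields=("Bucket Owner" "Bucket" "Time" "Remote IP" "Requester" "Request ID"
        "Operation" "Key" "Request-URI" "HTTP Status" "Error Code"
        "Bytes Sent" "Object Size" "Total Time" "Turn-Around Time"
        "Referrer" "User-Agent" "Version Id")

# Print one "token  field" pair per line
map_fields() {
    local i
    for i in "${!tokens[@]}"; do
        printf '%-13s %s\n' "${tokens[$i]}" "${fields[$i]}"
    done
}

map_fields
```

If the two lists ever differ in length, the format line no longer matches the log layout and AWStats will likely drop the records as corrupted.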

I think it is unlikely that they will add the Amazon format to AWStats by default, but there is an official request for that here: http://sourceforge.net/p/awstats/feature-requests/864/

Generate the reports

In my case I just needed a single-month report to print as a PDF, so I ran:

$ sudo /usr/lib/cgi-bin/awstats.pl -config="my.awstats.config" -staticlinks -output -year=2012 -month=12 > ./2012-12.html

Since I don't want to serve AWStats live to a web browser, I used a hack to make the images available, just so I could print the PDF from the browser:

$ sed -i "s|/awstats-icon/|http://people.ofset.org/awstats-icon/|g" ./2012-12.html

(This points at a public location that serves the awstats-icon directory; thanks, ofset.org.)
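
To see what the substitution does before touching the report, it can be tried on a single line first. A minimal sketch: fix_icons is a made-up helper name and the sample img tag is invented, not copied from a real AWStats report:

```shell
#!/usr/bin/env bash
# Public location of the icons used in this post; swap in your own
icon_base="http://people.ofset.org/awstats-icon/"

# Hypothetical helper: rewrite site-relative icon paths read from stdin
fix_icons() {
    # '|' as the sed delimiter avoids escaping every '/' in the URL
    sed "s|/awstats-icon/|${icon_base}|g"
}

# Invented sample line, for illustration only
echo '<img src="/awstats-icon/other/vv.png">' | fix_icons
```

Note that running the substitution twice on the same file would rewrite the already-rewritten URLs again, so apply it once to a freshly generated report.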

And voila!
