Apache

Calculating bandwidth from a combined-format web server log

Given a combined-format web server access log, such as the ones generated by Apache, it can be useful to know the total amount of data transferred across all requests in that log. The task is simple: extract the field listing the number of bytes sent for each request, and add them up. For something so simple, there is an odd lack of examples or pre-made scripts that do this. Or, at least, I couldn’t find any.

I wrote my solution, calculate-data-transfer.py, in Python:
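The script itself is not reproduced on this page. What follows is only a minimal sketch of the approach described above, not the original calculate-data-transfer.py. The function name total_bytes and the regular expression are assumptions made for illustration; the pattern expects the standard combined layout (host, ident, user, time, request, status, bytes, referrer, user agent) and skips lines that do not match it, as well as responses logged with "-" for the byte count.

```python
import re

# Combined format: host ident user [time] "request" status bytes "referer" "user-agent"
# Only the bytes field is captured; the quoted request may contain escaped quotes.
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "(?:[^"\\]|\\.)*" \d{3} (\d+|-) '
)


def total_bytes(lines):
    """Sum the bytes-sent field over an iterable of combined-format log lines."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a combined-format line; ignore it
        size = match.group(1)
        if size != "-":  # "-" means no body was sent
            total += int(size)
    return total
```

To run it against a file, pass the open file object, for example total_bytes(open("access.log", errors="replace")), where access.log is a placeholder path.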

A better way to separate Apache log files by virtual host domains

Apache's "combined" log format is one of the most common formats used in access logging, containing useful fields such as referrer and user agent. Unfortunately, it does not contain a field listing the virtual host for which a request was made. With Apache, this is easily rectified by defining a custom logging format and post-processing logs to maintain compatibility.
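As a rough illustration of the post-processing step, here is a sketch that assumes the custom format is simply the combined format with the virtual host (Apache's %v) prepended as the first space-separated field. That layout, and the function name split_by_vhost, are assumptions for this example rather than the format the full post defines.

```python
from collections import defaultdict


def split_by_vhost(lines):
    """Group log lines by their leading virtual-host field.

    Returns a dict mapping each virtual host to a list of lines with the
    host field stripped, so each line is plain combined format again.
    """
    per_host = defaultdict(list)
    for line in lines:
        vhost, sep, rest = line.partition(" ")
        if not sep:
            continue  # no space at all: not a line in the assumed format
        per_host[vhost.lower()].append(rest)
    return dict(per_host)
```

Each list can then be written to its own file, which ordinary combined-format tools can read without knowing about the extra field.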

A quick domain HTTP permanent redirect script for Apache/PHP

I mentioned earlier that to get my old domains, tamasrepus.hotnudiegirls.com and tamasrepus.rhombic.net, delisted from search engines, I was going to write a redirection script that would send an HTTP status 301 (permanent redirect) to anything that tries to access those domains. It would send them to this current domain (samat.org).

And I did. So, behold the script (save as index.php):

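Simple enough. More is needed, though: the web server needs to be told to use this script for all URLs. You can do this easily with Apache and the ErrorDocument directive, since any URL that does not match a file in the document root produces a 404 that Apache then hands to the script: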

<VirtualHost 72.36.165.250:80>
  ServerName tamasrepus.hotnudiegirls.com
  DocumentRoot "/path/to/directory/containing/script"
  ErrorDocument 404 /index.php
</VirtualHost>

Trying to emulate mod_gunzip with Apache 2 Filters

The situation: I have gzipped content stored on an Apache 2 web server. Specifically, HTML files--they are stored in this manner to save disk space. For clients that can handle on-the-fly decompression of such files, I want the files to be sent verbatim; for clients that cannot, I want the content decompressed and sent to these clients.

mod_gunzip by Helge Oldach is an Apache 1 module made for dealing with stored gzip files. It can negotiate with a client what kind of encoding it can accept, and send the appropriate compressed or non-compressed version. Unfortunately, at this time, this module is only available for Apache 1.

Helge Oldach notes that it should be possible to create the equivalent mod_gunzip functionality using only Apache 2 filters. To an extent, yes. I've done so:

ExtFilterDefine gunzip mode=output cmd="/bin/gunzip"

<Files *.gz>
  SetOutputFilter gunzip
</Files>

This won't do the sophisticated (well, at least more sophisticated than the Apache 2 runtime configuration directives will allow) negotiation that mod_gunzip can do, watching for certain clients and combinations of headers.
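To make the missing negotiation concrete, here is a small sketch of the basic decision such a module has to make: look at the client's Accept-Encoding header, send the stored gzip bytes untouched if gzip is acceptable, and decompress otherwise. This is not mod_gunzip's actual logic. The function names accepts_gzip and respond are hypothetical, and the sketch ignores the client-specific workarounds mentioned above.

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding header value permits gzip."""
    if not accept_encoding:
        return False  # header absent or empty: assume identity only
    wildcard = None
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        params = params.strip()
        q = 1.0
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        if token in ("gzip", "x-gzip"):
            return q > 0  # an explicit gzip entry always decides
        if token == "*":
            wildcard = q > 0
    return bool(wildcard)


def respond(stored_gzip, accept_encoding):
    """Pick headers and body for a file stored on disk as gzip bytes."""
    if accepts_gzip(accept_encoding):
        return {"Content-Encoding": "gzip"}, stored_gzip
    return {}, gzip.decompress(stored_gzip)
```

The filter configuration above corresponds to always taking the second branch of respond, which is why it falls short of what mod_gunzip offers.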

So, I'm stuck. A project I have been working on for school involves HTML reports that can collectively be as large as 1.3 GB. Compressing each file with gzip brings the collective size down to 300 MB, while still allowing the files to be viewed in most modern web browsers. At the time of this writing, that apparently does NOT include Apple's Safari, which several of my professors happen to use, though Konqueror/KHTML works fine.

Note to self: port mod_gunzip to Apache 2.
