Amazon A9's siteinfo.xml: almost a repeat of favicon.ico

Recently, I've seen a few 404 errors in my logs for requests for "siteinfo.xml". siteinfo.xml is a file used by SiteInfo, part of the browser toolbar for Amazon's A9 search engine, and it is automatically fetched for every website a user visits.

This sounds pretty similar to Internet Explorer's infamous favorites icon feature. For every site a user visited, Internet Explorer would automatically request a file called favicon.ico, to be displayed in the browser's location bar and bookmarks. A lot of people were not happy: all of a sudden, web servers were swamped with requests for this mysterious favicon.ico that did not exist. These requests polluted many web server logs and were very annoying.

On some sites, especially dynamic ones, 404 errors are very expensive. Unfortunately, this is true of most Drupal-powered sites, including mine. With Drupal's "pretty URLs," which use Apache's mod_rewrite to, well, make URLs pretty, any request the web server cannot satisfy itself (including errors) gets handed to Drupal. Going through Drupal means a long bootstrap to initialize Drupal and load all its modules, plus at least one database query, just to find out that a URL does not exist and return a 404. Enough requests for a non-existent file can effectively become a DoS attack.
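The usual way around this is to answer requests for known-noise files before the application ever boots. On a Drupal site that would normally be done in the web server configuration; the snippet below is only a rough Python WSGI sketch of the same idea, not anything Drupal ships. The names `cheap_404` and `CHEAP_404_PATHS` are made up for illustration, and the list of paths is an assumption about which files a site does not serve.

```python
# Hypothetical list of auto-requested files this site does not serve.
CHEAP_404_PATHS = frozenset({"/siteinfo.xml", "/favicon.ico"})


def cheap_404(app, paths=CHEAP_404_PATHS):
    """Wrap a WSGI app so requests for `paths` never reach it."""

    def middleware(environ, start_response):
        if environ.get("PATH_INFO", "") in paths:
            # Answer immediately: no application bootstrap, no database query.
            body = b"Not Found\n"
            start_response(
                "404 Not Found",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everything else takes the normal (expensive) path.
        return app(environ, start_response)

    return middleware
```

Wrapping the application as `application = cheap_404(application)` would keep the error itself a 404, so the toolbar still learns the file is missing, but the cost per request drops to a set lookup.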

It seems Amazon's A9 developers didn't get the memo that people don't like tools that request files that don't exist.

Granted, it's not too bad: I don't think this toolbar has much market penetration, so it's not as if millions of people are killing my site. The siteinfo.xml specification page also mentions that the file is fetched through A9 and cached, so it will not be requested for every user who visits, only for the first one.

Kudos to Amazon's programmers for being a bit brighter than Microsoft's, but eh, I can't say how much brighter, given that they designed a system a bit too similar to the favicon.ico debacle.
