The strange things people look for on web-sites

I've just updated the DWR website to use Drupal rather than SnipSnap. In doing so, quite a few of the URLs changed, so I waved my Apache/mod_alias wand around to point people where they should be going. My naive aim was to get rid of nearly all the 404s.

I've been really surprised by the bizarre things people look for on my web site. For example:

  • robots1.txt: Is that just a typo? Or is there something I'm missing? All the hits are from AOL users, but that could be a coincidence.
  • DWR实时系统.files/drupal.css: I guess that's a typo, but the drupal style sheet isn't related to DWR in any way. So I don't know how you would come to make such a typo.
  • siteinfo.xml: I guess someone is hoping there is a hidden Eclipse plug-in for DWR and is trying to find it.
  • DWR%20-%20Overview%20Getahead_archivos/style.css: There is a referrer, but the referring page has no links to that file.
  • There are a lot of hits for files that exist, for example intro1.jpg (one of the pics on the front page), but with paths that are just wrong. Why would (quite a few) people be looking for files that they know to exist somewhere in other random places?
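
For anyone who wants to go hunting for their own oddities, here is a minimal sketch of how a list like the one above could be pulled out of an access log. It assumes the log is in Apache's Combined Log Format; the names `LOG_LINE` and `count_404s` are made up for illustration and are not part of any tool I actually used.

```python
import re
from collections import Counter

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def count_404s(lines):
    """Count 404 responses, keyed by (requested path, user-agent)."""
    misses = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not look like log entries are silently skipped.
        if match and match.group("status") == "404":
            agent = match.group("agent") or "-"
            misses[(match.group("path"), agent)] += 1
    return misses
```

Feeding it an open log file (for example `count_404s(open("access.log"))`, where the file name is just a placeholder) and calling `.most_common(20)` on the result should surface the most frequent misses. Keeping the user-agent in the key makes patterns like "all the hits are from AOL users" easier to spot.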

Has anyone else wondered where all this weird stuff is coming from, or found bizarre URLs of their own?

Update: Some of these requests come from tools, and some just have to be human (the Eclipse one, for example). But I've just found a new one that is a bot, or what seems like a broken Safari. Get this:

  • /dwr/atom/feed followed by /dwr/atom/atom/feed followed by /dwr/atom/atom/atom/feed followed by ... he gave up at 16 atoms!: The user-agent is 'AppleSyndication/38', which appears to be how Safari's RSS support identifies itself.
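
Runaway requests like this one are easy to flag mechanically. The sketch below finds the longest run of identical consecutive path segments in a URL path. The function name `repeated_segment_run` is hypothetical, and the sketch only detects the pattern; it says nothing about what causes it.

```python
def repeated_segment_run(path):
    """Return (segment, run_length) for the longest run of identical
    consecutive segments in a URL path, or ("", 0) for an empty path."""
    segments = [s for s in path.split("/") if s]
    best = ("", 0)
    run_start = 0
    for i in range(1, len(segments) + 1):
        # A run ends at the end of the path or when the segment changes.
        if i == len(segments) or segments[i] != segments[run_start]:
            length = i - run_start
            if length > best[1]:
                best = (segments[run_start], length)
            run_start = i
    return best
```

Anything with a run length of 2 or more is probably worth a look, since a legitimate URL on this site never repeats a directory name back to back.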

Further Update: I don't think this Yahoo FAQ explains any of the URLs above, but it does explain a valid reason why some bots from search engines go looking for semi-random URLs.
