Dec 9 2009

When you write something that connects to a web server, what user agent do you use?

Far too often have I seen things like:

curl_setopt($c, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");

In my opinion you should always use a nice descriptive user agent that explains to the server exactly what your client may be trying to achieve, or at least a unique identifier. Unless you’re trying to achieve some kind of web scraping client (which probably contravenes some terms of service agreement somewhere, so I certainly don’t advocate that!), there is no reason not to provide a useful and descriptive UA string.

A good UA string from a little-known client should provide some way of contacting you. When I say little-known, I mean something like your new web app that you’ve just made that queries Last.fm for user data. In this instance, I’d give a nice descriptive UA string with contact e-mail, e.g.:

curl_setopt($c, CURLOPT_USERAGENT, "MyLastFmClient (v0.1) myemail@address.com");

As your client becomes more widely used, or if your website already offers a decent way of contacting you, perhaps just put a URL:

curl_setopt($c, CURLOPT_USERAGENT, "MyLastFmClient (v1.2) www.address.com");
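
If you set the UA in more than one place, it may be worth building the string once so the format stays consistent. Here’s a minimal sketch of that idea. Note that build_user_agent() is a made-up helper name (it is not part of PHP or cURL), and the "Name (vVersion) contact" layout simply mirrors the examples above rather than any standard:

```php
<?php
// Hypothetical helper (not a PHP or cURL built-in): builds a descriptive
// User-Agent string in the "Name (vVersion) contact" layout used above.
function build_user_agent(string $name, string $version, string $contact): string
{
    // Remove CR/LF so a stray newline cannot split the HTTP header.
    $clean = static function (string $s): string {
        return trim(str_replace(["\r", "\n"], '', $s));
    };

    return sprintf('%s (v%s) %s', $clean($name), $clean($version), $clean($contact));
}

// Build the string once and reuse it for every request the client makes.
$ua = build_user_agent('MyLastFmClient', '0.1', 'myemail@address.com');
```

You would then pass $ua to curl_setopt($c, CURLOPT_USERAGENT, $ua) exactly as in the one-liners above, and bump the version in a single place when you release.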

Of course, when you’re Google, everyone knows who you are: searching for the UA string “Mediapartners-Google”, for example, yields 200k-odd results, revealing that this is the AdSense content bot.

Why do I think this is important? It helps the people running a server identify you and, in most instances, help you. If your client goes wrong and gets itself stuck in a loop because you forgot to increment $i, for example, they can see that MyLastFmClient is spamming the server with 1,000+ requests a minute. They can then read your UA string and contact you about it.

Another reason is that some servers might actually block access, or provide different content, depending on the client. I know that Google serves up a completely different search results page if you’re on IE5 than if you’re on IE8, for example. Another server might block all known browsers from accessing a web service (e.g. with the message “this page cannot be accessed using a web browser”). I’m not saying this is good or bad practice, as that is a WHOLE other kettle of fish – I’m just saying it can happen, and that sort of thing can be pretty hard to track down.

Although this all might seem pretty trivial, it is useful, and I think any HTTP(S) client should identify itself properly using a clear and descriptive user agent string. It’s no harder to do and it just makes everyone’s lives easier!

Oct 9 2009

Today I had a TOS violation on my Linode VPS for sending spam e-mails. Odd, I thought, as I only have one mail user. So I stopped Postfix and examined the logs, and it was pretty obvious that it was sending lots of spam e-mails. Odd, though, that the connections were coming from 127.0.0.1, which suggested that it could be a script or something like that… but I only have 3 websites on here, all low-traffic, and I’m fairly confident there are no security issues there.

I checked through the Apache access logs, but nothing seemed odd there. I then did a “netstat -a” and discovered there were hundreds and hundreds of connections to a particular port. Running “netstat -pant” showed these connections were to Squid, which is where the problem lay.

Basically, a couple of weeks ago I installed Squid (a proxy server) to play around with. I configured it so I could use it as an HTTP proxy. Unfortunately, I left a gaping hole where anyone could’ve used the proxy for anything. That allowed spammers to send mail via my otherwise secure Postfix installation, because Postfix didn’t require SASL authentication for connections from 127.0.0.1 – its own network.

Anyway, it seems to be fixed now (I removed Squid altogether as I have no real use for it), but this highlights the importance of configuring stuff properly so that all possible security holes are sealed shut!

May 11 2009

I’m sure you’ve all heard of Talk Like A Pirate Day, and here’s my rather geeky idea for another day. It’s Talk Like Apache Day!

Basically, you talk like an HTTP server (not specifically Apache, but “Apache” was similar to “A pirate”…). If you need help, here are some responses you can give people to confuse them.

Other ideas could be born from this, such as Talk Like an MTA Day, or Talk Like SSH Day. I expect the latter would have to be encrypted though…

Just imagine the conversation anyway:

You: Hi James, how are you?

Me: 200 OK

You: What?

Me: 304 Not Modified

You: I don’t understand…

Me: 304 Not Modified

You: You’re such an idiot…

Me: 400 Bad Request

Feb 21 2009

OK, so far all that I’ve managed to do is install it, have a dabble with the config pages and go “oooh that looks pretty”, so this isn’t a hardcore review or anything.

Zend have unveiled their newest product, Zend Server… which is essentially Zend’s own W/M/LAMP stack, but with Zend Framework and other components Zend have written, including the very handy Zend Debugger. What does that mean? Well to me, that means there’s quite an easy choice for my web development at home – I just installed it in 10 minutes and now have a fully working WAMP stack I can develop on before pushing to my Linode test server. It was 100 times easier than any other WAMP stack I’ve worked with including XAMPP and the other ones I’ve tried. It has a very shiny web GUI as well (pictured), that – as I mentioned before – I went “oooh” at lots. I personally think Zend Server has the potential to be really frickin’ awesome if I get to know it better. From the Public Beta Invitation e-mail, Zend states it includes:

  • Fully supported and certified distribution of PHP 5.2
  • Fully supported Zend Framework 1.7 release
  • Integrated native installers (RPM/DEB/MSI)
  • Web-based administration Interface
  • Comprehensive out-of-the-box database connectivity
  • Powerful PHP monitoring capabilities to identify problems and help fix them quickly
  • URL-based output caching required by today’s modern web applications
  • Zend Optimizer+ – byte code cache to boost application performance
  • New “Guard Loader” to enable processing of Zend Guard encoded files

Not bad – and there’s a community edition too, which means if you’re a sole developer like me it’s affordable.

<rant>Unfortunately, they don’t do a community edition of Zend Studio for Eclipse… and although PDT is good, I feel like it’s the hacky “well Zend Studio uses PDT at its core” alternative, without the cool enhancements that ZS has… oh well!</rant>

Oct 21 2008

I followed some guide on the interwebs this evening to set up a password-less SSH connection to my server. I followed all the steps correctly, but kept getting “Server refused our key” in PuTTY.

Thankfully, after a quick Google, this guide helped me out and got it working.

The solution is that Windows sucks, and you should always generate your keys in Linux.