Above is a small part of my music listening history as reported to last.fm over the last year and a half. Time is plotted left to right, overall number of tracks by the width of the shape, and the colors represent individual artists. I used LastGraph3, which if given your user name will make a set of graphs from your data.
If you click on the image above, you’ll see the full history. It looks like I go through periods where I listen to a fair bit of music, and then stop, and start again. I think there’s a fair amount of smoothing of the data, and my history would look even more jagged without it.
I like plots like this because they show multidimensional data using colors and shapes in an intelligent way. Of course the classic example is Minard’s famous depiction of Napoleon’s 1812 Russian campaign. I think everyone should have to learn how to make good plots and understand how to read them. When I was a TA, I constantly had to remind students of the point of making graphs. I think nearly all of them felt it was busy work rather than a way to organize and visualize data, a way to recognize a physical effect.
Just like significant figure errors (I am bothered enough by those to contact newspaper reporters: I’ve done it in the past), I cringe at the sight of misleading or poorly organized graphs. The worst offenders tend to use Excel, whose plots are instantly recognizable as probably being garbage. I also dislike the USA Today charts and many plots seen on the various network evening news shows. Too much artistic influence from graphic artists (no offense K.P.!), and not enough substance.
Two and a half years ago I moved my website off my father’s computer at home to Site5. For a while it was great, especially compared to serving a website over a cable modem connection. However, over the last year or two it’s gotten progressively worse, something I discussed in this post about a year ago. Also over a year ago, Site5 promised to move everyone to new servers. It hasn’t happened, and my service has gone steadily downhill.
My first two-year prepaid period with Site5 was up in December last year, and I seriously thought about moving. I looked at other shared hosting companies, but I felt I would probably have the same problems on a new shared host. I looked into hybrid solutions, but those didn’t seem a guaranteed improvement either. I liked the idea of Virtual Private Servers (VPS), but I couldn’t find one with enough disk space in my budget.
A few months ago, my lab mate Rick pointed me towards s3fs, which intrigued me. s3fs puts your data on Amazon S3, but allows the data to appear to be local to the server, like another hard drive. You pay for only what you use with S3, and it has virtually unlimited space. Suddenly, a VPS hosting solution fit into my budget. I could pay for a VPS with less disk space than I needed, but still get the power of a VPS. It was also an upgrade because now my family and I could upload as much data as we wanted, and it would be much more secure from disk failure than before.
This website and the other sites that were on the old server are now hosted on a machine from linode.com. I’m using their lowest option, which has 10GB of space. I installed Ubuntu Hardy Heron, which seems like a solid Linux distribution. s3fs has proven to be reliable and fast enough, although it’s much slower than having the data on a local disk. Using Apache rewrites, my father and I have set things up so that when a web browser asks for an item on a page that exists on S3, the request goes straight to S3 instead of being served through this server, which saves lots of time. I’ve also figured out how to shoehorn Gallery2 into using S3.
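The redirect decision can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Apache rewrite rules: the function name, the bucket name `example-bucket`, and the set of known S3 keys are all made up for the example.

```python
from urllib.parse import quote


def s3_redirect_url(request_path, s3_keys, bucket="example-bucket"):
    """Return the S3 URL to redirect to, or None to serve the file locally.

    request_path: the path the browser asked for, e.g. "/photos/cat.jpg"
    s3_keys:      hypothetical set of object keys known to live on S3
    bucket:       hypothetical bucket name
    """
    key = request_path.lstrip("/")
    if key in s3_keys:
        # Send the browser straight to S3 so this server never touches the file.
        return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"
    # Anything not on S3 is served by the web server as usual.
    return None


if __name__ == "__main__":
    keys = {"photos/cat.jpg"}
    print(s3_redirect_url("/photos/cat.jpg", keys))
    print(s3_redirect_url("/index.html", keys))
```

The point of the design is that large files never pass through the slow s3fs mount on their way to the browser; only the decision about where they live does.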
I have written thrice (1, 2, 3) in the past about the new Ajaxed interface to Yahoo! mail. It is incredible how slowly they make improvements to it. It’s not like Yahoo! cares what I say, but the points I raised over two years ago in my first post still haven’t all been fixed.
But Yahoo! may be trying harder. There is now a preference to add the greater-than signs on replied-to messages:
Which is great. Until you try to use it. Here is a message I sent myself:
Here is what I get when I hit ‘reply’ (this is a screen shot of the compose window; the text is editable):
Yes, each and every word of the message I’m replying to gets its own line. But it gets worse! Here’s what I get when I send the replied message without touching anything:
Here each word of the replied-to message gets its own line, separate from the greater-than signs. I hope this is just a simple bug (I will submit a bug report about this), but this is simply ridiculous.
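For contrast, here is a minimal Python sketch of what I would expect a quote-on-reply feature to do: keep each line of the original message together and put a greater-than sign in front of it. It says nothing about how Yahoo! implements the feature; the function name and the 72-column wrap width are my own assumptions.

```python
import textwrap


def quote_reply(message, width=72):
    """Prefix every line of a message with '> ', wrapping long lines.

    width is an assumed maximum line length, including the '> ' prefix.
    """
    quoted = []
    for line in message.splitlines():
        if not line.strip():
            # Keep blank lines as a bare quote marker.
            quoted.append(">")
            continue
        # Wrap on word boundaries, leaving room for the two-character prefix.
        for piece in textwrap.wrap(line, width=width - 2):
            quoted.append("> " + piece)
    return "\n".join(quoted)


if __name__ == "__main__":
    print(quote_reply("Hello there.\n\nThis is a test message."))
```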
Yahoo! has rolled out a new search service with predictive search hints. As you type, it tries to guess what you’re searching for. It will also give you an ‘Explore Concepts’ tab, which gives you words to associate with the search you just performed. I was playing around with this and searched for my friend’s name, Chris Nekarda, and I discovered that Yahoo! pretty much has him nailed down.
Donuts are truly the gateway to understanding his soul. It does a slightly worse job with me. I don’t own anything made by Nikon.
A week ago I saw a posting on digg about how Dreamhost sucks. It got me thinking about the problems I’ve been having with my account with Site5. Let me say that at no point has my experience been anywhere near as bad as the one described in that link.
The number of users with logins doesn’t map one-to-one onto the number of websites served, but it’s probably a good estimate. All it takes is one of the 600 users with a bad webpage to clog up the machine.
Below is a plot of the load level for iaso over the course of 15 days last month. Without going into specifics, a load level of one means that there is one process needing a processor at any given moment. It will be many different processes, and that’s fine. Practically, a machine can stay responsive with a load of up to about three or four per processor. So, on a four-processor machine like iaso, a load level of 10-15 is about the highest comfortable level.
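To make the arithmetic concrete, here is a minimal Python sketch of the per-processor rule of thumb. The function names are made up for illustration, and the three-or-four-per-processor threshold is just my rule of thumb from above, not a hard limit.

```python
import os


def load_per_processor(load, processors):
    """Return the load level divided by the number of processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return load / processors


def comfortable_load_range(processors, low_per_cpu=3, high_per_cpu=4):
    """Return the (low, high) total load at which a machine is still responsive."""
    return processors * low_per_cpu, processors * high_per_cpu


def is_comfortable(load, processors, max_per_cpu=4):
    """True if the load is at or below the rule-of-thumb ceiling."""
    return load_per_processor(load, processors) <= max_per_cpu


def current_load_per_processor():
    """One-minute load average per processor on this machine (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return load_per_processor(one_minute, os.cpu_count() or 1)


if __name__ == "__main__":
    print(comfortable_load_range(4))   # four processors, like iaso
    print(load_per_processor(25, 4))   # the load at which my site stops working
    print(is_comfortable(25, 4))
```

For a four-processor machine the range works out to 12-16, which is in the same ballpark as the 10-15 I quoted.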
What you see above are many occasions when iaso went well above load levels of 25. The highest peak was a load of 230. In my experience, once the load reaches 25 my website becomes more than slow: it doesn’t work anymore.
For comparison, below is a plot of the same thing on one of the nodes of the supercomputer I use, Datastar. This is the node where scientists do heavy-duty analysis on their datasets. For instance, I use this node to process my multi-gigabyte datasets using IDL. People also run Mathematica and other very computationally intensive tasks on this machine. It’s got 32 1.7GHz Power4 processors and 256GB of RAM (what do you have on your workstation, huh?). It runs IBM’s AIX 5.3. As you can see below, for the first four days, the load level stays below one process per processor. On the fifth day something happens and it goes above 60 for a while, before the machine gets rebooted and things return to normal.
The kinds of processes that run on the two computers above are very different. However, the supercomputer is supposed to run big jobs and get beat on. A webserver isn’t. Any time the webserver’s load goes above 25, it’s roughly like the supercomputer’s load shooting to 256. At no time did the supercomputer shoot to 256, while the webserver goes above 25 many times. Of course, I’m comparing 15 days to 5 days in the two plots, but I think the differences are clear.
Site5 pays a third party to monitor their webservers, with results listed here. iaso has 99.8% uptime overall and 99.4% over the last month. This is bad enough that apparently I’m due a 5% credit on my next billing cycle. iaso isn’t even living up to Site5’s own service standards.
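Uptime percentages hide how much downtime they allow, so here is a small Python sketch that converts one into hours. The function name is made up, and the 30-day month is my assumption; the monitoring service may count the period differently.

```python
def downtime_hours(uptime_percent, period_days):
    """Return the hours of downtime implied by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * period_days * 24


if __name__ == "__main__":
    # 99.4% uptime over an assumed 30-day month
    print(round(downtime_hours(99.4, 30), 2))
```

Under that assumption, 99.4% over a month works out to a little over four hours of the site being unreachable.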
Every time I catch my website being slow, I contact Site5 tech support. I know that this is a common problem with shared hosting. I’m sure that Site5 is aware of these outages and does what it can when they happen. But when it does happen, it’s annoying. It shouldn’t happen in the first place. Sometime this summer, Site5 is changing their hosting solution, which may help with these problems. We’ll see.