Category: My Research

Posts about my research on the way to a PhD.

Back in San Diego (Temporarily)

I'm back in San Diego for the next week and a half in order to graduate. I defend in one week on the 27th. The photo above is of a tarantula that lives in the office that I used to sit in, and that I am sitting in again while I'm here. That is its full significance to this post.

If the tarantula isn't big enough above, you can make it bigger by clicking on the image!

SDSC Optiportal

My adviser, Professor Mike Norman, as part of his job at the San Diego Supercomputer Center, purchased an optiportal system for the new SDSC building, which is opening today. An optiportal system is a wall of monitors powered by networked computers such that the screens behave as one monitor. Very high resolution images can be tiled across the screens, as you can see below. Movies and animations can be tiled the same way.

IMG_5622

...

Cray Coolness

I'm back on the new supercomputer in Tennessee: the Cray XT4 Kraken. The coolest command on the computer, in my opinion, is xtshowcabs. Below is the (anonymized) output. This shows which job is running on each node (each node has one processor with four cores). The lower-case letters correspond to jobs listed at the bottom. Each vertical set of symbols (eight wide, twelve high) is a physical cabinet of nodes*.

What you see below is one job running on 8192 cores (a), another running on 4096 (h), one on 2048 (k) and a smattering of smaller jobs. As far as I can tell, the Size column in the job list counts nodes, so multiply by four to get cores. My jobs are i and j, running on 8 cores each. The computer is just about full here, about 96% usage.

This also allows me to know who to blame when my jobs are sitting waiting to start for days.

Compute Processor Allocation Status as of Fri Aug  1 18:14:16 2008

     C0-0     C0-1     C0-2     C0-3     C1-0     C1-1     C1-2     C1-3     
  n3 -------- -------- hhhhhhhh hhhhhhhh SSSaaaaa aaaaaaaa aaaaaaaa aaaaaaak 
  n2 -------- -------- hhhhhhhh hhhhhhhh    aaaaa aaaaaaaa aaaaaaaa aaaaaaak 
  n1 -------- -------- hhhhhhhh hhhhhhhh    aaaaa aaaaaaaa aaaaaaaa aaaaaaak 
c2n0 -------- -------- hhhhhhhh hhhhhhhh SSSaaaaa aaaaaaaa aaaaaaaa aaaaaaak 
  n3 SSSSS--- -------- hhhhhhhh hhhhhhhh SSSSaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2      --- -------- hhhhhhhh hhhhhhhh     aaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1      --- -------- hhhhhhhh hhhhhhhh     aaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c1n0 SSSSS--- -------- hhhhhhhh hhhhhhhh SSSSaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 SSSSSSSS -------- --lihhhh hhhhhhhh SSSSaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2          -------- --ljhhhh hhhhhhhh     aaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1          -------- --ljhhhh hhhhhhhh     aaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c0n0 SSSSSSSS -------- ---lihhh hhhhhhhh SSSSaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
    s01234567 01234567 01234567 01234567 01234567 01234567 01234567 01234567 

     C2-0     C2-1     C2-2     C2-3     C3-0     C3-1     C3-2     C3-3     
  n3 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c2n0 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c1n0 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c0n0 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
    s01234567 01234567 01234567 01234567 01234567 01234567 01234567 01234567 

     C4-0     C4-1     C4-2     C4-3     C5-0     C5-1     C5-2     C5-3     
  n3 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c2n0 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c1n0 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c0n0 hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
    s01234567 01234567 01234567 01234567 01234567 01234567 01234567 01234567 

     C6-0     C6-1     C6-2     C6-3     C7-0     C7-1     C7-2     C7-3     
  n3 hhhhcccc cccccccc bbbbbbbb gggggggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhcccc cccccccc bbbbbbbb gggggggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhcccc cccccccc bbbbbbbb gggggggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c2n0 hhhhhccc cccccccc bbbbbbbb gggggggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 hhhhhhhh cccccccc bbbbbbbb bbbbgggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh cccccccc bbbbbbbb bbbbgggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh cccccccc bbbbbbbb bbbbgggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c1n0 hhhhhhhh cccccccc bbbbbbbb bbbbbggg aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 hhhhhhhh cccccccc ccccbbbb bbbbbbbb aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 hhhhhhhh cccccccc ccccbbbb bbbbbbbb aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 hhhhhhhh cccccccc ccccbbbb bbbbbbbb aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c0n0 hhhhhhhh cccccccc cccccbbb bbbbbbbb aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
    s01234567 01234567 01234567 01234567 01234567 01234567 01234567 01234567 

     C8-0     C8-1     C8-2     C8-3     C9-0     C9-1     C9-2     C9-3     
  n3 ggggffff ffffffff eeeeeeee dddddddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 ggggffff ffffffff eeeeeeee dddddddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 ggggffff ffffffff eeeeeeee dddddddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c2n0 gggggfff ffffffff eeeeeeee dddddddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 gggggggg ffffffff eeeeeeee eeeedddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 gggggggg ffffffff eeeeeeee eeeedddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 gggggggg ffffffff eeeeeeee eeeedddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c1n0 gggggggg ffffffff eeeeeeee eeeeeddd aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n3 gggggggg ffffffff ffffeeee eeeeeeee aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n2 gggggggg ffffffff ffffeeee eeeeeeee aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
  n1 gggggggg ffffffff ffffeeee eeeeeeee aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
c0n0 gggggggg ffffffff fffffeee eeeeeeee aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa 
    s01234567 01234567 01234567 01234567 01234567 01234567 01234567 01234567 

     C10-0    C10-1    C10-2    C10-3    C11-0    C11-1    C11-2    C11-3    
  n3 ddddkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
  n2 ddddkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
  n1 ddddkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
c2n0 dddddkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
  n3 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
  n2 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
  n1 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
c1n0 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk aaaaaaaa aaaaaaaa 
  n3 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkXkkk kkkkkkkk kkkkaaaa aaaaaaaa 
  n2 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkaaaa aaaaaaaa 
  n1 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk Xkkkkkkk kkkkaaaa aaaaaaaa 
c0n0 dddddddd kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk kkkkXkkk kkkkaaaa aaaaaaaa 
    s01234567 01234567 01234567 01234567 01234567 01234567 01234567 01234567 

Legend:
   nonexistent node                 S  service node
;  free interactive compute CNL     -  free batch compute node CNL
A  allocated, but idle compute node ?  suspect compute node
X  down compute node                Y  down or admindown service node
Z  admindown compute node           R  node is routing

Available compute nodes:       0 interactive,   149 batch

ALPS JOBS LAUNCHED ON COMPUTE NODES
Job ID     User       Size   Age              command line
--- ------ --------   -----  ---------------  ----------------------------------
 a  155793 xxxxxx      2048  9h00m            xxxxxxxx
 b  156058 xxxxxxxx     128  0h50m            xxxxxxxx
 c  156060 xxxxxxxx     128  0h50m            xxxxxxxx
 d  156062 xxxxxxxx     128  0h49m            xxxxxxxx
 e  156064 xxxxxxxx     128  0h49m            xxxxxxxx
 f  156066 xxxxxxxx     128  0h48m            xxxxxxxx
 g  156068 xxxxxxxx     128  0h48m            xxxxxxxx
 h  156080 xxxxxxx     1024  0h33m            xxxxxxxx
 i  156085 sskory         2  0h22m            enzo.exe
 j  156087 sskory         2  0h21m            enzo.exe
 k  156089 xxxxxxxx     512  0h04m            xxxxxxxx
 l  156091 xxxx           4  0h02m            xxxxxxxx

(* I'm pretty sure that's the layout. I could be wrong, so don't invade a medium-sized oil-producing country based on that intelligence.)
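
The "about 96%" figure can be roughly checked against the listing itself. Here is a minimal sketch in Python, assuming that the Size column counts nodes (which matches the core counts quoted above at four cores per node) and that usage means allocated nodes divided by allocated plus free batch nodes, ignoring the few down nodes. The function and variable names are mine and are not part of xtshowcabs.

```python
# Job sizes in nodes, copied from the "ALPS JOBS LAUNCHED" table above.
# Assumption: the Size column is a node count, not a core count.
job_sizes = {
    "a": 2048, "b": 128, "c": 128, "d": 128, "e": 128, "f": 128,
    "g": 128, "h": 1024, "i": 2, "j": 2, "k": 512, "l": 4,
}

CORES_PER_NODE = 4       # four cores per node, as stated in the post
FREE_BATCH_NODES = 149   # from the "Available compute nodes" line


def cores(job):
    """Cores used by a job, given its size in nodes."""
    return job_sizes[job] * CORES_PER_NODE


def utilization(sizes, free_nodes):
    """Fraction of usable compute nodes that are allocated.

    Assumption: usable nodes = allocated + free batch nodes
    (down nodes are left out of the total).
    """
    allocated = sum(sizes.values())
    return allocated / (allocated + free_nodes)


if __name__ == "__main__":
    for job in ("a", "h", "k", "i", "j"):
        print(job, cores(job), "cores")
    print(f"usage: {utilization(job_sizes, FREE_BATCH_NODES):.1%}")
```

Under those assumptions the listing has 4360 allocated nodes and 149 free ones, which comes out a little under 97%, in line with the estimate above.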

Supercomputers

Faithful readers will remember me posing with my favorite supercomputer about a year ago. Datastar is going to be turned off in a few months. When it was turned on three years ago, it was the 35th fastest computer in the world; it has since slipped to 473rd. Despite the fact it's no longer the fastest thing around, it works wonderfully, and as I write this, there are at least sixty people logged onto this machine. Everyone I know loves Datastar, and wishes it wasn't going to be turned off. I am starting to move my work and attention to the newer machines. They are faster and have many more processors, which makes queue times short (the queue time is how long a job I request waits before it runs).


Ranger (credit)

A few months ago, Ranger was turned on. It is a Sun cluster in Texas with 63,000 AMD CPU cores. It is currently ranked fourth fastest in the world. Datastar has only 2528 CPUs (but those are real CPUs, while Ranger has multi-core chips, which in reality aren't as good). By raw numbers, Ranger is an order of magnitude better than Datastar, except that Ranger doesn't work very well. Many different people are seeing memory leaks using vastly different codes. These codes work well on other machines. I have yet to be able to run anything at all on Ranger. For all intents and purposes, Ranger is useless to me right now.

If you look at the top of the list of supercomputers, you'll see that a machine called Roadrunner is the fastest in the world. Notice that it is made up of both AMD Opteron and IBM Cell processors. The Cell processor is the one inside Playstation 3s. Having two kinds of chips adds a layer of complexity, which makes the machine less useful. The Cell processor is a vector processor, which is only awesome for very specially written code. The machine is fast, except it's also highly unusable. I don't have access to it because it's a DOE machine, but a colleague has tried it and says he got under 0.1% of peak theoretical speed out of it. Other people were seeing similar numbers. No one ever gets 100% from any machine, but 0.1% is terrible.


A Kraken

Computers two and three on the list are DOE machines, so I don't have access to them. On the near horizon is a machine called Kraken, in Tennessee. It's being upgraded right now, but when it's complete it will be very similar to, but faster than, the computer currently fifth on the list, called Jaguar. It is a Cray XT4 that runs AMD Opteron chips. I got to use Kraken recently while it was still an XT3, and it was awesome. Unlike Ranger, it actually works. As an XT4, it should be even faster than Ranger. It will also have a great tape backup system, unlike Ranger.

I am predicting that Kraken will become my new favorite supercomputer, replacing Datastar. However, I think it's a shame that Datastar is being turned off even though it's still very useful and popular. When it's turned off to make way for machines like Ranger and Roadrunner(*), that's just stupid.

(*) The pots of money for Ranger, Datastar and Roadrunner are different, but you get the point. Supercomputers aren't getting better; in some cases, they're getting worse!

OpenMP

I'm all about graphs lately...

The graph above shows the speedup that a few OpenMP statements can give with very little effort. OpenMP is a simple way to parallelize a C/C++ program so that it runs on many processors at once. However, unlike MPI, which can run across many different machines (like a cluster), OpenMP can only run on one computer at a time. Since most new machines have multiple processors (or cores), OpenMP is quite useful.

I've added a couple dozen OpenMP statements to the code I'm working on. The blue line shows how long (in seconds) it took me to run a test problem on between one and 32 processors. The green line shows the speedup compared to running on a single processor, as a ratio of run times. It is very typical of parallel programs that the speedup isn't linear and flattens out at high thread counts. This small test problem deviates from linear speedup at 16 processors; when I do a real run (which will be much larger and the parallelization more efficient) I may see nearly linear speedups all the way to 32 processors.
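
For anyone who wants to make the same kind of plot, here is a minimal sketch in Python of the two quantities involved. The speedup is the single-processor time divided by the time on p processors, which is what the green line shows. The Amdahl's law function is a standard textbook model that I'm adding for comparison; it is not something I fit to my runs, and the parallel fraction of 0.95 is purely an assumed value chosen to show how the curve flattens.

```python
def speedup(t_serial, t_parallel):
    """Speedup as a ratio of run times: T(1) / T(p)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Speedup per processor; 1.0 would be perfectly linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs


def amdahl_speedup(parallel_fraction, n_procs):
    """Amdahl's law: ideal speedup when only part of the run time
    can be spread across processors and the rest stays serial."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


if __name__ == "__main__":
    # ASSUMPTION: 95% of the run time is parallelizable. This is an
    # illustrative value, not a measurement from my code.
    assumed_fraction = 0.95
    for p in (1, 2, 4, 8, 16, 32):
        s = amdahl_speedup(assumed_fraction, p)
        print(f"{p:2d} processors: speedup {s:5.2f}, efficiency {s / p:4.2f}")
```

To use it with real measurements, pass your own timings to speedup() and efficiency(). A larger problem usually spends a bigger share of its time in the parallel loops, which is why I expect the real runs to stay closer to linear.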

I think it's pretty neat how with very little effort I was able to significantly speed up my code. If you have a little programming experience, you can take a look at some simple OpenMP examples and see for yourself just how easy OpenMP is.