March 2008

March 31, 2008

UniCluster 3.2 Released

The grid.org team has just declared UniCluster 3.2 to be stable.

  • Users can now install UniCluster Express over an existing Grid Engine installation.
  • Expanded platform and operating system support, including native 64-bit system support.
  • The installation directory and service user are now created at installation, if they do not exist already.
  • Removed several installation prerequisites, making installation easier and faster.
  • Maintenance in the form of defect repairs. Refer to bugzilla for specific details.

In particular, being able to install over an existing Grid Engine installation is super cool. This is a feature that I've been excited about for a long time, as it brings Globus, Ganglia and the UniCluster monitoring application to the existing 10,000 or so Grid Engine clusters.

Download

March 20, 2008

Globus Selected as a Google Summer of Code 2008 Mentoring Organization

The Globus Alliance has been selected as a Google Summer of Code 2008 mentoring organization. Google Summer of Code (GSoC) is a program that offers student developers stipends to write code for various open source projects. Google works with several open source, free software, and technology-related groups to identify and fund several projects over a three-month period. Historically, the program has brought together over 1,500 students with over 130 open source projects to create millions of lines of code. The program, which kicked off in 2005, is now in its fourth year.

If you are a student and would be interested in participating in GSoC with Globus as your mentoring organization, please take a look at our GSoC Ideas page. This page lists projects that Globus has proposed for GSoC, but it is not a closed list. If you have an idea for a cool project that uses or extends Globus technologies, please take a look at our list of Globus GSoC mentors and contact the one who most closely matches your interests. Keep in mind that student proposals must be submitted by March 31st and that you must meet Google's student eligibility criteria.

If you have any questions about our participation in GSoC, please contact the Globus GSoC administrators.

March 18, 2008

All Jobs Are Not Created Equal

Choosing distributed resource management (DRM) software may not be a simple task. There are a number of open source and commercial software packages available, and companies usually go through a product evaluation phase in which they consider factors like software license and support costs, maintenance issues, their own use cases, and existing or planned infrastructure. After following this (possibly lengthy) procedure, making the decision, and purchasing and installing the product, you should also make sure that the DRM software configuration fits your cluster usage and needs. In particular, designing an appropriate queue structure and configuring resources, resource management and scheduling policies are some of the most important aspects of your cluster configuration.

At first glance, devoting your company's resources to something like queue design might seem unnecessary. After all, how can one go wrong with the usual "short", "medium" and "long" queues? However, the bigger your organization is and the more diverse the computing needs of your users are, the more likely it is that you would benefit from investing some time in designing and implementing queues more efficiently.

My favorite example here involves high priority jobs that must be completed in a relatively short period of time, regardless of how busy the cluster is. Such jobs must be allowed to preempt computing resources from lower priority jobs that are already running. Better DRMs usually allow for such a use case (e.g., by configuring "preemptive scheduling" in LSF, or using "subordinated queues" in Grid Engine), but this is clearly something that has to be well thought through before it can be implemented. In any case, when configuring DRM software, it is important to keep in mind that not all jobs (or not all users, for that matter) are created equal...
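To make the preemption idea a bit more concrete, here is a toy Python sketch. It is not LSF or Grid Engine configuration and does not reflect how either product schedules internally; the `Job` and `ToyCluster` names, the "higher number wins" priority convention, and the "resume suspended jobs before starting pending ones" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int  # assumed convention: higher number = more important


class ToyCluster:
    """Toy model of a cluster with a fixed number of job slots."""

    def __init__(self, slots):
        self.slots = slots
        self.running = []    # jobs currently holding a slot
        self.suspended = []  # jobs preempted to make room for others
        self.pending = []    # jobs still waiting for their first slot

    def submit(self, job):
        if len(self.running) < self.slots:
            self.running.append(job)
            return
        # The cluster is full: find the least important running job.
        victim = min(self.running, key=lambda j: j.priority)
        if job.priority > victim.priority:
            # Preempt: suspend the victim and hand its slot to the new job.
            self.running.remove(victim)
            self.suspended.append(victim)
            self.running.append(job)
        else:
            self.pending.append(job)

    def finish(self, job):
        self.running.remove(job)
        # Policy choice in this sketch: resume suspended work first,
        # and only then start jobs that have never run.
        for queue in (self.suspended, self.pending):
            if queue:
                nxt = max(queue, key=lambda j: j.priority)
                queue.remove(nxt)
                self.running.append(nxt)
                return


cluster = ToyCluster(slots=2)
nightly = Job("nightly-regression", priority=1)
report = Job("weekly-report", priority=2)
urgent = Job("urgent-fix", priority=10)
for job in (nightly, report, urgent):
    cluster.submit(job)

print([j.name for j in cluster.running])    # ['weekly-report', 'urgent-fix']
print([j.name for j in cluster.suspended])  # ['nightly-regression']
```

Even in a model this small, the questions that need to be thought through show up quickly: which job gets suspended, what happens to it afterwards, and who is allowed to submit at the high priority in the first place.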

March 11, 2008

Are You Looking for a Grid Job?

My team is looking for some folks to join up and help us bring grid technology to our customers. Drop me a line!

All of Your Data in One Basket

I once worked with someone whose programs wrote all of their output to a single file. Once these programs were put into the grid environment, they would routinely create files that were hundreds of gigabytes in size. Nobody considered this to be a problem because the space was available and the SAN not only supported files of that size, but also performed amazingly well considering the expectations. While this simplifies the code and data management, there are a number of reasons why this is not a good practice.

  • You don’t always need all of the output data at once. With one huge file, moving just a piece from the grid to your desktop for testing would not even be a consideration.
  • The amount of computation time needed to recreate a huge file is significant.
  • There is no easy way to use multiple threads for writing and/or reading data.
  • Moving files across the network takes a lot more time.
  • A file can only be opened in read-write mode by one process at a time. One large file is going to block a lot more modification operations than several smaller files.
  • Backing the file up is remarkably more difficult.  You cannot just burn it to a DVD so it has to be sent to disk or to tape.  If you need to restore a file it can take a significant amount of time.
  • Your file is going to be severely fragmented on the physical drives and therefore will cause increased seek times.
  • You can no longer use memory-mapped files.
  • Performing a checksum on a large file takes forever.
  • Finally, if you had properly distributed the job across the Grid, you should not have such large files!!!
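For contrast, here is a minimal Python sketch of the alternative: splitting the output into numbered shard files, each with its own checksum. The function names, the `part-00000.txt` naming scheme and the line-per-record format are assumptions made up for this example, not anything from the programs described above.

```python
import hashlib
import tempfile
from pathlib import Path


def write_shards(records, out_dir, records_per_shard):
    """Write records to numbered shard files and return {filename: sha256}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    checksums = {}
    for start in range(0, len(records), records_per_shard):
        chunk = records[start:start + records_per_shard]
        data = "".join(line + "\n" for line in chunk).encode("utf-8")
        # Hypothetical naming scheme: part-00000.txt, part-00001.txt, ...
        path = out_dir / f"part-{start // records_per_shard:05d}.txt"
        path.write_bytes(data)
        checksums[path.name] = hashlib.sha256(data).hexdigest()
    return checksums


def verify_shard(path, expected_sha256):
    """Re-read one shard and compare its checksum with the recorded one."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest() == expected_sha256


with tempfile.TemporaryDirectory() as tmp:
    records = [f"result {i}" for i in range(10)]
    manifest = write_shards(records, tmp, records_per_shard=4)
    print(sorted(manifest))  # ['part-00000.txt', 'part-00001.txt', 'part-00002.txt']
    print(all(verify_shard(Path(tmp) / name, digest)
              for name, digest in manifest.items()))  # True
```

With shards like these, a single piece can be copied to a desktop, checksummed, backed up or regenerated on its own, and different processes can work on different shards at the same time.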

Why would anybody do such a thing?  All your data are belong to us?

March 05, 2008

Four Reasons to Attend the Open Source Grid and Cluster Conference

We're combining the best of GlobusWorld, Grid Engine Workshop and Rocks-a-Palooza into one killer event in Oakland this May. Here's why you should come to the Open Source Grid and Cluster Conference:

  • Great Speakers: We're going to have the rock stars of the grid world speaking and teaching.
  • Great Topics: Dedicated tracks to each of the communities being hosted.
  • Community Interaction: The grid community is spread all over the world; this will be a meeting place to get face time with the people you know by name only.
  • You Can Speak: We're currently accepting agenda submissions for 90-minute panels and sessions.

This should be a fantastic conference. I look forward to meeting you there.

March 03, 2008

Grid vs Clouds? Who can tell the difference?

The term "cloud computing" seems to be attracting lots of attention these days. If you google it, you'll find more than half a million results, starting with Wikipedia definitions and news involving companies like Google, IBM, and Amazon. There is definitely no shortage of blogs and articles on the subject. While reading some of those, I've stumbled upon an excellent post by John Willis, in which he shares what he learned while researching the "clouds".

One interesting point from John's article that caught my eye was his view of virtualization as the main distinguishing feature of "clouds" with respect to the "old Grid Computing" paradigm ("Virtualization is the secret sauce of a cloud."). While I do not disagree that virtualization software like Xen or VMware is an important part of today's commercial "cloud" providers, I also cannot help noticing that various aspects of virtualization were part of grid projects from their beginnings. For example, SAMGrid, one of the first data grid projects, which has served (and still serves!) several of Fermilab's High Energy Physics experiments since the late 1990s, allowed users to process data stored at multiple sites around the world without requiring them to know where the data would be coming from or how it would be delivered to their jobs. In a sense, from a physicist's perspective, experiment data was coming out of the "data cloud". As another example, the "Virtual Workspaces Service" has been part of the Globus Toolkit (as an incubator project) for some time now. It allows an authorized grid client to deploy an environment described by the workspace metadata on a specified resource. The types of environments that can be deployed using this service range from an atomic workspace to a cluster.

Although I disagree with John's view on the differences between the "old grid" and "new cloud" computing, I still highly recommend the above-mentioned article, as well as his other posts on the same subject.