December 2010 Archives

We live in a world of rapidly moving technology trends and fast-paced consolidation. What will the data centre look like in the coming years, as we begin to deploy these new concepts? The signs are that there is plenty of room for improvement in the way that we manage our equipment, and vendors are inventing ways to help us.

We have spent the past few years consolidating servers to virtualise our computing capacity. Some of us have also virtualised storage through the use of storage area networks (SANs). A few people are also starting to address data centre networking, using high-speed, lossless Ethernet to carry everything from Fibre Channel to iSCSI and virtualised server traffic over the same backbone.

But the real magic will happen when we begin to put all these things together. For example, storage management tools have until now focused largely on managing physical units and the logical unit numbers (LUNs) they support. They haven't particularly addressed virtualised servers. Similarly, the hypervisor management software used to shunt virtual machines around hasn't been developed with storage management in mind. More often than not, administrators will buy separate tools to accomplish these tasks well.

And then there are the links that tie them together. Network management software focuses more on maintaining link quality than on acknowledging the other two pillars of the data centre environment.

Atop all this sits the application portfolio, which is what the computing infrastructure is there to support in the first place. And yet, for the most part, the data centre infrastructure isn't intimately aware of application performance. Just ask the data centre manager for a service level agreement on email delivery or screen response time, and see what they say.

The consolidated data centre will change this. The idea is that all these pillars become highly aware of the others. Storage resource management software will begin to understand the data that particular computer servers need to maintain performance. Management of the virtualised servers will be tied intimately to application performance. And it will be done, ideally, in an environment where everything has been consolidated, and where inefficiencies and over-provisioning problems have been driven out of the system.

It's a nice idea. Cisco has been working on it extensively as part of its UCS initiative (and annoying server vendors in the process, by muscling into their markets). Conversely, server experts such as HP (with its acquisition of 3Com and 3PAR) and Dell (with its $1 billion acquisition of Compellent, and other purchases such as EqualLogic) are muscling into the networking and storage markets. And IBM? Well, IBM has pretty much two of everything.

These vendors are growing through acquisition and fleshing out their portfolios because they want to be able to offer all parts of the data centre portfolio to their customers, and to integrate them effectively to add value.

Some vendors have gaps in their portfolios. Cisco's home-grown storage presence is non-existent, for example. But in such cases, they are serving the market through partnerships. Cisco and EMC have been in bed together for some time. This gives Cisco access to EMC's storage and virtualisation expertise, while EMC gets to embrace Cisco's networking prowess.

This idea of a consolidated, high-performance data centre, in which storage, servers, virtualised operating systems and applications all talk to each other over a single, standardised high-speed transport layer, is utopian. But deals and partnerships such as these show that the vendors are committed to making it happen. It will take a while to emerge, especially as cash-constrained companies still reeling from the financial crisis are unwilling to rip and replace legacy equipment.

IT professionals working in these environments would do well to watch for this coming trend. It will change the skill sets needed to tackle administrative tasks in tomorrow's data centre. Will your CV reflect what is needed?

I'm a fan of agile approaches when they fit your project. But that doesn't mean I never use other approaches to projects.

I often use iterative approaches when I want to try something and experiment for a while. I can start with a small part of the system and iterate on that part until I understand it. If I'm the project manager, I can ask a team of people to do that and show me the results.

I like incremental approaches too, where we build a piece of the system and then another piece and then another until we are done. I like seeing the system come together whether I'm using timeboxes or not.

Demos serve two purposes in any project: to see visible progress and to acquire feedback. If you're using a serial lifecycle, build in demos at least every three months. That way, you can't go too long without any feedback about the product, or feedback about the general architecture of the product.

If you use an iterative approach, consider a demo every two to four weeks. If you are iterating in longer cycles, your iterations are too long--you are trying to do too many experiments for the time. Break your experiments into smaller chunks, and then demo. (BTW, a strictly iterative lifecycle does not necessarily have timeboxes; you answer questions each iteration and then decide what to do. You want to make sure you don't have too many questions to answer for each iteration.)

With an incremental approach, you're working feature-by-feature, or feature set-by-feature set. I almost had a disastrous program when I used an incremental approach, and let the increments get too big. We waited almost two months for a demo. The project team was so pleased--and the product manager was not. We made our increments smaller and gave more frequent demos.

So, no matter what kind of project you have, consider a demo as part of your project management. You'll see the team's progress, the team will see its progress, and the product manager, sponsor, customer, whoever you've got will see what the team has done. Wins all the way around!


Yesterday I attended Cloudstock, a pre-conference event put on by Salesforce.com as part of its Dreamforce event in San Francisco.

One of the speakers was Ryan Dahl, the author of Node.js. He is smart and opinionated and gave a thought-provoking introduction to his project. He has a couple of core arguments. The first is that I/O is slow, and therefore all I/O calls should be non-blocking.

 

The point here is that access to the disk or the network is orders of magnitude slower than access to memory. A few milliseconds may not matter much in a single-user desktop application, but if you are coding a server with many concurrent users it becomes important.

You see this problem magnified many times in web pages that allow calls to remote servers to block the loading of a page. Web developers have learned to use callbacks for things like database queries that can take a while to return. Clients like Flash and Silverlight have no other mechanism for web service calls; doing these via callbacks is built in.
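To make the non-blocking idea concrete, here is a minimal sketch. It is not Node.js code: it uses Python's asyncio coroutines rather than JavaScript callbacks, and `fake_query` is a hypothetical stand-in for a slow database or network call. The point it illustrates is that three slow calls issued without blocking should finish in roughly the time of the slowest one, not the sum of all three.

```python
import asyncio
import time


async def fake_query(name, delay):
    """Hypothetical stand-in for a slow database or network call."""
    # Awaiting hands control back to the event loop instead of
    # blocking the thread while the "I/O" is in flight.
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_all():
    """Start three slow calls together and wait for all of them."""
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_query("db", 0.2),
        fake_query("cache", 0.1),
        fake_query("api", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_all())
    print(results, f"{elapsed:.2f}s")
```

Run one after another with blocking calls, the same three waits would add up to about half a second.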

Dahl's second point is that threading does not scale well. He showed a slide comparing the performance of Apache with that of Nginx as the number of concurrent connections increases. It showed Apache's memory usage growing while that of Nginx hardly changed. Dahl said that Apache uses a separate thread for each connection, but Nginx uses an event model, which is why it performs better under heavy load.

Dahl says threads consume too many resources and complicate coding because you have to worry about synchronisation. He adds that Nginx will "probably take the place of Apache" because of its better performance.
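The event model Dahl describes can be sketched in a few lines. This is an illustration only, not how Nginx or Node.js is implemented: it uses Python's standard `selectors` module, and in-process socket pairs stand in for real client connections. The function name `echo_upper_single_thread` is made up for the example. One thread waits for whichever connection is ready and serves it, so there is no thread per connection and nothing to synchronise.

```python
import selectors
import socket


def echo_upper_single_thread(messages):
    """Serve several connections from one thread using readiness events."""
    if not all(messages):
        raise ValueError("messages must be non-empty strings")

    sel = selectors.DefaultSelector()
    clients, servers = [], []
    for msg in messages:
        # socketpair() is an in-process stand-in for an accepted TCP connection.
        client, server = socket.socketpair()
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ)
        clients.append(client)
        servers.append(server)
        client.sendall(msg.encode())

    pending = len(servers)
    while pending:
        # Wait for any connection to become readable, then handle
        # each ready connection in turn on this same thread.
        for key, _events in sel.select(timeout=1.0):
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            pending -= 1

    replies = [client.recv(1024).decode() for client in clients]
    for sock in clients + servers:
        sock.close()
    sel.close()
    return replies


if __name__ == "__main__":
    print(echo_upper_single_thread(["alice", "bob", "carol"]))
```

The per-connection cost here is a socket and a registration with the selector, rather than a thread and its stack.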

Put these two things together and you get Node.js, which is described on its home page as "Evented I/O for V8 JavaScript". V8 is the high-performance JavaScript engine in the Google Chrome browser.

Node.js is a binding for V8 that runs on Mac, Linux, Solaris or, with a bit of effort, Windows. You would typically use it for real-time server applications that need to support many concurrent users. Dahl actually coded a chat server during the session, though our attempts to connect were defeated by the firewall.

In an aside, Dahl calls HTTP "the worst protocol on earth". He adds that we are now stuck with HTTP/1.1 for ever. Node.js uses chunked transfer encoding (the "Transfer-Encoding: chunked" header) to overcome some of its limitations.
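For readers who have not met chunked transfer encoding, it lets a server start sending a response body before it knows the total length. The sketch below shows the wire format as defined for HTTP/1.1: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-size chunk ends the body. It is not taken from Node.js, and `encode_chunked` is a name invented for this example.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as an HTTP/1.1 chunked body."""
    body = b""
    for part in parts:
        if not part:
            # Skip empty pieces: a zero-length chunk marks the end of the body.
            continue
        size_line = f"{len(part):x}".encode("ascii")
        body += size_line + b"\r\n" + part + b"\r\n"
    # Terminating chunk (size 0) followed by an empty trailer section.
    return body + b"0\r\n\r\n"


if __name__ == "__main__":
    print(encode_chunked([b"Hello, ", b"world"]))
```

Because each piece carries its own length, a chat or streaming server can keep writing pieces as events arrive and close the body whenever it chooses.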

Another of his points is that we are too tied to web servers. "There's all these weird concepts in web stacks, why not just speak HTTP to things?"

Node.js has great performance, but it is low-level, and further wrappers and libraries are appearing to make it more productive to use. What might it become in future? After the talk had ended, Dahl took some further questions and admitted that he would love to see it become the next PHP.

A great session; and whether or not you agree with all Dahl's remarks they are a welcome reminder that despite all the effort that has gone into the computing platforms we use constantly, there are things that could be done much better.

Node.js is in effect public domain software, says Dahl. It is not fully done yet, but in six months or so the API will be stable. Well worth a look. So, for that matter, is Nginx.
