Azure Stack – my take 01-18

Introduction

In the first few months after Azure Stack was announced, there was quite a bit of buzz around what it promised.

A true hybrid cloud experience, allowing workloads to move seamlessly between public Azure and your private Azure Stack data centre.

If anybody could deliver this, you’d think Microsoft could.

Later than expected, it has now reached General Availability. This post takes a look at a couple of factors that I believe are key to the success of Azure Stack.

Scalability

Azure Stack is a fixed-size hyper-converged platform. That is, the compute, storage and networking are tightly integrated with an overlaid software architecture. The fixed aspect refers to the fact that when you buy a Stack, you are buying a fixed number of nodes, e.g. 4, 8 or 12.

I’m not a massive fan of hyper-converged infrastructure unless it’s dedicated to a well known workload that you can scale the nodes to. As soon as you put inconsistent or unpredictable workloads on there, you run the risk of, as an example, having to buy a new node (with all that compute, RAM and storage) just for the additional storage, even though your CPU and RAM utilisation might only be at 30% and 60% respectively. You can’t just buy more storage.
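To put rough numbers on that, here is a minimal Python sketch. The per-node capacities and the workload figures are made up for illustration and are not Azure Stack specifications. The only point is that the scarcest resource dictates the node count.

```python
import math

# Hypothetical per-node capacity; illustrative figures, not real Azure Stack specs.
NODE = {"cpu_cores": 32, "ram_gb": 512, "storage_tb": 20}


def nodes_needed(demand):
    """Nodes required so that every resource in `demand` fits."""
    return max(math.ceil(demand[r] / NODE[r]) for r in NODE)


def utilisation(demand, nodes):
    """Percentage of each resource in use across `nodes` nodes."""
    return {r: round(100 * demand[r] / (NODE[r] * nodes), 1) for r in NODE}


# An assumed storage-heavy workload: modest CPU and RAM, lots of disk.
demand = {"cpu_cores": 48, "ram_gb": 1536, "storage_tb": 90}

n = nodes_needed(demand)
print(n, utilisation(demand, n))
```

With these assumed figures, storage alone forces five nodes, while CPU and RAM sit at 30% and 60% across the cluster.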

For me, one of the key definitions of cloud is scalability and flexibility. If you have an 8 node cluster, you don’t want to have those nodes sitting at 40% utilisation. You want them at near capacity, taking N+1 into account.
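As a rough guide to what "near capacity, taking N+1 into account" means, the sketch below works out the highest average utilisation a cluster can run at while still fitting on the surviving nodes if one fails. It assumes identical nodes and load that can be spread evenly across them, which is a simplification.

```python
def n_plus_one_ceiling(nodes):
    """Highest average utilisation (%) that still fits on nodes - 1 after one node fails.

    Assumes identical nodes and evenly redistributable load.
    """
    if nodes < 2:
        raise ValueError("N+1 needs at least two nodes")
    return round(100 * (nodes - 1) / nodes, 1)


for size in (4, 8, 12):
    print(size, n_plus_one_ceiling(size))
```

Under those assumptions the ceiling is 75% for 4 nodes, 87.5% for 8 and roughly 91.7% for 12, so smaller clusters give up proportionally more capacity to N+1.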

I feel that the ‘pod’ approach that Azure Stack takes amplifies this problem even more. You can’t currently buy a 4 node pod and add another node when the cluster fills up. You need to buy another pod. That doesn’t come cheap.

I wonder, once the platform matures further, if Microsoft will allow single nodes to be added. It would mitigate the concern, but you are still limited to specific workloads if you aren’t going to be wasteful with your resources.

Feature parity

The big promise of hybrid cloud is being able to run workloads in either your private data centre or the public cloud (Azure Stack and public Azure for the purpose of this post) and migrate them freely between the two, with a consistent experience regardless of where they are running.

That sounds like hybrid nirvana, but the current reality is less enticing. Stack was always going to deliver a subset of public Azure, but as far as I’m concerned the feature gap in its current state breaks the hybrid promise.

The biggest difference is with the PaaS services. There are a number of hoops to jump through to enable any level of PaaS, and some of those services require licensing additional VMs on the Stack to run them, although that latter point doesn’t really come as a surprise. For me, however, PaaS is where the real benefits of migrating to cloud are reaped, so this feels like a big bump in the road as it stands. A number of resource providers appear to be missing too. Again, I’m hopeful that as the platform matures, the capabilities gap will narrow considerably.

A great example of a hybrid cloud use case is being able to fail your workloads over from your private data centre to the public cloud for DR. You can currently do this with Azure Stack, but to fail the workloads back to your private Stack, you need to lift and shift them manually. This feels very much like lock-in to my sceptical mind. I can almost hear Admiral Ackbar shouting his warning.

Microsoft are also not offering any SLA on Azure Stack at the current time.

Some of these shortcomings are likely to change over time but the key theme here for me is, what is the use case for purchasing Azure Stack? With no SLA, would you run your production on there? Would you use it for development on what is essentially a hobbled platform?

Summary

The idea of having a common interface to manage all your workloads, regardless of where they are hosted, is very appealing for obvious reasons. However, in its current incarnation, I can’t see a compelling reason to dive into Azure Stack, although I have no doubt that over the next 1-3 years it will mature into something that will genuinely be a game changer.

Have you deployed Azure Stack? If so and assuming you aren’t just talking about ASDK (the development kit that allows you to install Azure Stack on any tin), I’d love to hear what types of workloads you are running. How have you dealt with the shortcomings listed above? I’d love for you to reach out either on here or at my Twitter account to have a discussion.

Till the next time.

Cisco Live London 2012 Day 1

First of all, WOW. The vibe at Cisco Live London 2012 is quite amazing. A two minute walk from the Prince Regent DLR stop takes you into the ExCeL exhibition centre. The registration process was over in another two minutes, and the first souvenir of the week, the obligatory CL backpack, was in hand.

Need to look for a new laptop to fit…
Vendor stalls at the back, Meet the Engineer pods in white

The technical seminar I had signed up for was the ‘catchy’ sounding ‘TECVIR-2002 Enabling the Cloud: Data Center Virtualization – Applications, Compute, Networking and Best Practices’.

The three presenters over the day, which stretched to nine hours, were Carlos Pereira, Santiago Freitas and Ray O’Hanlon. Each had his own style, but all were very capable presenters who kept me engaged through the individual parts, which ran up to two hours each. Carlos in particular was a natural, and the demonstrations given by Santiago were nothing short of breathtaking.

From the left: Santiago Freitas, Carlos Pereira and Ray O’Hanlon

I did wonder whether nine hours was enough to cover the broad range of topics in any real depth, but these guys have done this before and the fluff was kept to a minimum, at least for the first half of the day. Any attempt by me to judge the quality in the afternoon would be futile, as I was just trying to understand as much as I could, even with the slides to refer back to.

FabricPath, UCS, OTV, LISP, FCoE and VXLAN all got good representation, along with, of course, how they relate to ‘the cloud’. I am thoroughly relieved to know that my idea of what cloud is matches Cisco’s fairly well. Note that this post is a general overview of the day. If you want to learn about the specifics of these technologies, there are already plenty of online resources which do a better job than I could at this stage…my head is still, at 22:30, filing away whatever it can remember. Where it was evident that a topic could have been turned up further on the nerd meter to 12, references were made to the specific technical sessions later in the week with a suggestion to attend. Despite having swapped my schedule about several times in the preceding weeks, I think tonight will see yet another juggle!

What I liked today was that nobody’s knowledge level was taken for granted. The presenters were very good at sensing when something being discussed needed more depth…probably from the furrowed brows around the room. It was also amusing that some people were using today as a ‘how do I fix this issue in my production network’ session.

Matt’s takeaways

Firstly, I still struggle to see what questions a lot of the new technologies are trying to answer. For example, take OTV, please (OK, old joke). After discussing the innards of this technology, a quick poll around the room to count the number of people who were extending their layer 2 domain across physical sites caused one slightly shaky hand to rise. And it seemed that nobody was going to return to the office next week to implement it.

Secondly, as Bob Dylan said, the times they are a-changin’. Networking is undergoing a huge metamorphosis, unlike anything I’ve seen in my years in IT. Love it or loathe it, cloud is here to stay and it’s going to take a whole new skillset just to understand it, let alone plan, design, implement and operate it. The current standard of logging on to 50 ToR switches to configure them individually could very well be coming to an end as the control plane is centralised. Add a super smart management platform on top and productivity has the potential to go through the roof. That’s once the questions are properly defined and the right answers agreed upon. And that’s before you even get to the questions that are only relevant to you.

Finally, Cisco Intelligent Automation for Cloud (CIAC) looks like it has the potential to put a few people out of work, to say the least. The demonstration of LISP and OTV working together was very impressive, with a vMotion between data centres causing only a single ping packet to drop. What really stood out for me, though, was the self-service portal demonstration, which showed a brand new ESX host being deployed as production ready in less than 30 minutes with just a few clicks. In addition, a VM was deployed to another host with the correct network settings (both at the VM and network ‘pod’ level) and security settings applied. It looked like a lot of work to set up, but a dream to run.

I’m goosed and have another 3.5 days to get through. Luckily, the rest of the week’s sessions are shorter. Here’s to learning new things.

Till the next time.