With everyone moving more services to the cloud, why are we doing the opposite and bringing services and systems back in-house? In my previous post I mentioned cost as a key driver, but that is just one of many reasons we are looking to do this.
For many years, before DevOps or DevSecOps were even buzzwords, the Digital Innovation team within the library has been following the cultural practices these aim to bring together. Being a small team, it is very easy for us to follow them, and it makes supporting anything developed in-house easier, as we all share some level of knowledge of how services are developed, deployed and maintained.
When starting the DMAonline project with Jisc as part of the Research Data Shared Service (RDSS), which became the Open Research Hub, we had the opportunity to start using the AWS cloud platform, as this is where our application would eventually be deployed. This gave our development team its first experience of the public cloud. We spent a bit of time exploring all the services available to us, very much a "giving the kids all the toys from the toybox" moment, and were amazed at how simple and easy some of our tasks were becoming.
This may come as a shock to many, including my colleagues over in Information System Services (ISS), as I've been a rather vocal advocate of using the cloud, especially AWS, at Lancaster University, as well as preaching about its benefits at many HE developer conferences. The key thing to realise here is that whilst the cloud has an amazing range of services, more than could ever be provided in an on-premise private cloud platform, it does have some downsides for a small team working on a restricted budget.
As many of you in the higher education sector will be aware, budgets and the constraints on them are always a challenge to deal with. With the ever-increasing cost of our cloud services, we were looking for ways to reduce them without impacting the level of service we provide. Knowing that the compute platform at Lancaster University, based on VMware, was undergoing some changes and could easily handle our core virtual machine workloads, we decided to work with the Virtual Infrastructure Team over in ISS to evaluate what they could provide for us and which of our cloud services could be migrated easily.
Across our 15 AWS accounts we run a variety of machine types in the cloud, and in some cases these are over-resourced for the workloads they carry, whether in virtual CPUs, memory or available storage bandwidth. By migrating these virtual machine workloads onto the university's virtualisation platform we can specify machines with the correct balance of resources for the workloads they run. Given the breadth of services provided by the library, we also have varying storage needs, requiring both low-latency, high-speed systems and high-capacity ones. On AWS this can sometimes be a problem when you want lots of storage attached to a low-resourced machine. Thankfully this isn't an issue for us on-premise: we can use the central SAN for high-capacity storage, mainly for our digital assets, alongside the vSAN datastores for high-speed, low-latency storage and boot drives.
Often forgotten when moving services to the cloud is how to handle backups and ongoing support of the systems. By moving our core services into our on-premise datacenters we get daily backups of all our virtual machines and storage systems, alongside readily available support when there are issues with the underlying physical infrastructure and software. With all this provided without any extra effort from our team, we can focus on the services running on top and the experience for our users.
Whilst these discussions were going on, we were also looking for ways to make our infrastructure easier to manage, consolidating the many different operating systems, platforms and deployment mechanisms into a more common way of working. Having evaluated Kubernetes on the AWS platform, both on raw virtual machines and using the managed Elastic Kubernetes Service (EKS), we decided that, with some staff development and training in the team, it would provide a great platform for the majority of our future work and, over time, help us consolidate our sprawling virtual infrastructure estate.
With this in mind we settled on using the Rancher platform and RKE to manage our Kubernetes clusters, providing a consistent interface to each of our environments and compute clusters: no matter what environment, platform or service the team is developing on, the experience is the same. This is also a great win for inducting student staff onto the team, as we can easily give them access to the resources they require whilst providing them the experience of using a modern application deployment environment.
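To give a flavour of why this approach keeps things consistent: an RKE-managed cluster is described declaratively in a single `cluster.yml` file, so the same definition style applies whether the nodes are on-premise VMs or cloud instances. A minimal sketch might look like the following (the node addresses, user and roles here are purely hypothetical, not our actual setup):

```yaml
# cluster.yml - minimal RKE cluster definition (all values hypothetical)
nodes:
  - address: 10.0.0.10          # control-plane and etcd node
    user: rancher               # SSH user RKE connects as
    role: [controlplane, etcd]
  - address: 10.0.0.11          # worker node
    user: rancher
    role: [worker]
  - address: 10.0.0.12          # second worker node
    user: rancher
    role: [worker]
```

Running `rke up` against a file like this provisions the cluster over SSH and produces a kubeconfig, and the resulting cluster can then be managed through the Rancher interface alongside any others.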
So whilst we are moving the core of our compute workloads back to the on-premise datacenters at Lancaster University, some of our workloads will still run in the cloud, and we will keep looking at the services and opportunities that the public cloud, especially AWS, can provide for us, using them where appropriate.
Expect to hear more soon on how our migration is going and how we are setting up tools to help us manage our infrastructure and services.