Why Automation Matters : ITIL

ITIL is quite a divisive subject in the geek world. Once the subject is raised, most of us geeks start channelling our inner cowboy/cowgirl, thinking we don’t need the shackles of a formal process because we know what we are doing and don’t make mistakes. Then, when something goes wrong, everyone looks around saying, “I didn’t do anything!”

Despite how annoying it can seem at times, you need something like ITIL for a couple of reasons:

  • It’s easy to be blinkered. I see so many people who can’t see beyond their own goals, even if that means riding roughshod over other projects and the needs of the business. You need something in place to control this.
  • You need a paper trail. As soon as something goes wrong you need to know what’s changed. If you ask people you will hear a resounding chorus of “I’ve not changed anything!”, sometimes followed by, “… except…”. It’s a lot easier to get to the bottom of issues if you know exactly what has happened and in what order.

So what’s this got to do with automation? The vast majority of ITIL-related tasks I’m forced to do should be invisible to me. Imagine the build and deployment of a new version of an application to a development server. The process might look like this:

  • Someone requests a new deployment manually, or it is done automatically on a schedule or triggered by a commit.
  • A new deployment request is raised.
  • The code is pulled from source control.
  • The build is completed and the result of the build is recorded in the deployment request.
  • Automated testing is used to test the new build. Let’s assume it’s all successful for the rest of the list. The results of the testing are recorded in the deployment request.
  • Artefacts from the build are stored in some form of artefact store.
  • The newly built application is deployed to the application server.
  • The result of the deployment is recorded in the deployment request.
  • Any necessary changes to the CMDB are recorded.
  • The deployment request is closed as successful.

None of those tasks require a human. For a development server the changes are all pre-approved, and all the ITIL “work” is automated, so you have the full paper trail, even for your development servers.
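
To make that a bit more concrete, here’s a minimal sketch of what the automated “paper trail” part might look like. The raise_request, record_step and close_request commands are hypothetical wrappers around whatever ITIL/ticketing tool you use, and the build/deploy scripts are placeholders too; the point is simply that every step records itself in the deployment request without a human touching anything.

#!/bin/bash
# Hypothetical sketch: an automated deployment that records its own paper trail.
set -e

REQUEST_ID=$(raise_request "Deploy my_app to dev")               # raise the deployment request

git clone https://example.com/my_app.git && cd my_app
record_step "$REQUEST_ID" "Code pulled from source control"

./build.sh      && record_step "$REQUEST_ID" "Build successful"
./run_tests.sh  && record_step "$REQUEST_ID" "Automated tests passed"
./deploy.sh dev && record_step "$REQUEST_ID" "Deployed to the dev application server"
./update_cmdb.sh dev my_app                                      # record any CMDB changes

close_request "$REQUEST_ID" "Successful"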

It’s hard to be annoyed by ITIL if most of it is invisible to you! 🙂

IMHO the biggest problem with ITIL is bad implementation: over-complication, an emphasis on manual operations and a lack of continuous improvement. If ITIL is hindering your progress you are doing it wrong. The same could be said about lots of things. 🙂 One way of solving this is to automate the problem out of existence.

Cheers

Tim…

Why Automation Matters : Patching and Upgrading

As I said in a recent post, you know you are meant to, but you don’t. Why not?

The reasons will vary a little depending on the tech you are using, but I’ll divide this answer into two specific parts: the patch/upgrade process itself, and testing.

The Patch/Upgrade Process

I’ve lived through the bad old days of Oracle patching and upgrades and it was pretty horrific. In comparison things are a lot better these days, but they are still not what they should be in my opinion. I can script patches and upgrades, but I shouldn’t have to.  I’m sure this will get some negative feedback, but I think people need to stop navel gazing and see how simple some other products are to deal with. I’ll stop there…

That said, I don’t think patches and upgrades are actually the problem. Of course you have to be careful about limiting downtime, but much of this is predictable and can be mitigated.

One of the big problems is the lack of standardisation within a company. When every system is unique, automating a patch or upgrade procedure can become problematic. You have to include too much logic in the automation, which can make the automation a burden. What the cloud has taught us is you should try to standardise as much as possible. When most things are the same, scripting and automation get a lot easier. How do you guarantee things conform to a standard? You automate the initial build process. 🙂

So if you automate your build process, you actually make automating your patch/upgrade process easier too. 🙂
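
To give a feel for it, here’s a minimal sketch of a scripted database patch on a standardised build, using the usual OPatch workflow. The patch number and paths are made up, and the exact steps vary from patch to patch, so treat it as an outline rather than a recipe.

export ORACLE_SID=orcl
export ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/dbhome_1
export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/OPatch:$PATH

# stop the database before patching the home
sqlplus / as sysdba <<EOF
shutdown immediate;
exit;
EOF

# apply the patch (12345678 is a made-up patch number)
cd /u01/software/patches/12345678
opatch apply -silent

# restart the database and run the post-install SQL changes
sqlplus / as sysdba <<EOF
startup;
exit;
EOF

datapatch -verbose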

The app layer is a lot simpler than the database layer, because it’s far easier to throw away and replace an application layer, which is what people aim to do nowadays.

Testing

Testing is usually the killer part of the patch/upgrade process. I can patch/upgrade anything without too much drama, but getting someone to test it and agree to moving it forward is a nightmare. Spending time to test a patch is always going to lose out in the war for attention if there is a new spangly widget or screen needed in the application.

This is where automation can come to the rescue. If you have automated testing, not only can you move applications through the development pipeline quicker, but you can also progress infrastructure changes, such as patches and upgrades, much quicker too, as there will be greater confidence in the outcome of the process.
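
Even a basic smoke test that runs automatically after every patch gives you far more confidence than waiting for someone to find the time to test manually. The checks below are just examples of the kind of thing you might script.

# example post-patch smoke test: is the database open and are there any invalid objects?
sqlplus -s / as sysdba <<EOF
set heading off feedback off
select 'OPEN_MODE : ' || open_mode from v\$database;
select 'INVALID OBJECTS : ' || count(*) from dba_objects where status = 'INVALID';
exit;
EOF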

Conclusion

Patching and upgrades can’t be considered in isolation where automation is concerned. It doesn’t matter how quickly and reliably you can patch a database or app server if nobody is ever going to validate it is safe to progress to the next level.

I’m not saying don’t automate patching and upgrades, you definitely should. What I’m saying is it might not deliver on the promise of improved roll-out speed as a chain is only as strong as the weakest link. If testing is the limiting factor in your organisation, all you are doing by speeding up your link in the chain is adding to the testing burden down the line.

Having said all that, at least you will know your stuff is going to work and you can spend your time focusing on other stuff, like maybe helping people sort out their automated testing… 🙂

Cheers

Tim…

Why Automation Matters : Reliability and Confidence

In my previous post on this subject I mentioned the potential for human error in manual processes. This leads nicely into the subject of this post about reliability and confidence…

I’ve been presenting at conferences for over a decade. Right from the start I included live demos in those talks. For a couple of years I avoided them to make my life simpler, but I’ve moved back to them again as I feel in some cases showing something has a bigger impact than just saying it…

The Problem

One of the stressful things about live demos is they require something to run the demo on, and what happens if that’s not in the state you expect it to be?

I had an example of this a few years ago. I was in Bulgaria doing a talk about CloneDB and someone asked me a question at the end of the session, so I trashed my demo to allow me to show the answer to their question. I forgot to correct the situation, so when I came to do the same demo at UKOUG it went horribly wrong, which led someone on Twitter to say “session clone db is a mess”, and they were correct. It was. The problem here was I wasn’t starting from a known state…

This is no different for us developers and DBAs out in the real world. When we are given some kit, we want to know it’s in a consistent state, but it might not be for a few reasons.

Human Error

The system was created using a manual build process and someone made a mistake. I think almost every system coming out of a manual process has something screwed up on it. I make mistakes like this too. The phone rings, you get distracted, and when you come back to the original task you forget a step. You can minimise this with recipes and checklists, but we are human. We will goof up, regardless of the measures we put in place.

Sometimes it’s easy to find and fix the issue. Sometimes you have to step through the whole process again to identify the issue. For complex builds this can take a long time, and that’s all wasted time.

Changes During the Lifespan

The delivered system was perfect, but then it was changed during its lifespan. Here are a couple of examples.

App Server: Someone is diagnosing an issue and they change some app server parameters and forget to set them back. Those don’t fix the current issue, but they do affect the outcome of the next test. Having completed the testing successfully, the application gets moved to production and fails, because UAT and Live no longer have the same environment, so the outcomes are not comparable or predictable.

Database: Several developers are using a shared development database. Each person is trying to shape the data to fit their scenario, and in the process trashing someone else’s work. The shared database is only refreshed a handful of times a year, so these inconsistencies linger for a long time. If the setup of test data is not done carefully you can add logical corruptions to the data, making it no longer representative of a real situation. Once again the outcomes are not comparable or predictable.

The Solution?

I guess from the title you already know this. Automation.

Going back to my demo problem again, I almost had a repeat of this scenario at Oracle Code: Bangalore a few months ago. I woke up the day of the conference and did a quick run through my demos and something wasn’t working. How did I solve it? I rebuilt everything. 🙂

I do most of my demos using Docker these days, even for non-Docker stuff. I use Oracle Linux 7 and UEK4 as my base OS and kernel, so I run Docker inside a VirtualBox VM. The added bonus is I get a consistent experience regardless of underlying host OS (Windows, macOS or Linux). So what did the rebuild involve? From my laptop I just ran these commands.

vagrant destroy -f
vagrant up

I subsequently connected to the resulting VM and ran this command to build and run the specific containers for my demo.

docker-compose up

What I was left with was a clean build in exactly the condition I needed it to be in to do my demos. Now I’m not saying I wasn’t nervous, because not having working demos on the morning of the conference is a nerve-wracking thing, but I knew I could get back to a steady state, so this whole issue resulted in one line in the blog post for that day. 🙂 Without automation I would have been trying to find and fix the problem, or manually rebuilding everything under time pressure, which is a sure-fire way to make mistakes.
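
For context, the whole rebuild is really just those commands strung together into one sequence, something like this. The directory names are specific to my setup and purely examples.

# rebuild the VM from scratch (the directory holding the Vagrantfile is just an example)
cd ~/vagrant/oracle-demo
vagrant destroy -f
vagrant up

# build and start the demo containers inside the new VM
vagrant ssh -c "cd /vagrant && docker-compose up -d"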

I do some demos on Oracle Database Cloud Service too. When I recently switched between trial accounts my demo VM was lost, so I provisioned a new 18c DBaaS, uploaded a script and ran it. Setup complete.

Confidence

Automation is quicker. I think we all get that. Having a reliable build process means you have the confidence to throw stuff away and build clean at any point. Think about it.

  • Developers replacing their whole infrastructure whenever they want. At a minimum once per sprint.
  • Deployments to environments not just deploying code, but replacing the infrastructure with it.
  • Environments fired up for a single purpose, maybe some automated QA or staff training, then destroyed.
  • When something goes wrong in production, just replace it. You know it’s going to work because it did in all your other environments.

Having reliable automation brings with it a greater level of confidence in what you are delivering, so you can spend less time on unplanned work fixing stuff and focus more on delivering value to the business.

Tooling

The tooling you choose will depend a lot on what you are doing and what your preferences are. For example, if you are focusing on the RDBMS layer, it is unlikely you will choose Docker for anything other than little demos. For some 3rd party software it’s almost impossible to automate a build process, so you might use gold images as your starting point or partially automate the process. In some cases you might use the cloud to provide the automation for you. The tooling is less important than the mindset in my opinion.

Cheers

Tim…

Why Automation Matters : Lost Time

Sorry for stating what I hope is the obvious, but automation matters. It’s mattered for a long time, but the constant mention of Cloud and DevOps over the last few years has thrown even more emphasis on automation.

If you are not sure why automation matters, I would just like to give you an example of the bad old days, which might be the current time for some who are still doing everything manually, with separate teams responsible for each stage of the process.

Lost Time : Handover/Handoff Lag

In the diagram below we can see all the stages you might go through to deploy a new application server. Every time the colour of the box changes, it means a handover to a different team.

So there are a few things to consider here.

  • Each team is likely to have different priorities, so a handover between teams is not necessarily instantaneous. The next stage may be waiting on a queue for a long time. Potentially days. Don’t even get me started on things waiting for people to return from holiday…
  • Even if an individual team has created build scripts and has done their best to automate their tasks, if the process relies on someone picking the task off a queue to initiate it, there will still be a handover delay.
  • When things are done manually people make mistakes. It doesn’t matter how good the people are, they will mess up occasionally. That is why the diagram includes a testing failure, and the process being redirected back through several teams to diagnose and fix the issue. This results in even more work. Specifically, unplanned work.
  • Manual processes are just slower. Running an installer and clicking the “Next” button a few times takes longer than just running a script. If you have to type responses and make choices it’s going to take even more time, and don’t forget that bit about human error…

Let’s contrast this to the “perfect” automated setup, where the request triggers an automated process to deliver the new service.

In this example, the request initiates an automated workflow that completes the action and delivers the finished product without any human intervention along the path. The automation takes as long as it takes, and ultimately has to do most of the same work, but there is no added handover lag in this process.
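
To make that concrete, the automated version is essentially each team’s scripted step chained together and triggered by the request. Here’s a very rough sketch, where every helper script is a made-up stand-in for one team’s automated task.

#!/bin/bash
# Hypothetical end-to-end workflow for a "new app server" request.
set -e
REQUEST_ID=$1

./provision_vm.sh        "$REQUEST_ID"   # OS build
./configure_storage.sh   "$REQUEST_ID"
./install_app_server.sh  "$REQUEST_ID"   # silent install of the app server
./deploy_application.sh  "$REQUEST_ID"
./run_smoke_tests.sh     "$REQUEST_ID"   # a failure here stops the workflow (set -e)
./update_cmdb.sh         "$REQUEST_ID"
./close_request.sh       "$REQUEST_ID"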

I think it’s fair to say you would be expecting a modern version of this process to complete in a matter of minutes, but I’ve seen the manual process take weeks or even months, not because of “work time”, but because of the idle handover time and human processes involved…

They Took Our Jobs!

At first glance it might seem like this is a problem if you are employed in any of the teams responsible for doing the manual tasks. Surely the automation is going to mean job cuts, right? That depends really. In order to fully automate the delivery of any service you are going to have to design and build the blocks that will be threaded together to produce the final solution. This is not always simple. Depending on your current setup this might mean having fewer, more highly skilled people, or it might require more people in total. It’s impossible to know without knowing the requirements and the current staffing levels. Also, the cloud provides a lot of the building blocks for you, so if you go that route there may be less work to do in total.

Even if the number of people doesn’t change as part of the automation process, you are getting work through the door quicker, so you are adding value to the business at a higher rate. From a DevOps perspective you have not added value to the business until you’ve delivered something to them. All the hours spent getting part of the build done equate to zero value to the business…

But we are doing OK without automation!

No you’re not! You’re drowning! You just don’t know it yet!

I never hear people saying they haven’t got enough projects waiting. I always hear people saying they have to shelve things because they don’t have the staff/resources/time to do them.

As your processes get more efficient you should be able to reallocate staff to projects that add value to the business, rather than wasting their lives on clicking the “Next” button.

If your process stays inefficient you will always be saying you are short of staff and every new project will require yet another round of internal recruitment or outsourcing.

Is this DevOps?

I’m hesitant to use the term DevOps as it can be a bit of a divisive term. I struggle to see how anyone who understands DevOps can’t see the benefits, but I think many people don’t know what it means, and without that understanding the word is useless…

Certainly automation is one piece of the DevOps puzzle, but equally if you have company resistance to the term DevOps, feel free to ignore it and focus on trying to sell the individual benefits of DevOps, one of which is improved automation…

Cheers

Tim…

APEX 18.1 Docker Builds Updated

You’ve probably seen that APEX 18.1 was released recently. This is just a quick note to say I’ve updated my Docker builds to include the latest versions of all the software including APEX. You can find the builds here.

https://github.com/oraclebase/dockerfiles

I always install APEX into every database, so the database builds include APEX and the ORDS build includes the APEX images.
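
If you want to try them, the build process is just the standard Docker one. The subdirectory name below is a placeholder, so check the repository for the actual layout and for any installation media you need to download first.

git clone https://github.com/oraclebase/dockerfiles.git
cd dockerfiles/some_build_directory    # placeholder: pick the build you're interested in
docker build -t my_image:latest .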

Remember, I’m not saying you should use these, but if you like to play around with Docker you might find them useful, along with my Docker articles here.

Regardless of how you like to use APEX, get on board with APEX 18.1… 🙂

Cheers

Tim…

Docker and Oracle Databases : Finding the Sweet Spot

One of the questions I’m asked, and indeed have asked myself on numerous occasions, is how do databases fit into the Docker world? More specifically, how does the Oracle database fit into the Docker world?

There is some Boring Context at the end of the post. Happy for you to ignore it, but please read it before giving me aggro. 🙂

I don’t feel the typical lifespan of a production Oracle database fits well into the Docker world. Lots of people, myself included, have been quick to show examples of Oracle databases running on Docker, because it’s really easy and it works, but the examples are all one-off builds. There is little in the way of realistic life-cycle discussion. Maybe I’ve missed the memo on that, or I’m following the wrong people. 🙂

The issues with Oracle on Docker come about when you try to do things “the Docker way”, which works great for application servers and small-footprint databases with a simpler approach to patches and upgrades, but not so well for Oracle databases. In my opinion Oracle database patches and upgrades are clumsy from a Docker perspective. I wrote about this here. I know at least one person I respect who disagrees with my opinion on this (you know who you are 🙂 ). You can choose manual intervention in the upgrade process, but I feel like that is removing some of “the magic of Docker”. It’s well within Oracle’s capability to make database patching and upgrades more Docker-friendly, but time will tell if they think it is worth the effort.

From a monitoring and tuning perspective I’m not sure Docker really fits into what most Oracle DBAs are used to either. If Oracle start making DBA tools that are a bit more “Docker aware”, that could change of course. Once again, it depends on their priorities.

So what’s the sweet spot for Oracle databases on Docker? Well I think it’s great from a Dev, Test and QA perspective. I had a great example of this last night. I was doing some APEX patching (5.1.3 to 5.1.4) at work yesterday and hit a problem. I went home and tested the APEX upgrade (5.1.3 to 5.1.4) on four environments and they all worked.

  1. Oracle Database 12.1.0.2 using the multitenant architecture with APEX installed in a PDB.
  2. Oracle Database 12.1.0.2 using the non-CDB architecture.
  3. Oracle Database 12.2.0.1 using the multitenant architecture with APEX installed in a PDB.
  4. Oracle Database 12.2.0.1 using the non-CDB architecture.

I never use non-CDB architecture at home anymore, so I had to spend about 30 seconds altering two Dockerfiles for that, but the process of getting clean environments up and running was quick and simple. It was so quick in fact that when I finished I had time to update my database Dockerfiles to install APEX 5.1.4, and update my ORDS Dockerfile to use APEX 5.1.4 and Java 8u161. All built and tested.
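
For anyone who hasn’t played with this, each of those tests was just a throwaway container. Roughly this, with made-up image and container names.

# build an image from the relevant Dockerfile, run it, test against it, throw it away
docker build -t ol7_122_apex .
docker run -d --name apex_test -p 1521:1521 ol7_122_apex
# ... connect and run the APEX 5.1.3 to 5.1.4 upgrade test ...
docker rm -f apex_test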

I’ve been using virtualization for well over a decade and I have gold images, snapshots and build scripts coming out of my ears, but Docker is head and shoulders above everything else for pushing out these small short-lived clean environments.

A while back I watched an online presentation by Seth Miller about RAC on Docker. I guess some of you are thinking WTF at this point, which was pretty much how I felt before watching it. 🙂 Watch the presentation and see why it makes sense for Veritas to invest effort in this from a Dev/Test perspective. They have a use case that fits.

The important thing is to focus on the use cases where it adds value, rather than the typical “Emperor’s New Clothes” approach assuming it’s all things to all people. Sounds obvious, but it is so often lost on the IT world. 🙂

So in conclusion, I remain sceptical about me ever running an Oracle database on Docker for production, but I can see a bunch of situations where it is useful in the Dev, Test and QA space, where you need relatively small, clean, well-defined environments for short periods of time. That seems like the sweet spot to me at this point, but I reserve the right to change my mind over time. 🙂

I’m really interested to know what you think, especially if you think I’ve missed the point. I am still a Docker newbie. Like I said earlier, read the Boring Context before jumping to conclusions though. 🙂

Cheers

Tim…

Boring Context: I feel I need to make a few points so people don’t pounce on me. 🙂

  • I am talking about databases in the sense of the monolith running on Docker. The microservice approach where the data is self-contained as part of the microservice is different.
  • I’m not trying to make out every database engine has the same set of issues, although I suspect there will be some commonality when you start talking about long-term production usage.
  • There’s nothing wrong with using Docker like lightweight virtualization and doing manual installations, patches, upgrades and maintenance inside your container. It’s going to bloat the hell out of it, and it’s not how I want to use it, but you can choose to do that if you want. The Docker police will not arrest you. 🙂
  • There’s nothing wrong with you going off piste with how you organise your builds and images. You can have separate Dockerfiles per database or use build arguments rather than first-run configuration based on environment variables. Once again, you are going to waste lots of storage and that’s not how I like to play the game, but that is your choice.
  • I know that Docker was developed with application delivery in mind and databases probably weren’t at the forefront of the developers’ minds when it started. I hope it stays that way!
  • Some of the same issues exist when dealing with complex web applications like OBIEE. Yes, you can install them on Docker for a demo, but does it really work for the normal lifespan of that product?
  • I realise well planned virtualization can have many of the benefits of Docker, but it feels a little heavier in comparison. Each to their own.
  • I realise there are products out there for data virtualization (like Delphix), that are capable of doing some cool things for Dev, Test and QA environments. If you are using something like that, that’s great for you. 🙂
  • Docker is changing fast. My thoughts might be very different over time as Docker develops.
  • I realise there are a lot of “we just need to keep the lights on” databases that are small and don’t have extreme requirements. Maybe I’ll find Docker to be a good home for these. Maybe not.
  • I’m not a Docker hater. In fact I’m quite the opposite.

Docker : My First Steps

In a blog post after OpenWorld I mentioned I might not be writing so much for a while as something at work was taking a lot of my “home time”, which might result in some articles, but then again might not… Well, that something was Docker…

After spending a couple of years saying I was going to start looking at Docker, in June I wrote a couple of articles, put them on the website, but didn’t mention them to anyone.  I was finding it quite hard to focus on Docker because of all the fun I was having with ORDS. More recently it became apparent that we have a couple of use-cases for Docker at work, one of which involved ORDS, so it reignited my interest. There’s nothing like actually needing to use something to make you knuckle down and learn it… 🙂

Having gone back to revisit Docker, I realised the two articles I wrote were terrible, which wasn’t surprising considering how little time I had spent using Docker at that point. The more I used Docker, the more I realised I had totally missed the point. I had come to it with too many preconceptions, mostly relating to virtualization, that were leading me astray. I reached out to a few people (Gerald Venzl, Bruno Borges & Avi Miller) for help and advice, which got me back on track…

I’ve been playing around with Docker a lot lately, which has resulted in a few articles, with some more on the way. I’m not trying to make out I’m “the Docker guy” now, because I’m clearly not. I’m not suggesting you use my Docker builds, because there are better ones around, like these. I’m just trying to learn this stuff and I do that by playing and writing. If other people find that useful and want to follow me on the journey, that’s great. If you prefer to go straight to the source (docs.docker.com) that’s probably a better idea. 🙂

I do a lot of rewrites of articles on my website in general. This is especially true of these Docker articles, which seem to be in a permanent state of flux at the moment. Part of me wanted to wait until I was a little more confident about it all, because I didn’t want to make all my mistakes in public, then part of me thought, “sod it!”

If you want to see what I’ve been doing, all the articles are on my website and the Dockerfiles are on GitHub.

I’m having a lot of fun playing around with Docker. You could say, I’m having a “whale” of a time! (I’ll get my coat…)

Cheers

Tim…

ODC Appreciation Day : Silent Installation and Configuration (Automation) : #ThanksODC

Here is my entry for the Oracle Developer Community ODC Appreciation Day (#ThanksODC).

I’ve been mentioning automation a lot recently, both in relation to the cloud and on-prem. The OpenWorld announcements about the Autonomous Database service are not the first thing Oracle has done to ease automation of repetitive tasks. In fact, Oracle has quite a long history of making automation of installation and configuration easy.

I’m not sure what version introduced silent installations of the database, but I first wrote about them when using Oracle 9i (here), with the article changing a lot over the years. In addition to making installations faster, more repeatable and less error prone, they are also important these days if you are using a cloud provider for Infrastructure as a Service (IaaS), since using X emulation to perform tasks can be super-slow. Over the years I’ve also written about silent installations of WebLogic, Oracle Forms, ODI and OBIEE to name but a few.
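
As a simple example, a silent installation of the database software boils down to a pre-prepared response file and a single command. The paths and file names here are just examples.

# db_install.rsp is a response file prepared in advance (example paths)
cd /u01/software/database
./runInstaller -silent -waitforcompletion -responseFile /u01/software/db_install.rsp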

In addition to installations, Oracle has made silent configuration possible too. Running the Database Configuration Assistant (DBCA) in silent mode is pretty simple (here). The WebLogic Scripting Tool (WLST) is not always easy, but it is a really powerful way to script build processes for WebLogic servers (here). If you are using Enterprise Manager Cloud Control, you will find an API for pretty much everything, allowing you to script using EMCLI (here).
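
For example, a minimal silent database creation might look something like this. The names, passwords and locations are obviously just examples.

dbca -silent -createDatabase \
  -templateName General_Purpose.dbc \
  -gdbname cdb1 -sid cdb1 \
  -createAsContainerDatabase true \
  -sysPassword MyPassword1 \
  -systemPassword MyPassword1 \
  -datafileDestination /u02/oradata \
  -storageType FS \
  -characterSet AL32UTF8 \
  -totalMemory 2048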

You can find a number of articles I’ve written related to silent installation and configuration using the links above, or grouped under this section of my website.

A good knowledge of this subject is important if you want to start checking out Docker, because you will be doing silent builds and configuration for everything.

When you are learning something new it is nice to use GUI screens, as they often feel a little simpler at first and sometimes give you a little more context about what you are doing. Once you’ve covered the basics you should really switch to scripting, as it will make you more efficient. When I first started to manage WebLogic servers I resisted the switch to using WLST for quite some time. It seemed a little complicated and I was in denial until Lonneke Dikmans persuaded me to try it. Once I got into it I never looked back! 🙂

To summarise, the advantages of scripting your installations and configuration are:

  • Faster.
  • More reliable.
  • More repeatable.
  • Work fine on the cloud and in Docker.
  • Easily maintainable and can be version controlled.

If you’re not using this stuff already, do yourself a favour and give it a go. You will thank yourself!

Cheers

Tim…