Database Patching : It’s a difficult subject

If you came hear hoping I was going to say there are valid reasons not to patch, you are out of luck. There is never a valid reason not to patch…

Instead this post is more about the general approach to patching. I’ve spent 22+ years writing about Oracle, including how to install it, but I’ve written practically nothing about how to patch a database. My stock answer is “read the patch notes”, and to be honest that is probably the best thing anyone can do. Although patching is a lot more standardized these days, it’s still worth reading the patch notes in case something unexpected happens. In this post I just want to talk about a few top-level things…

Patching to a new ORACLE_HOME

There are two big reasons for patching to a new ORACLE_HOME, or out-of-place patching.

  1. You can apply the binary patches to the new home while the database is still running in the old home, so you reduce the total amount of downtime.
  2. You have a natural fallback in the event of the wanting to revert the patch. You don’t have to wait for the patch rollback to complete.

There are some downsides though.

  1. It requires extra space to hold both the unpatched and patched homes, until you reach a point where you are happy to remove the unpatched home.
  2. If you have any scripts that reference the ORACLE_HOME, they will need to be updated. Hopefully you’ve centralized this into a single environment setup script.
  3. I guess it’s a little more complicated, and the patch notes are not that helpful.

So should you follow the recommendation of patching to a new home or not? The answer as always is “it depends”.

The reduction in downtime for a single instance database is good, but if you are running RAC or Data Guard, this isn’t really an issue as the database remains online for most of the patching anyway. Having a quick fallback is great, but once again if you are running RAC or Data Guard this isn’t a big deal.

If you are running without RAC or Data Guard, you have made a decision that you can tolerate a certain level of downtime, so is taking the system down for an hour every quarter that big a deal? I’ve heard of folks who use RAC and/or Data Guard who still bring the whole system offline to patch, so the decision is probably going to be very different for people, depending on their environment and the constraints they are working with.

I hope you’re taking OS and database backups before patching. If something catastrophic happens, such that a rollback of the patch is not possible, you can recover your original home and database from the backups. Clearly this could take a long time, depending on how your backups are done, but the risk of loss is low. So the question is, can you tolerate the additional downtime?

You have to make a decision on the pros and cons of each approach for you, and of course deal with the consequences. If in doubt, go with the recommendation and patch to a new home.

Read-only Oracle homes

Read-only Oracle homes were introduced in 18c (here) as an option, and are the default from Oracle 21c onward. One of the benefits of read-only Oracle homes is they make switching homes so much easier. You haven’t got to worry about copying configuration files between homes, as they are already located outside the home.

Release Update (RU) or Release Update Revision (RUR)?

You have a choice between patching using a Release Update (RU), or a Release Update Revision (RUR). To put it simply, a RU contains not only the latest security patches and regression fixes, but may also include additional functionality, so the risk of introducing a new bug is higher. A RUR is just the security patches and regression fixes. Unlike the Critical Patch Updates (CPUs) of the past, that ran on endlessly, RURs are tied to specific RUs, so you will end up applying the RUs, but at a later date, when hopefully the bugs have been sorted by the RUR…

The folks at Oracle suggest applying the RUs, which is what I (currently) do. Some in the Oracle community suggest applying RURs is the safer strategy. If you look at the “Known Issues” for each RU, and the list of recommended one-off patches that should be applied after the RU, you can see why some people are nervous of going directly to RUs.

Once again, this comes down to you and your experience of patching with the feature set you use. If you are finding RUs are too problematic, go with the RUR approach. You can always change your mind at any time…

Monthly Recommended Patches (MRPs)

There’s a new kid on the block starting with 19.17 on Linux, which are monthly recommended patches (MRPs). They replace RURs. There are 6 MRPs per RU, with each MRP containing the RU and the current batch of recommended one-off patches, as documented in MOS Note 555.1.

I’m assuming these are rolling and standby-first patches, but I can’t confirm that yet.

RAC Patching : Rolling Patches

Rolling patches can be applied one node at a time, so there are always database instances running, which means the database remains available for the whole of the patching process.

Release Updates (RUs) and Release Update Revisions (RURs) are always rolling patches, so it makes sense to take advantage of this approach. If you are applying one-off patches, these may not be rolling patches, so always check the patch notes to make sure.

Even when rolling patches are available, you can still make the decision to take the whole system offline to apply the patches. I’m not sure why you would want to do this, but the option is there for you.

Data Guard : Standby-First Patches

Release Updates (RUs) and Release Update Revisions (RURs) are always standby-first patches. This gives you some flexibility on how you approach patching your system. Here are two scenarios with a two node Data Guard setup, where node 1 is the primary and node 2 is the standby.

Scenario 1 : Switchovers

  • Patch the node 2 binaries (not datapatch) and bring the standby back into recovery mode.
  • Switchover roles, making the node 2 the primary and node 1 the standby.
  • Patch the node 1 binaries (not datapatch) and bring the standby back into recovery mode.
  • Run datapatch against node 2 (the primary database).
  • Optionally switchover roles making node 1 the primary database again.

Scenario 2 : No switchovers

  • Patch the node 2 binaries (not datapatch), but don’t start the standby.
  • Patch the node 1 binaries (not datapatch) and start the database.
  • Start the standby on node 2.
  • Run datapatch on node 1 (the primary).

Scenario 1 reduces downtime, as the primary is always running while the standby is having the binaries patched. Scenario 2 is simpler, but has a more extensive downtime as the primary is out of action while the binaries are being patched.

Remember, one-off patches may not be standby-first patches, so you may only have the option of scenario 2 when applying them. You have to read the patch notes.

OJVM Patching : Which approach?

Oracle 19c and 21c have simplified the OJVM patching situation. Previously the OJVM patches were completely separate. The grid infrastructure (GI) and database patches now include the OJVM patches from the previous quarter, and the latest OJVM patches are still shipped as a separate patch. So now we have a choice.

  • Only apply the GI/DB patches, and remain one quarter behind on the OJVM patches.
  • Apply both the GI/DB and OJVM patches and be up to date on both.

The separate OJVM patches come with additional restrictions. They are not standby-first patches, and according to the patch notes, they can only be applied as RAC rolling patches if you use out-of-place patching.

Personally I would say just apply the GI/DB patches and forget about the separate OJVM patches. Life is much simpler that way, and being three months behind on the OJVM patches is not the worst thing in the world.

Why don’t you write about patching much?

Writing about patching is difficult, because everyone has a unique environment, and their own constraints placed on them by their business. I’ve always avoided writing too much about patching because I know it’s opening myself up for criticism. Whatever you say, someone will always disagree because of their unique situation, or demand yet another patching scenario because of their unique environment. You’re damned if you do, and damned if you don’t.

I’ve recently written a few patching articles for specific scenarios (here). I may add some more, but it’s not going to be a complete list, and don’t expect me to write articles about stuff I don’t use, like Exadata. These are purely meant as inspiration for new people. Ultimately, you need to read the patch notes and decide what is best for you!

Let the cloud do it!

If all this is too much hassle, you do have the option of moving your database to the cloud and letting them worry about patching it. πŸ™‚

Conclusion

Read the patch notes!

Cheers

Tim…

Database Patching Revisited : Take off and nuke the entire site from orbit…

I was reading a post by Pete Finnigan the other day.

I put out a tweet mentioning it, and linking to one of my old posts on the subject too.

This started a bit of a debate on Twitter about how people patch their databases. In this post I want to touch on a few points that came out Pete’s post an some of the other Twitter comments.

You have to have a plan!

An extremely important point made by Pete was you have to have a plan. That doesn’t have to be the same for everyone, and there may be compromises due to constraints in your company, but that doesn’t stop you making a plan. Your plan might be:

  • We will start a new round of patching immediately when a new on-off patch is released, and every quarter with the security announcements. I can’t see how this is possible.
  • We will patch every quarter with the security announcements. That’s what my company does.
  • We will patch once per (six months, year etc.)

Hopefully your plan will not be:

  • We will never patch and person X will take the blame when we have a problem.

Release Updates (RUs) or Release Update Revisions (RURs)

Database quarterly patches are classified as release updates (RUs) and release update revisions (RURs). First let’s explain what they are.

  • Release Updates (RUs) : These are like the old proactive bundle patches. They contain bug fixes, security fixes and limited new features. Let’s call that “extra stuff”. In 19c the blockchain tables and immutable tables features were introduced in RUs. Backporting and new features can introduce new risks.
  • Release Update Revisions (RURs) : These are just bug fixes and security fixes. In theory these are safer than RUs as less new stuff is introduced, but… See below.

So from first glance you are saying to yourself I want the safest option, so I want to go for RURs. The problem is RURs aren’t like the old security patches that you could continue applying forever. Ultimately you have to include all the “extra stuff” from the previous RUs, but you get the option of doing it later. This page in the documentation explains things quite well.

This table from that link is quite useful, showing you what version you will be on during a quarterly patching cycle.

What does this mean?

  • If you patch using the RUs, you are going to the latest and greatest each quarter.
  • If you use RUR-1, you are constantly 1 quarter behind on the RUs extra content, but you add in the missing bug fixes and security fixes using the RUR-1 patch.
  • If you use RUR-2, you are constantly 2 quarters behind on the RUs extra content, but you add in the missing bug fixes and security fixes using the RUR-2 patch.

In all cases you have the latest bug fixes and security fixes. You are just delaying getting the “extra bits”. So at first glance it seems like you might as well go with the RUs. The issue is some of the RUs are a bit buggy. If you go for the RUR-1 or RUR-2 there is a chance the bugs introduced in the base RU have been fixed in the subsequent RURs for that RU. So we could say this.

  • RUs: Oracle have zero time to identify and fix the bugs they’ve introduced in the RU.
  • RUR-1: Oracle have 3 months to find and fix the bugs they’ve introduced in the base RU.
  • RUR-2: Oracle have 6 months to find and fix the bugs they’ve introduced in the base RU.

I tend to stick with the RUs, although I am considering changing. Ilmar Kerm said he’s found RUs too buggy and tends to stick with the RUR-1 approach. I guess a more conservative approach would be to stick with the RUR-2 approach.

Your experience of the RUs verses the RURs will depend on what features you use, what extra stuff Oracle decide to include in the RU and what they break by including that extra stuff. The biggest problem I got was 19.10 breaking hot-cloning of PDBs, which was kind-of important. If I had used the RUR-1 approach I would never have seen that issue. Different people using different features see different bugs.

How good is your testing?

The biggest factor in the decision of which approach to take is probably the quality of your testing.

  • If your testing of applications against new patches is good, you can probably stick with the RUs. If the RU fails testing, go with the RUR-1 that quarter.
  • If you just work on the “generally considered safe” approach, meaning you apply the patches and don’t do any testing, maybe you should be using the RUR-1 or RUR-2 approach!
  • The ultra-conservative approach would be to stick with the RUR-2 approach.

Just patch!

Regardless of which approach you take, you’ve got to have a plan, and you should be patching. I know some of you don’t care about patching, and you are fools. I know some of you would like to patch, but your companies are dinosaurs. All I can say to you is keep trying.

In my current company we never used to patch. I spent years sending out quarterly reports summarising all the vulnerabilities in our systems and still nothing. Eventually a few other people jumped on the bandwagon, we had a couple of embarrassing issues, and the constant threat of GDPR gave us some more leverage. Now we have a quarterly patching schedule for all our databases and middle tier servers. We are not perfect, but it can be done.

Even now, we still have questions like, “can we miss out this quarter?”, but we push back very hard against this. One quarter becomes two, becomes three, becomes never.

New patches on the 20th July (see here). Good luck everyone!

Cheers

Tim…

PS. If you are not patching externally facing WebLogic servers you might as well close your company now. You have already given all your data away. Good luck with that GDPR fine…

Why Automation Matters : Patching and Upgrading

As I said in a recent post, you know you are meant to, but you don’t. Why not?

The reasons will vary a little depending on the tech you are using, but I’ll divide this answer into two specific parts. The patch/upgrade process itself and testing.

The Patch/Upgrade Process

I’ve lived through the bad old days of Oracle patching and upgrades and it was pretty horrific. In comparison things are a lot better these days, but they are still not what they should be in my opinion. I can script patches and upgrades, but I shouldn’t have to.  I’m sure this will get some negative feedback, but I think people need to stop navel gazing and see how simple some other products are to deal with. I’ll stop there…

That said, I don’t think patches and upgrades are actually the problem. Of course you have to be careful about limiting down time, but much of the this is predictable and can be mitigated.

One of the big problems is the lack of standardisation within a company. When every system is unique, automating a patch or upgrade procedure can become problematic. You have to include too much logic in the automation, which can make the automation a burden. What the cloud has taught us is you should try to standardise as much as possible. When everything most things are the same, scripting and automation gets a lot easier. How do you guarantee things conform to a standard? You automate the initial build process. πŸ™‚

So if you automate your build process, you actually make automating your patch/upgrade process easier too. πŸ™‚

The app layer is a lot simpler than the database layer, because it’s far easier to throw away and replace an application layer, which is what people aim to do nowadays.

Testing

Testing is usually the killer part of the patch/upgrade process. I can patch/upgrade anything without too much drama, but getting someone to test it and agree to moving it forward is a nightmare. Spending time to test a patch is always going to lose out in the war for attention if there is a new spangly widget or screen needed in the application.

This is where automation can come to the rescue. If you have automated testing not only can you can move applications through the development pipeline quicker, but you can also progress infrastructure changes, such as patches and upgrades, much quicker too, as there will be a greater confidence in the outcome of the process.

Conclusion

Patching and upgrades can’t be considering in isolation where automation is concerned. It doesn’t matter how quick and reliably you can patch a database or app server if nobody is ever going to validate it is safe to progress to the next level.

I’m not saying don’t automate patching and upgrades, you definitely should. What I’m saying is it might not deliver on the promise of improved roll-out speed as a chain is only as strong as the weakest link. If testing is the limiting factor in your organisation, all you are doing by speeding up your link in the chain is adding to the testing burden down the line.

Having said all that, at least you will know your stuff is going to work and you can spend your time focusing on other stuff, like maybe helping people sort out their automated testing… πŸ™‚

Check out the rest of the series here.

Cheers

Tim…

The patching nightmares are over (11.2.0.2.0)…

One of the things that continually annoys me is that to get the latest version of the database you have to install the base release and then instantly patch it to the latest patch set. Not any more.

“Starting with the 11.2.0.2 patch set, Oracle Database patch sets are full installations of the Oracle Database software. This means that you do not need to install Oracle Database 11g Release 2 (11.2.0.1) before installing Oracle Database 11g Release 2 (11.2.0.2).”

You don’t understand how happy this makes me. In addition, the installer also downloads and applies madatory patches, so even when you’re mid-way through the lifecycle of a patchset, your new installations are still up to date. πŸ™‚

There is a bunch of new functionality already listed in the new features manual:

Happy downloading and upgrading.

Notes.

  • Read the patch notes before you start downloading. You probably don’t need all the zip files (4.8G). πŸ™‚
  • Out-of-place patching (new ORACLE_HOME) is the recommended method now, so there is no real difference between patch sets and upgrades. Grid infrastructure *must* be patches out-of-place.
  • I guess OFA directories should now include the first 4 digits of the version (11.2.0 -> 11.2.0.2) as those directories will only ever contain that patch set.

Cheers

Tim…