PXE Installations on RHEL6 / OL6

I spent yesterday neatening up a few old articles. For the most part it is a bit of a dull process, but it has to be done every so often.

With what’s going on at work, it seemed like a good idea to bring my old Kickstart and PXE Installation articles up to date. My Kickstart article was written in the RHEL3 era, so it was overdue a refresh. Nothing has really changed about the process, but some new screen shots from OL6 make it look a little fresher. My old PXE Installation article was written against RHEL5/OL5, so I figured things wouldn’t have changed much between that and RHEL6/OL6… Wrong! I ended up having to write a new article specifically for PXE Installations on RHEL6/OL6.

I think that’s enough of me pretending to be a Linux sysadmin for a while… 🙂

Cheers

Tim…

scsi_id and UDEV issues (update)…

Last month I wrote about a problem I saw with scsi_id and UDEV in OL5.8. As it screwed up all my UDEV rules, it was a pretty important issue for me. It turned out this was due to a mainline security fix (CVE-2011-4127) affecting the latest kernels of both RHEL/OL5 and RHEL/OL6. The comments on the previous post show a couple of workarounds.

Over the weekend I started to update a couple of articles that mentioned UDEV rules (here and here) and noticed the problem had disappeared. I updated two VMs (OL5.8 and OL6.2) with the latest changes, including the UEK updates, ran the tests again, and here’s what I got.

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
# uname -r
2.6.39-100.6.1.el5uek
# scsi_id -g -u -s /block/sda/sda1
SATA_VBOX_HARDDISK_VB535d493d-7a44eb0f_
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
# uname -r
2.6.39-100.6.1.el6uek.x86_64
# /sbin/scsi_id -g -u /dev/sda1
1ATA_VBOX_HARDDISK_VB2b5dc561-4ae6e154
#

So it looked like normal service had been resumed. 🙂 Unfortunately, the MOS Note 1438604.1 associated with this issue is still not public, so I couldn’t tell if this was a unilateral change in UEK, or part of a mainline fix for the previous change.

To check, I fired up a CentOS 6.2 VM with the latest kernel updates, switched an Oracle Linux VM to the latest RHEL-compatible kernel, and ran the test on both. As you can see, neither of them reports the scsi_id for partitions.

# cat /etc/redhat-release
CentOS release 6.2 (Final)
# uname -r
2.6.32-220.13.1.el6.x86_64
# /sbin/scsi_id -g -u /dev/sda1
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
# uname -r
2.6.32-220.13.1.el6.x86_64
# /sbin/scsi_id -g -u /dev/sda1
#
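
The check above can be wrapped in a small function so it is easy to repeat across VMs. This is only a sketch: the function name `report_scsi_id` and the `SCSI_ID_CMD` override are made up for illustration, and it assumes the OL6-style invocation shown above (`-g -u` followed by the device node).

```shell
#!/bin/sh
# SCSI_ID_CMD is a hypothetical override so the check can be pointed at a
# different binary. It defaults to the path used in the transcripts above.
SCSI_ID_CMD="${SCSI_ID_CMD:-/sbin/scsi_id}"

# Print the ID reported for a device, or a note when nothing comes back.
# Uses the same OL6-style flags as the commands above.
report_scsi_id() {
    id=$("$SCSI_ID_CMD" -g -u "$1" 2>/dev/null) || id=""
    if [ -n "$id" ]; then
        echo "$1: $id"
    else
        echo "$1: no ID reported"
    fi
}

# Example (run as root on the VM being checked):
#   report_scsi_id /dev/sda
#   report_scsi_id /dev/sda1
```

Running it against both the disk and the partition makes it obvious whether a given kernel is affected.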

It could be the associated fix has not worked through the mainline to RHEL and CentOS yet. I’ll do a bit of digging around to see what is going on here.

Cheers

Tim…

Update: It appears the reversion of this functionality may not be permanent, so I’ve updated my articles to use a “safer” method of referencing the parent (disk) device, rather than the partition device.

Oracle Database Certified on OL6/RHEL6 (at last)…

I can hardly believe it. It’s finally happened!!!

Check out the story here.

The certification matrix on MOS is not updated yet, and those on the RHEL kernel will have to wait a few more days (90), but at last we have some firm commitment. 🙂

From now on, the Oracle Linux errata are available free from http://public-yum.oracle.com. In the past only the updates (5.6, 5.7 etc.) were available. This makes OL even more useful than before.

Thank you!

Cheers

Tim…

Update: Remember, if you apply the errata to OL6.2, you will have the same scsi_id issue I saw with 5.8.

Oracle Linux 5.8 and UDEV issues…

I just did an update from Oracle Linux 5.7 to 5.8 on one of my VirtualBox RAC installations and things are not looking too clever at the moment. After a reboot, the ASM instances, and therefore the database instances, wouldn’t restart. A quick look showed the ASM disks were not visible. On this installation I was using UDEV, rather than ASMLib. In checking the UDEV rules I noticed the scsi_id command on OL5.8 doesn’t report an ID for partitions on disks, only the disks themselves. For example, on OL5.7 I get this,

# /sbin/scsi_id -g -u -s /block/sdb/sdb1
SATA_VBOX_HARDDISK_VBd306dbe0-df3367e3_
#

On OL5.8 I get this,

# /sbin/scsi_id -g -u -s /block/sdb/sdb1
#

If I run it against the disk, rather than the partition it works fine.
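
A minimal sketch of querying the disk instead of the partition, assuming the `/block/<disk>/<partition>` path layout used in the commands above. The helper names `disk_sysfs_path` and `scsi_id_for_disk` are hypothetical, and the flags are just the ones already shown.

```shell
#!/bin/sh
# Hypothetical helper (not part of scsi_id): turn the sysfs-style partition
# path used above into the path of its parent disk.
#   /block/sdb/sdb1 -> /block/sdb
disk_sysfs_path() {
    dirname "$1"
}

# Hypothetical wrapper: query the disk rather than the partition, using the
# same OL5-style flags as the commands above.
scsi_id_for_disk() {
    /sbin/scsi_id -g -u -s "$(disk_sysfs_path "$1")"
}

# Example (run as root on an OL5 box where the disk exists):
#   scsi_id_for_disk /block/sdb/sdb1
```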

This has literally just happened, so I’ve done no further investigation, but I thought it was worth putting out there in case anyone was about to start an OS update on something they cared about. 🙂

At this point I’m not discounting that I’ve screwed up somewhere. My next plan is to install three clean VMs (OL 5.6, 5.7 and 5.8) and check the output of scsi_id on each of them. If that turns out OK, then I’ve screwed something else up and you can probably ignore this post. I might not get to try it out until tomorrow. Either way, I’ll update this post with the results of that test.

Cheers

Tim…

Update 1: It’s definitely changed. See the following.

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.6 (Tikanga)
# /sbin/scsi_id -g -u -s /block/sda/sda1
SATA_VBOX_HARDDISK_VB54dff07f-931ce4d7_
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
# /sbin/scsi_id -g -u -s /block/sda/sda1
SATA_VBOX_HARDDISK_VBx180d717-f896e661_
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
# /sbin/scsi_id -g -u -s /block/sda/sda1
#

Update 2: As John Sobecki correctly pointed out in the comments, the title of the post is misleading. UDEV is not at fault here. The problem is the “/sbin/scsi_id” command is behaving differently, which is making my rules useless. The UDEV issue is the symptom, not the cause. The post is clearly focusing on the scsi_id issue, but I’ve picked a pretty bad title to go with it. 🙂

Update 3: John Sobecki pointed me at “[block] fail SCSI passthrough ioctls on partition devices CVE-2011-4127”, a mainline kernel security fix that seems to be the cause of this. It affects all new kernels which include this change (RHEL5/6, UEK etc). Oracle are testing the impact of this. Initially ASMLib and OCFS seem unaffected.

Update 4: MOS Note 1438604.1 (currently in review) contains more information about this issue. ASMLib and OCFS are unaffected by CVE-2011-4127, so ASMLib should probably be used in preference to UDEV with newer kernels.

Update 5: I’ve altered all the articles on my site to reference the parent (disk) device, rather than the partition device, which makes the UDEV rules work fine again. Thanks to Bryan Wood and Joachim for their suggestions.
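
As a rough illustration of the "parent device" idea, the sketch below derives the disk node from a partition node and queries that instead. The helper names are hypothetical, it assumes simple sdX-style names, and it uses the OL6-style invocation (`-g -u` plus a device node). It is not the rule syntax from the updated articles.

```shell
#!/bin/sh
# Hypothetical helper: strip the trailing partition number from a device node.
#   /dev/sdb1 -> /dev/sdb
# Assumes sdX-style names. Names such as nvme0n1p1 would need other handling.
parent_disk() {
    printf '%s\n' "$1" | sed 's/[0-9][0-9]*$//'
}

# Hypothetical wrapper: ask scsi_id about the parent disk, so the result does
# not depend on whether the kernel answers for partition devices.
scsi_id_via_parent() {
    /sbin/scsi_id -g -u "$(parent_disk "$1")"
}

# Example (run as root):
#   scsi_id_via_parent /dev/sdb1
```

The point of the design is that a rule can still match the partition, while the identity check is made against the disk.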

What if Oracle 11gR2 never gets certified on RHEL6/OL6?

I’ve been involved in a number of exchanges over the last few months, in blog comments, by email and on Twitter, about the 11gR2 on RHEL6/OL6 certification issue.

The last time I blogged specifically about it was in October and it’s now over 6 months since Red Hat completed their part in the certification of 11gR2 on RHEL6, yet still no news.

In the course of these conversations I’ve come across a number of ridiculous conspiracy theories, as well as statements from people who know a hell of a lot more about Oracle platform certification than me. It’s worth saying at this point that none of the sources of these ideas are current Oracle employees, so they are not privy to “inside” information. Same goes for me. I’m just another person trying to figure out what is going on.

Here are some of the points from the last few months that stand out to me:

  • Oracle software working on a platform and certifying it on that platform are not the same thing.
  • Platform certification is a labor intensive operation, most of which is the responsibility of the platform vendor.
  • Even though RH have completed their part of the RHEL6 certification process, Oracle have not done the same for OL6. Oracle will *never* let RHEL6 be certified if OL6 is not.
  • Certification of Oracle on OL6 will have an impact on all Oracle appliances and engineered systems currently on sale. All of these systems currently use OL5.x. How long after certification will customers start demanding an OS upgrade?
  • Oracle have no pressing need to certify RHEL6/OL6, since all the performance improvements of the RHEL6 kernel are already in the OL5.x UEK. Oracle are a business, so why throw resources at certifying an “old” version of the database on a “new” platform when a new Oracle version is just around the corner?
  • Distro version is unimportant on an Oracle server. The kernel is the biggest factor. Most of the software in a Linux distro is useless guff as far as an Oracle installation is concerned. Do you really care what version of the browser or LibreOffice ships with your server OS?
  • Oracle 12c is currently in beta. The rumor is it will be announced/released at OOW12. Once it is released Oracle will have to go into overdrive to make sure it is certified on all the important platforms and presumably shipping on all their appliances and engineered systems. That is going to be a mammoth task. Do you really see them wasting time on 11gR2 at this point in the DB lifecycle?
  • The support cycle for RHEL and OL has increased to 10 years, so there is no pressing need to upgrade your OS from a support perspective.

Of course, nobody on the outside really knows what is going on and I imagine anyone on the inside would be looking for a new job if they let slip. From this point on I will follow the advice of people far more qualified than me and assume that “Oracle 11gR2 will never be certified on RHEL6/OL6”. If by some fluke it does happen, then it will be a happy surprise.

To end this depressing post on a lighter note, this is one of my recent tweets on the subject…

Cheers

Tim…

PS. I purposely didn’t attribute names to these points. Not everyone wants to be outed to the world, especially when their opinions were expressed via email.

Update: It’s finally certified. See here.

Oracle Database 11gR2 on OL6 / RHEL6: Certified or Not?

There seems to be a little confusion out there about the certification status of Oracle Database 11gR2, especially with the release of the 11.2.0.3 patchset which fixes all the issues associated with RAC installs on OL/RHEL 6.1.

Currently, 11gR2 is *NOT* certified on OL6 or RHEL6. How do I know? My Oracle Support says so! Check for yourself like this:

  • Log on to My Oracle Support (support.oracle.com).
  • Click the “Certifications” link.
  • Type in the product name, like “Oracle Database”
  • Select the product version number, like “11.2.0.3.0”.
  • Select the platform, like “Linux x86_64” or a specific distro beneath this.
  • Click the “Search” button.

From the results you will see that Oracle Database 11.2.0.3 is certified on OL and RHEL 5.x. Oracle do not differentiate between different respins of the major version. You will also notice that it is not currently supported on OL6 or RHEL6.

Having said that, we can expect this certification really soon. Why? Because Red Hat has submitted all the certification information to Oracle and (based on previous certifications) expects it to happen some time in Q4 this year, which is any time between now and the end of the year.

With a bit of luck, by the time I submit this post MOS certification will get updated and I will happily be out of date… 🙂

Cheers

Tim…

Update: It’s finally certified. See here.