scsi_id and UDEV issues (update)…

Last month I wrote about a problem I saw with scsi_id and UDEV in  OL5.8. As it screwed up all my UDEV rules is was a pretty important issue for me. It turned out this was due to a mainline security fix (CVE-2011-4127) affecting the latest kernels of both RHEL/OL5 and RHEL/OL6. The comments on the previous post show a couple of workarounds.

Over the weekend I started to update a couple of articles that mentioned UDEV rules (here and here) and noticed the problem had dissapeared. I updated two VMs (OL5.8 and OL6.2) with the latest changes, including the UEK updates and ran the tests again and here’s what I got.

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
# uname -r
2.6.39-100.6.1.el5uek
# scsi_id -g -u -s /block/sda/sda1
SATA_VBOX_HARDDISK_VB535d493d-7a44eb0f_
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
# uname -r
2.6.39-100.6.1.el6uek.x86_64
# /sbin/scsi_id -g -u /dev/sda1
1ATA_VBOX_HARDDISK_VB2b5dc561-4ae6e154
#

So it looked like normal service had been resumed. :) Unfortunately, the MOS Note 1438604.1 associated with this issue is still not public, so I couldn’t tell if this was a unilateral change in UEK, or part of a mainline fix for the previous change.

To check I fired up a CentOS 6.2 VM with the latest kernel updates and switched an Oracle Linux VM to the latest RHEL compatible kernel and did the test on both. As you can see, they both still don’t report the scsi_id for partitions.

# cat /etc/redhat-release
CentOS release 6.2 (Final)
# uname -r
2.6.32-220.13.1.el6.x86_64
# /sbin/scsi_id -g -u /dev/sda1
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
# uname -r
2.6.32-220.13.1.el6.x86_64
# /sbin/scsi_id -g -u /dev/sda1
#

It could be the associated fix has not worked through the mainline to RHEL and CentOS yet. I’ll do a bit of digging around to see what is going on here.

Cheers

Tim…

Update: It appears the reversion of this functionality may not be permanent, so I’ve updated my articles to use a “safer” method of referencing the parent (disk) device, rather than the partition device.

Oracle Linux 5.8 and UDEV issues…

I just did an update from Oracle Linux 5.7 to 5.8 on one of my VirtualBox RAC installations and things are not looking to clever at the moment. After a reboot, the ASM instances and therefore the database instances wouldn’t restart. A quick look showed the ASM disks were not visible. On this installation I was using UDEV, rather than ASMLib. In checking the UDEV rules I noticed the scsi_id command on OL5.8 doesn’t report an ID for partitions on disks, only the disks themselves. For example, on OL5.7 I get this,

# /sbin/scsi_id -g -u -s /block/sdb/sdb1
SATA_VBOX_HARDDISK_VBd306dbe0-df3367e3_
#

On OL5.8 I get this,

# /sbin/scsi_id -g -u -s /block/sdb/sdb1
#

If I run it against the disk, rather than the partition it works fine.

This has literally just happened, so I’ve done no further investigation, but I thought it was worth putting out there in case anyone was about to start an OS update on something they cared about. :)

At this point I’m not discounting that I’ve screwed up somewhere. My next plan is to install three clean VMs (OL 5.6, 5.7 and 5.8) and check the output of scsi_id on each of them. If that turns out OK, then I’ve screwed something else and you can probably ignore this post. I might not get to try it out until tomorrow. Either way, I’ll update this post with the results of that test.

Cheers

Tim…

Update 1: It’s definitely changed. See the following.

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.6 (Tikanga)
# /sbin/scsi_id -g -u -s /block/sda/sda1
SATA_VBOX_HARDDISK_VB54dff07f-931ce4d7_
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
# /sbin/scsi_id -g -u -s /block/sda/sda1
SATA_VBOX_HARDDISK_VBx180d717-f896e661_
#

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
# /sbin/scsi_id -g -u -s /block/sda/sda1
#

Update 2: As John Sobecki correctly pointed out in the comments, the title of the post is misleading. UDEV is not at fault here. The problem is the “/sbin/scsi_id” command is behaving differently, which is making my rules useless. The UDEV issue is the symptom, not the cause. The post is clearly focusing on the scsi_id issue, but I’ve picked a pretty bad title to go with it. :)

Update 3: John Sobecki pointed me at “[block] fail SCSI passthrough ioctls on partition devices CVE-2011-4127″, a mainline kernel security fix that seems to be the cause of this. It affects all new kernels which include this change (RHEL5/6, UEK etc). Oracle are testing the impact of this. Initially ASMLib and OCFS seem unaffected.

Update 4: MOS Note 1438604.1 (currently in review) contains more information about this issue. ASMLib and OCFS are unaffected by CVE-2011-4127, so ASMLib should probably be used in preference to UDEV with newer kernels.

Update 5: I’ve altered all the articles on my site to reference the parent (disk) device, rather than the partition device, which makes the UDEV rules work fine again. Thanks to Bryan Wood and Joachim for their suggestions.