Decoding device names from vmunix errors in syslog

Your HP-UX server has a failed local drive.  Luckily you are using MirrorDisk/UX, so the problem is more annoying than anything.  You find the error because of entries like the following in syslog.log:

Jan 13 02:50:22 hostname vmunix: SCSI: Async write error -- dev: b 31 0x022000, errno: 126, resid: 8192,
Jan 13 02:50:22 hostname vmunix:   blkno: 45699672, sectno: 91399344, offset: 3846791168, bcount: 8192.
Jan 13 02:50:22 hostname vmunix:   blkno: 45699128, sectno: 91398256, offset: 3846234112, bcount: 8192.
Jan 13 02:50:22 hostname vmunix: SCSI: Read error -- dev: b 31 0x022000, errno: 126, resid: 1024,
Jan 13 02:50:22 hostname vmunix: SCSI: Async write error -- dev: b 31 0x022000, errno: 126, resid: 8192,
Jan 13 02:50:22 hostname vmunix:   blkno: 8, sectno: 16, offset: 8192, bcount: 1024.
Jan 13 02:50:22 hostname vmunix: LVM: VG 64 0x000000: PVLink 31 0x022000 Failed! The PV is not accessible.
Jan 13 02:50:22 hostname vmunix:
Jan 13 02:50:22 hostname vmunix: LVM: VG 64 0x000000: PVLink 31 0x022000 Recovered.
Jan 13 03:06:32 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5
Jan 13 03:14:07 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5
Jan 13 03:26:57 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5
Jan 13 03:32:45 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5
Jan 13 03:50:19 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5
Jan 13 03:55:19 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5
Jan 13 03:59:27 hostname vmunix: LVM: Failed to automatically resync PV 1f022000  error: 5

ioscan shows all hard drives as claimed and vgdisplay -v shows all PVs as available. How can you tell which of your drives is dying?

The syslog errors above identify the failing device by its major and minor numbers. First let’s look at the SCSI errors:

SCSI: Async write error -- dev: b 31 0x022000

The three fields after "dev:" describe the device: "b" means it is a block device, 31 is the major number in decimal, and 0x022000 is the minor number in hexadecimal. Grepping a long directory listing of /dev/dsk for that minor number finds the matching device file:

$ ll /dev/dsk | grep 022000
brw-r-----   1 bin        sys         31 0x022000 Feb  9  2003 c2t2d0

The failing disk is c2t2d0.
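If you have a lot of these lines to go through, the lookup can be scripted. The following is only a rough sketch: extract_dev and match_device are hypothetical helper names, not HP-UX commands, and match_device assumes the listing has the same column layout as the ll output shown above (major number in field 5, minor number in field 6, device name last). It matches on the major number as well as the minor number, which is a little stricter than a plain grep.

```shell
# Pull "type major minor" out of a vmunix SCSI error line read from stdin.
extract_dev() {
    sed -n 's/.*dev: \([bc]\) \([0-9][0-9]*\) \(0x[0-9a-fA-F]*\).*/\1 \2 \3/p'
}

# Print device file names whose major and minor numbers both match.
# Reads a long listing on stdin. Assumed layout (taken from the ll output
# above): field 5 = major, field 6 = minor, last field = device name.
match_device() {
    awk -v major="$1" -v minor="$2" \
        '($5 + 0) == (major + 0) && ($6 "") == (minor "") { print $NF }'
}

# Sample data copied from the log and the listing above.
line='Jan 13 02:50:22 hostname vmunix: SCSI: Async write error -- dev: b 31 0x022000, errno: 126, resid: 8192,'
listing='brw-r-----   1 bin        sys         31 0x022000 Feb  9  2003 c2t2d0'

read -r dev_type major minor <<EOF
$(printf '%s\n' "$line" | extract_dev)
EOF

printf '%s\n' "$listing" | match_device "$major" "$minor"   # prints: c2t2d0
```

On a live system you would feed match_device the real listing instead of the sample, for example by piping ll /dev/dsk into it.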

The LVM errors are very similar. For example:

LVM: Failed to automatically resync PV 1f022000

The main difference here is that the major and minor numbers are run together into a single hexadecimal value: 1f is the major number (1F in hex is 31 in decimal) and 022000 is the minor number.
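Splitting the combined value by hand is easy enough, but here is a minimal sketch of doing it in the shell. decode_pv is a hypothetical helper name, and it assumes the layout seen in the log above: the last six hex digits are the minor number and whatever comes before them is the major number.

```shell
# Split an LVM-style device number (e.g. 1f022000) into a decimal major
# number and a hexadecimal minor number.
# Assumption: the last six hex digits are the minor number.
decode_pv() {
    pv=$1
    major_hex=${pv%??????}          # drop the last six characters -> "1f"
    minor_hex=${pv#"$major_hex"}    # what is left -> "022000"
    printf '%d 0x%s\n' "0x$major_hex" "$minor_hex"
}

decode_pv 1f022000   # prints: 31 0x022000
```

The output is in the same form as the SCSI errors, so the same /dev/dsk lookup applies.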

 

Update – March 2, 2011 at 09:31:

The LVM errors also identify the affected volume group via major and minor number. For example:

LVM: VG 64 0x000000: PVLink 31 0x022000 Failed! The PV is not accessible.

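In this error, the two fields after "VG" identify the volume group: 64 is the major number for volume groups and 0x000000 is the minor number for vg00. The two fields after "PVLink" are the major and minor numbers of the disk, just as in the SCSI errors.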
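As a rough sketch, the four numbers can be pulled out of a PVLink line with sed. parse_pvlink is a hypothetical helper name, and the pattern only covers the message format shown in the log above.

```shell
# Print "vg_major vg_minor pv_major pv_minor" for an LVM PVLink line
# read from stdin. Lines in any other format produce no output.
parse_pvlink() {
    sed -n 's/.*LVM: VG \([0-9][0-9]*\) \(0x[0-9a-fA-F]*\): PVLink \([0-9][0-9]*\) \(0x[0-9a-fA-F]*\).*/\1 \2 \3 \4/p'
}

# Sample line copied from the log above.
line='Jan 13 02:50:22 hostname vmunix: LVM: VG 64 0x000000: PVLink 31 0x022000 Failed! The PV is not accessible.'

printf '%s\n' "$line" | parse_pvlink   # prints: 64 0x000000 31 0x022000

# Assumption: volume group device files follow the usual /dev/<vgname>/group
# layout, so on a live system something like
#   ll /dev/*/group | grep 0x000000
# should show which volume group owns that minor number.
```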

One Response to Decoding device names from vmunix errors in syslog

  1. Rex Kirkland says:

    Thank you. The info is very helpful. We are running into exactly the same scenario at a remote site.
