Friday, May 16, 2008

Symantec NetBackup on Solaris 10

My current backup server is an E450 (I know...control your covetousness) running Solaris 10 with Symantec NetBackup Enterprise Server on it, attached to a robot. When I started all of this I was running NetBackup 6.0. While figuring out what was going on with the server I upgraded to MP6. I was doing a lot of things at the same time, which complicated matters later (yes...I know...shut up), but this is what happened and how it was resolved.

I used updatemanager to patch the system. One of the patches was 120011-14, a kernel patch that had to be installed manually. I also got out a standalone tape drive to add to my robot so I could start doing offsite backups in a more reliable and organized way.

After all the patches were done, I rebooted the server and when it came up it no longer recognized the robot. After a ton of looking around we determined that it wasn't generating the SG drivers because NetBackup wasn't recognizing that the robot was there. I could probe it from the boot prompt, I could write to it and read from it (well, the tape drive in it) from the command line, but the NetBackup software just would not recognize it.

After doing all of the troubleshooting I knew how to do, and all I had time to do with a consultant I sometimes work with, I ended up spending a couple of hours a day, several days in a row, on progressively higher levels of tech support with Symantec. Here are the things we did to try to regenerate the SG drivers. I must have gone through various versions of this 20 times or so, all with no success.

(DO NOT RUN THESE COMMANDS WITHOUT KNOWING WHAT YOU ARE DOING!)
cp /kernel/drv/st.conf /kernel/drv/st.conf.`date +%m%d%y_%H%M%S`
mv /kernel/drv/sg.conf /kernel/drv/sg.conf.`date +%m%d%y_%H%M%S`
cp /etc/devlink.tab /etc/devlink.tab.`date +%m%d%y_%H%M%S`
cd /kernel/drv/
vi st.conf
cd /etc
vi devlink.tab
cd /usr/openv/volmgr/bin/driver
../sg.build all -mt 15 -ml 2
cat st.conf >> /kernel/drv/st.conf
rem_drv sg
./sg.install
sgscan
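
The backup copies at the top of that list can be wrapped in a small helper so every file gets the same style of timestamp suffix. This is only a sketch: the function name backup_conf is made up for illustration (it is not part of NetBackup or Solaris), and it assumes a Bourne-compatible shell.

```shell
#!/bin/sh
# backup_conf is a hypothetical helper, not part of NetBackup or Solaris.
# It copies a file to <file>.<MMDDYY_HHMMSS>, the same suffix used above,
# and prints the name of the copy it made.
backup_conf() {
    src=$1
    stamp=`date +%m%d%y_%H%M%S`
    if [ ! -f "$src" ]; then
        echo "skipping $src (not found)" >&2
        return 1
    fi
    cp -p "$src" "$src.$stamp" && echo "$src.$stamp"
}

# Example calls (commented out so nothing is touched by accident):
# backup_conf /kernel/drv/st.conf
# backup_conf /kernel/drv/sg.conf
# backup_conf /etc/devlink.tab
```

Because the helper prints the backup's name, it is easy to note down which copy to restore if the edited file turns out to be wrong.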


At this point I was waiting for third- or fourth-level tech support to call back, and it kept bugging me that I'd done a kernel update right before this happened. I searched through the kernel patch and couldn't find anything that seemed related to the sg drivers, but it still nagged at me. So while I was waiting I uninstalled the patch and rebooted. Just as the server was coming up, and before I could retry the commands to regenerate the sg drivers, Symantec called back. We went through a slightly modified version of the commands above, and this time the sg drivers were there. After a bit of reconfiguring, everything worked just fine.


The difference in the last set of commands is that the last time all we ran was:
sg.build all
mv /kernel/drv/sg.conf /kernel/drv/sg.conf20Mar08
./sg.install
sgscan

And there they were. It seems unlikely that running plain "sg.build all", rather than the way I ran it before, is what made the difference, but I can't rule that out. I'd suggest trying that first, before removing the patch, but expect to have to remove the patch to get it to work.
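
Before going as far as removing a patch, it helps to confirm that it is actually installed. Here is a minimal sketch, assuming the usual `showrev -p` output in which each line starts with "Patch: <id>" followed by a space. The helper name has_patch is made up here. `patchrm` is the standard Solaris tool for backing out a patch and appears only in a comment.

```shell
#!/bin/sh
# has_patch is a hypothetical helper: it reads `showrev -p` style output on
# stdin and succeeds if the given patch ID is listed as an installed patch.
has_patch() {
    grep "^Patch: $1 " >/dev/null
}

# Typical use on the Solaris box (not run here):
# if showrev -p | has_patch 120011-14; then
#     echo "120011-14 is installed"
#     # patchrm 120011-14    # back the patch out, then reboot
# fi
```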

Some commands I learned/relearned/whatever during this process:
devfsadm
Maintains /dev

tpconfig -d
Shows NetBackup tape configuration

/usr/openv/volmgr/bin/scan
Shows detailed tape drive information

cfgadm -al > cfgadmout.txt
Shows dynamically reconfigurable resources listing with attachment points

iostat -En
Shows extended error statistics (soft, hard and transport errors) for each device, using descriptive device names.
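
When working with support over several days, it can be handy to save the output of these commands before and after each change so the two can be compared. A rough sketch of that idea follows. The helper name save_output and the /var/tmp/diag directory are my own assumptions, not anything NetBackup provides.

```shell
#!/bin/sh
# save_output is a hypothetical helper: run a command and keep its output
# (stdout and stderr) in <dir>/<name>.txt for later comparison.
save_output() {
    dir=$1; name=$2; shift 2
    mkdir -p "$dir" || return 1
    "$@" > "$dir/$name.txt" 2>&1
}

# Example calls on the backup server (commented out; /var/tmp/diag is an
# arbitrary directory chosen for illustration):
# save_output /var/tmp/diag cfgadm   cfgadm -al
# save_output /var/tmp/diag iostat   iostat -En
# save_output /var/tmp/diag tpconfig tpconfig -d
# save_output /var/tmp/diag scan     /usr/openv/volmgr/bin/scan
```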

1 comment:

Gam0ra said...

http://sunsolve.sun.com/search/document.do?assetkey=1-66-231243-1

That appears to be the same problem I had, but with a different set of patches. Most likely the patch I applied included the same patches this page lists.