Solaris Volume Manager has also been variously known as DiskSuite or ODS (not to be confused with the rebadged, bundled Veritas Volume Manager that Sun also shipped under a confusingly similar name!) and comes with lots of neat features. One of the best is a simple method for expanding metadevices and filesystems – on a live system!
For my example today, I’m going to use a Sun T2000 which has three zones, all of which are running from a metadevice composed of several 10GB LUNs from a SAN. The object of this exercise is to add another 10GB to the zone filesystem, so the developers can fire up another zone.
The fact that we’re using a metaset built on SAN LUNs doesn’t matter – the key thing here is that, by encapsulating a disk (or LUN, or partition) in a metadevice, we can quickly and easily expand live filesystems.
First of all, let’s have a look at the metaset on the host, which has been imaginatively called ‘zones’:
bash-3.00# metaset -s zones
Set name = zones, Set number = 2
Host Owner
bumpkin Yes (auto)
Drive Dbase
/dev/dsk/c6t60060E80141189000001118900001918d0 Yes
/dev/dsk/c6t60060E80141189000001118900001919d0 Yes
/dev/dsk/c6t60060E80141189000001118900002133d0 Yes
We can use the metastat command to print a summary of the metadevices within the metaset:
bash-3.00# metastat -s zones -c
zones/d100 s 29GB /dev/dsk/c6t60060E80141189000001118900001918d0s0 /dev/dsk/c6t60060E80141189000001118900001919d0s0 /dev/dsk/c6t60060E80141189000001118900002133d0s0
In this case, we’ve got just one metadevice, named d100, which is composed of three 10GB LUNs from the SAN.
So, our first task is to add the newly available LUN to the metaset:
bash-3.00# metaset -s zones -a c6t60060E80141189000001118900001731d0
We can check that it’s really been added with the metaset command again:
bash-3.00# metaset -s zones
Set name = zones, Set number = 2
Host Owner
bumpkin Yes (auto)
Drive Dbase
/dev/dsk/c6t60060E80141189000001118900001918d0 Yes
/dev/dsk/c6t60060E80141189000001118900001919d0 Yes
/dev/dsk/c6t60060E80141189000001118900002133d0 Yes
/dev/dsk/c6t60060E80141189000001118900001731d0 Yes
Rock on! We’ve now got our four 10GB LUNs in the metaset. Now we need to attach the new LUN to our existing 30GB metadevice, d100:
bash-3.00# metattach zones/d100 /dev/dsk/c6t60060E80141189000001118900001731d0s0
zones/d100: component is attached
Note that we don’t need to bring down the three running zones – we can do all of this live, with the system at the multi-user-server milestone.
If we query the metadevice now we can see that it’s grown, from a stated 29GB to 39GB, and that our new LUN is part of the metadevice:
bash-3.00# metastat -s zones -c
zones/d100 s 39GB /dev/dsk/c6t60060E80141189000001118900001918d0s0 /dev/dsk/c6t60060E80141189000001118900001919d0s0 /dev/dsk/c6t60060E80141189000001118900002133d0s0 /dev/dsk/c6t60060E80141189000001118900001731d0s0
Now all we need to do is grow the UFS filesystem into that extra 10GB – growfs with the -M flag lets us expand it while it stays mounted:
bash-3.00# growfs -M /export/zones /dev/md/zones/rdsk/d100
/dev/md/zones/rdsk/d100: 83742720 sectors in 13630 cylinders of 48 tracks, 128 sectors
40890.0MB in 852 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
................
super-block backups for last 10 cylinder groups at:
82773280, 82871712, 82970144, 83068576, 83167008, 83265440, 83363872,
83462304, 83560736, 83659168
Here’s the output of df before we hacked about:
bash-3.00# df -k /export/zones
Filesystem kbytes used avail capacity Mounted on
/dev/md/zones/dsk/d100
30928078 14451085 16270806 48% /export/zones
And here’s the output after we’ve expanded the filesystem:
bash-3.00# df -k /export/zones
Filesystem kbytes used avail capacity Mounted on
/dev/md/zones/dsk/d100
41237442 14462125 26569130 36% /export/zones
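As a sanity check, converting the kbytes figures from those two df runs into GB confirms we gained roughly the 10GB we attached:

```shell
# Convert the df -k kbytes totals (before and after, from the output
# above) into GB and show the growth.
echo "30928078 41237442" | awk '{
    printf "before: %.1fGB  after: %.1fGB  growth: %.1fGB\n",
        $1 / 1048576, $2 / 1048576, ($2 - $1) / 1048576
}'
# → before: 29.5GB  after: 39.3GB  growth: 9.8GB
```

The shortfall from a nominal 10GB is just the usual difference between the LUN’s raw size and what ends up usable.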
So, a quick and simple way to grow a filesystem under Solaris, using metadevices and with no downtime.
Also, a brief note to Sun product managers: choose a name for your products, and stick with that name for more than a year. Thanks!
So, Solaris comes with lots of nice tools for querying our SAN HBAs, but the ones we’ve looked at so far are only of any real use when the HBA has a live connection to it.
What about when we want to find the WWN to setup our SAN, before we’ve plugged any fibre in?
picl is a hardware monitoring daemon in Solaris. I first started playing with it on mid-frame and high-end machines (SF6500s and F15ks) where the system controller (SC) talked to picl to do hardware monitoring of a Solaris domain.
We can talk to picl ourselves with prtpicl. We need the verbose option to get something useful, but be warned – this will dump out pages and pages of stuff – so we need to filter it a bit with grep.
root@avalon>prtpicl -v | grep wwn
:node-wwn 20 00 00 e0 8b 1e a9 ef
:port-wwn 21 00 00 e0 8b 1e a9 ef
:node-wwn 20 00 00 e0 8b 3e a9 ef
:port-wwn 21 01 00 e0 8b 3e a9 ef
:node-wwn 20 00 00 e0 8b 80 9c a8
:port-wwn 21 00 00 e0 8b 80 9c a8
:node-wwn 20 00 00 e0 8b a0 9c a8
:port-wwn 21 01 00 e0 8b a0 9c a8
:node-wwn 20 00 00 03 ba db e9 89
:port-wwn 21 00 00 03 ba db e9 89
:node-wwn 20 00 00 00 87 83 fd 1c
:port-wwn 21 00 00 00 87 83 fd 1c
:node-wwn 20 00 00 00 87 84 4a d8
:port-wwn 21 00 00 00 87 84 4a d8
These are the WWNs we’re after – each node/port pair belongs to one controller, with the first pair being c2, the second c3, and so on. The internal controller comes last, followed by the WWNs of the two FC disks that are hanging off it. (Remember, on a V490 we have internal FC-AL disks, not SCSI.)
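If you need those WWNs in the compact 16-digit form that switch zoning tools expect, the spaced hex is easy to squash. A quick sketch, fed with sample lines captured from the prtpicl output above – in practice you’d pipe `prtpicl -v | grep wwn` straight into the awk stage:

```shell
# Drop the :node-wwn/:port-wwn label, then squeeze out the spaces
# between the hex bytes to get compact WWNs.
printf '%s\n' ':node-wwn 20 00 00 e0 8b 1e a9 ef' \
              ':port-wwn 21 00 00 e0 8b 1e a9 ef' |
awk '{ $1 = ""; gsub(/ /, ""); print }'
# → 200000e08b1ea9ef
# → 210000e08b1ea9ef
```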
Finally, for our last trick, if we have Solaris 10 01/06 or later, we can use the awesome fcinfo command, which makes all of this very, very easy indeed.
root@avalon # fcinfo hba-port
HBA Port WWN: 210000e08b1ea9ef
OS Device Name: /dev/cfg/c2
Manufacturer: QLogic Corp.
Model: QLE2460
Type: unknown
State: offline
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: not established
Node WWN: 200000e08b1ea9ef
Easy! Another good reason for upgrading to Solaris 10 – lots of nice new tools and features like this make day-to-day administration much easier.
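fcinfo’s labelled output scripts nicely, too. A sketch that pulls out just the port WWNs, using a sample line from the output above – in practice, pipe `fcinfo hba-port` into the awk stage:

```shell
# Print just the HBA port WWNs, one per line.
printf 'HBA Port WWN: 210000e08b1ea9ef\n' |
awk '/^HBA Port WWN:/ { print $4 }'
# → 210000e08b1ea9ef
```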
When connecting a Solaris machine to a SAN, you’ll usually need to know the WWN of the host bus adapter (HBA). WWNs are a bit like MAC addresses for ethernet cards – they are unique, and they’re used to manage who is connected to what, and what they can see.
The quickest and easiest way to check the WWN is when we have an active HBA. We can use the cfgadm command under Solaris to check our adapter states:
root@avalon>cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 fc-private connected configured unknown
c1::210000008783fd1c disk connected configured unknown
c1::2100000087844ad8 disk connected configured unknown
c2 fc-private connected configured unknown
c2::50060e8014118920 disk connected configured unknown
c3 fc connected unconfigured unknown
c4 fc-private connected configured unknown
c4::50060e8014118930 disk connected configured unknown
c5 fc connected unconfigured unknown
usb0/1 unknown empty unconfigured ok
usb0/2 unknown empty unconfigured ok
usb0/3 unknown empty unconfigured ok
usb0/4 unknown empty unconfigured ok
So both our controllers, c2 and c4, have active loops. Now we can use luxadm to query the driver and print out the device paths for each port on each HBA:
root@avalon>luxadm qlgc
Found Path to 5 FC100/P, ISP2200, ISP23xx Devices
Opening Device: /devices/pci@8,700000/SUNW,qlc@2/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter Driver: 1.14.09 03/08/04
Opening Device: /devices/pci@8,700000/SUNW,qlc@2,1/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter Driver: 1.14.09 03/08/04
Opening Device: /devices/pci@8,700000/SUNW,qlc@3/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter Driver: 1.14.09 03/08/04
Opening Device: /devices/pci@8,700000/SUNW,qlc@3,1/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter Driver: 1.14.09 03/08/04
Opening Device: /devices/pci@9,600000/SUNW,qlc@2/fp@0,0:devctl
Detected FCode Version: ISP2200 FC-AL Host Adapter Driver: 1.15 04/03/22
Complete
This particular machine I’m playing on is a Sun V490, which uses internal FC-AL disks – so the fifth controller we can see (the ISP2200) is the internal controller for the internal root disks. Why the fifth? Due to the way the V490 initialises itself, the internal controller is tested and configured after all the PCI slots.
Also, if you look at the device path, you can see it’s coming from a different PCI bus – pci@9 as opposed to pci@8.
Finally, the FCode and driver version are different, which shows us it’s a slightly different chipset from the other HBAs.
REMEMBER: numbering starts from the top (the first device) down. So:
/devices/pci@8,700000/SUNW,qlc@2/fp@0,0:devctl is c2
/devices/pci@8,700000/SUNW,qlc@2,1/fp@0,0:devctl is c3
/devices/pci@8,700000/SUNW,qlc@3/fp@0,0:devctl is c4
/devices/pci@8,700000/SUNW,qlc@3,1/fp@0,0:devctl is c5
/devices/pci@9,600000/SUNW,qlc@2/fp@0,0:devctl is c1, our internal HBA
We can now use luxadm’s dump_map option to print out the device map, as seen from each port.
For c2, for example, we would do:
root@avalon>luxadm -e dump_map /devices/pci@8,700000/SUNW,qlc@2/fp@0,0:devctl
Pos AL_PA ID Hard_Addr Port WWN Node WWN Type
0 1 7d 0 210000e08b1ea9ef 200000e08b1ea9ef 0x1f (Unknown Type,Host Bus Adapter)
1 b1 21 b1 50060e8014118920 50060e8014118920 0x0 (Disk device)
And there is our listing of WWNs. The 50060e8014118920 WWN belongs to our SAN device at the other end (note the type of ‘0x0 Disk device’), and the first WWN of 210000e08b1ea9ef is for our HBA.
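When it’s just the array’s WWN you want for zoning, dump_map’s columns script easily. A sketch using the disk line from the output above – in practice, pipe the luxadm command into the awk stage:

```shell
# Print the Port WWN (column 5) of every disk device in a dump_map
# listing, skipping the HBA's own entry.
printf '1 b1 21 b1 50060e8014118920 50060e8014118920 0x0 (Disk device)\n' |
awk '/Disk device/ { print $5 }'
# → 50060e8014118920
```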
Note that this only works for cards which have an active connection to a SAN fabric. If we haven’t plugged them in yet, we need to use some lower-level Solaris tools, which I’ll be covering in another post.
As previously posted, the Solaris install is a bit of a slug due to the way the package manager processes the huge number of files contained in a package.
Flash Archives (flar files) are one solution to this problem. Let’s say you’ve built a model install – everything is in place, all your custom packages, banners, OE tweaks – the works. Wouldn’t it be nice if you could take an image of that Solaris install, to slap down somewhere else?
Or maybe take point in time image snapshots of an environment, for use by developers to let them quickly roll back to a known previous version of their software?
Flash archives let you do all of this. At their most basic, they’re a way of taking an image of a Solaris install. You end up with a single (large) file, which can then be archived/transferred/restored/whatever.
flars are easy to use – the flarcreate command is all you need:
bash-3.00# flarcreate -n "T2k Sol10 8/07" -x /var/flars -R / /var/flars/sun4v_sol_10_Generic_120011-14.flar
Full Flash
Checking integrity...
Integrity OK.
Running precreation scripts...
Precreation scripts done.
Creating the archive...
2019951681 blocks
Archive creation complete.
Running postcreation scripts...
Postcreation scripts done.
Running pre-exit scripts...
Pre-exit scripts done.
bash-3.00#
The syntax is pretty straightforward:
- -n specifies the name you’re giving to your flar
- -x says what files and directories to exclude – in this case, we don’t want to include the directory where we’re creating and storing our flars
- -R says where the root of the flar will be – in this case we’re imaging a full Solaris install, so we want to start from /
- and the final part is the full path and filename of the flar we are creating
One important thing to remember is that flarcreate will follow symlinks and mount points. If you have a 100GB /export/home, it will try to add that to your flar. This may not be what you want – especially if you’re creating a Solaris image for Jumpstart – so flars are best created from single-user mode, when the system is idle and nothing unnecessary is mounted or running.
Another important point is that flars are hardware-class dependent. If I create an image of a sun4v Solaris install (in this case, a trusty Sun T2000) then the flar will only contain the kernel and driver files for sun4v. If you try to boot a sun4u box (like a Sun V440) from it, things will go horribly wrong.
If you want to use flars to Jumpstart machines, you’ll need a master flar image for each machine hardware class in your estate – you can find a machine’s hardware class with the uname command:
bash-3.00# uname -m
sun4u
We can use flar info to query a flar file to see what it contains:
bash-3.00# flar info sun4v_sol_10_Generic_120011-14.flar
archive_id=afe30bf4ebb65085a54c5179a6f62a1c
files_archived_method=cpio
creation_date=20081118211329
creation_master=sunt2k-001
content_name=solaris_10_4v_Generic_120011-14
creation_node=sunt2k-001
creation_hardware_class=sun4v
creation_platform=SUNW,Sun-Fire-T200
creation_processor=sparc
creation_release=5.10
creation_os_name=SunOS
creation_os_version=Generic_120011-14
files_compressed_method=none
files_archived_size=4039903362
files_unarchived_size=4039903362
content_architectures=sun4v
type=FULL
And that’s it, basically. There’s not much to it, and it’s pretty simple to use. flars function in pretty much the same way as a normal archive file – we can use the flar command to list files inside the archive and find out more about the archive itself.
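The key=value format of flar info also makes the hardware-class check from earlier easy to script. A sketch using a sample line from the output above – in practice, pipe `flar info <archive>` into the awk stage and compare the result against `uname -m` on the target:

```shell
# Extract the architectures a flar supports, for comparison with the
# target machine's hardware class before you try to boot from it.
printf 'content_architectures=sun4v\n' |
awk -F= '$1 == "content_architectures" { print $2 }'
# → sun4v
```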
It’s a common complaint, and it’s a very valid one. No one could ever accuse the Solaris installation process of being speedy. Even with all the CD swapping, the IRIX install is faster – and that’s saying something.
Each time the Solaris package installer, pkgadd, adds a new package, it rewrites the entire contents of /var/sadm/install/contents – in order. This is how the package manager keeps track of the package manifest, and this is a design flaw going way back to the days when AT&T and Sun came up with the idea of SVR4 UNIX. They just didn’t plan for over 10,000 files spread across over 1,000 packages, which is what a normal install of Solaris slaps down.
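A back-of-the-envelope sketch of why that rewrite hurts: if the contents file ends up with around 10,000 entries after 1,000 packages (the figures above), then each pkgadd rewrites, on average, half of the final file, so a full install pushes millions of lines through /var/sadm/install/contents:

```shell
# Rough cost: P packages each rewrite an average of N/2 lines of the
# contents file, so total line writes ~= N/2 * P.
echo "10000 1000" | awk '{
    printf "~%d million line writes for a full install\n", $1 / 2 * $2 / 1e6
}'
# → ~5 million line writes for a full install
```

That quadratic growth is why the slowdown gets so much worse as the package count climbs.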
A potential solution that was floated for Solaris 10 was to use SQLite to handle the package database, but that uncovered another problem – performance tanked even further on small memory systems.
The real solution? Stop using flat files – but that’s an architecture decision that has lots of consequences for backwards compatibility. OpenSolaris is addressing this with an entirely new package management system. So far it’s looking pretty slick.
In the meantime – what to do? Install from flash archives – flars – and preferably via Wanboot. I’ll be blogging more about those two technologies shortly, but they can have a hugely beneficial impact on your Solaris estate.