May 23, 2008
 

This is a study a former colleague and I did last year of a log analysis tool called Splunk.
What follows in this article are excerpts from the results of that study.

Some of the questions we asked are as follows —

  • Why use a log analysis tool?
  • What do most shops use?
  • What does a tool such as Splunk buy us (as an IT shop)?
  • What are its benefits and pitfalls?
  • What is the cost of ownership?

Why use a log-analysis tool?

The biggest reason to use such a tool is to move from a reactive to a proactive systems-management paradigm.

With the number of systems involved (900+ *nix servers in that shop) and the criticality of their availability (downtime on many of them costs millions of dollars), it is imperative to find a tool that can quickly and effortlessly analyze valuable log information.

If such a tool can look at every layer of a “delivered stack” (hardware, OS, application, network, SAN, etc.), it would be a gold mine: it can link the stack end-to-end and speed up the analysis process.

What do most shops use?

Most shops I’ve been in do log analysis like this —

a) Don’t do any log analysis unless absolutely required. When it is required, admins log into the individual servers and page through the logs using vi (or a combination of grep/awk/sed if they are script-savvy)

b) Have a centralized ssh (or, god forbid, rsh) trusted admin host from which they launch a log-parser script that filters on specific keywords; the output gets emailed to a shared mailbox or to the individual admins’ inboxes

c) Have a centralized log host where they run a script akin to the one mentioned above
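The keyword-filter scripts in (b) and (c) above tend to look roughly like this. This is a minimal hypothetical sketch: the keyword list, log path, and mail recipient are illustrative, not taken from any shop described here.

```shell
#!/bin/sh
# Hypothetical sketch of the keyword-filter approach described above.
# The keyword list is illustrative; real shops tune it endlessly.
KEYWORDS='panic|error|fail|denied'

filter_log() {
    # Print only the lines matching the alert keywords, case-insensitively.
    grep -Ei "$KEYWORDS" "$1"
}

# In a real shop the output would be mailed, e.g.:
#   filter_log /var/log/messages | mailx -s "log alerts" admins@example.com
```

This works at 50 hosts; at 900+ hosts the per-host keyword lists and mail volume are exactly what makes a centralized indexing tool attractive.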

I’ve worked in shops of varying sizes — from an ISP/telecom giant running 4,000+ Sun servers to a tiny 50-server sweatshop. Most of the shops I’ve been in fall somewhere in between (200–1000 hosts). That’s a lot of hosts to manage and a lot of logging that needs to be parsed.

What does a log-analysis tool buy an IT shop?

You’ve all probably thought about this — a centralized, easy-to-use log analysis tool buys an IT shop valuable time!

So what does Splunk claim to do?

In their own words —

“The Splunk Server indexes IT data from ANY source. No need to configure it for specific formats, write regular expressions or change your logging output. Search mountains of data by time, keywords, type of event, source, host or relationships to other events. “

Some key features of Splunk:

  • Universal Indexing
  • Can index terabytes of data all from one place
  • Capable of indexing approx. 22,000 events/second at a density of 150 bytes/event.
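A quick back-of-the-envelope check on that claimed indexing rate (the MB/GB conversions below are decimal):

```shell
#!/bin/sh
# Sanity-check the claimed rate: 22,000 events/sec at 150 bytes/event.
BPS=$((22000 * 150))        # bytes per second
PER_DAY=$((BPS * 86400))    # bytes per day
echo "$BPS bytes/sec"       # 3300000, i.e. about 3.3 MB/sec
echo "$PER_DAY bytes/day"   # 285120000000, i.e. about 285 GB/day
```

In other words, roughly 285 GB of raw log data per day — plenty of headroom for a few hundred busy hosts.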

How does Splunk acquire data?

Access data from any live source:

  • Mounted files: NFS/SMB, CIFS/AFP, NAS/SAN, FIFO
  • Remote files: rsync, scp/ftp/rcp
  • Network ports: UDP & TCP, syslog/syslog-ng, log4j/log4php, JMX/JMS, SNMP
  • Databases: SQL/ODBC
  • Splunk Servers: Access data locally on production hosts and forward it to another Splunk Server over SSL/TCP
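As an illustration of the network-port route: classic syslogd can forward events to a remote listener with a single syslog.conf line. The host name below is hypothetical, and 514/udp is simply the conventional syslog port.

```
# /etc/syslog.conf — forward everything at info or above to a remote
# collector over UDP 514 ("splunkhost" is an illustrative name)
*.info          @splunkhost
```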

The actual evaluation results will be the next article.

Posted at 6:45 pm
Mar 07, 2008
 

/!\ Remember to delete the NFS services from SMF using the svccfg command

With Solaris 10 and VCS 5.x, nfsd HAS TO run under VCS control. In order to achieve that, the following needs to happen (on every node that will host the NFS share) —

Disable/Delete the NFS services from SMF

# svccfg delete -f svc:/network/nfs/server:default
# svccfg delete -f svc:/network/nfs/status:default
# svccfg delete -f svc:/network/nfs/nlockmgr:default

Manually restart lockd, statd and automountd

# /usr/lib/nfs/lockd
# /usr/lib/nfs/statd
# /usr/lib/fs/autofs/automount
# /usr/lib/autofs/automountd

NOTE: In this example (see below), the NFSgrp is configured only for one node. To add another node, add the node name and number to SystemList and AutoStartList

       group NFSgrp (
               SystemList = { hostA = 0 }
               AutoStartList = { hostA }
       )

       DiskGroup nfsDG (
               Critical = 0
               DiskGroup = testdg
       )

       Volume nfsVOL (
               Critical = 0
               Volume = testnfshome
               DiskGroup = testdg
       )

       IP IPres (
               Device = bge0
               Address = "10.10.10.22"
               NetMask = "255.255.255.0"
       )

       Mount Mountres (
               MountPoint = "/nfs/testnfs"
               BlockDevice = "/dev/vx/dsk/testdg/testnfshome"
               FSType = vxfs
               MountOpt = rw
               FsckOpt = "-y"
       )

       NFS NFSres (
               Nservers = 16
       )

       NFSLock NFSLockres (
               PathName = "/nfs/testnfs"
       )

       NIC NICres (
               Device = bge0
       )

       Share Shareres (
               PathName = "/nfs/testnfs"
               Options = "-o rw -d \"test home dirs\""
       )

       // IPres requires Shareres
       IPres requires NICres
       nfsVOL requires nfsDG
       Mountres requires nfsVOL
       NFSLockres requires Mountres
       Shareres requires NFSLockres
       Shareres requires NFSres

       // resource dependency tree
       //
       // group NFSgrp
       // {
       //     IP IPres
       //         {
       //         NIC NICres
       //         Share Shareres
       //             {
       //             NFSLock NFSLockres
       //                 {
       //                 Mount Mountres
       //                     {
       //                     Volume nfsVOL
       //                         {
       //                         DG nfsDG
       //                         }
       //                     }
       //                 }
       //             NFS NFSres
       //             }
       //         }
       // }
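Per the NOTE above, a two-node variant of the group stanza might look like this (hostB is a hypothetical second node; the number in SystemList is the node's priority):

```
       group NFSgrp (
               SystemList = { hostA = 0, hostB = 1 }
               AutoStartList = { hostA, hostB }
       )
```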

Posted at 7:17 pm
Jan 10, 2008
 

Solaris 8 BrandZ prerequisites

Requires kernel patch 127111-05 (or the latest revision) for SPARC. Find all its dependencies and fulfill them (i.e. the required patches).

# ls
SUNWs8brandr  SUNWs8brandu  SUNWs8p2v
# pwd
/mypool/software/sol8p2v/s8ma-1_0-rr/Product

# pkgadd -d .

The following packages are available:
  1  SUNWs8brandr     Solaris 8 Migration Assistant: solaris8 brand support (Root)
                      (sparc) 11.10.0,REV=2007.10.08.16.51
  2  SUNWs8brandu     Solaris 8 Migration Assistant: solaris8 brand support (Usr)
                      (sparc) 11.10.0,REV=2007.10.08.16.51
  3  SUNWs8p2v        Solaris 8 p2v Tool
                      (sparc) 11.10.0,REV=2007.10.08.16.51

Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]:

The SUNWs8brandr and SUNWs8brandu packages need to be added to the Solaris 10 Host OS (Global Zone).

Zone configuration

Then configure the Zone —

# zonecfg -z s8-zone
s8-zone: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:s8-zone> create -t SUNWsolaris8
zonecfg:s8-zone> set zonepath=/mypool/zones/s8-zone
zonecfg:s8-zone>
zonecfg:s8-zone> set autoboot=true
zonecfg:s8-zone> add net
zonecfg:s8-zone:net> set address=192.168.99.100
zonecfg:s8-zone:net> set physical=bge1
zonecfg:s8-zone:net> end
zonecfg:s8-zone> add fs
zonecfg:s8-zone:fs> set type=zfs
zonecfg:s8-zone:fs> set dir=/mypool/vol1
zonecfg:s8-zone:fs> end
special not specified
zonecfg:s8-zone:fs> set special=share/zone/s8-zone
zonecfg:s8-zone:fs> end
zonecfg:s8-zone>

zonecfg:sol8zone> add attr
zonecfg:sol8zone:attr> set name=hostid
zonecfg:sol8zone:attr> set type=string
zonecfg:sol8zone:attr> set value=8325f14d
zonecfg:sol8zone:attr> end
zonecfg:sol8zone> verify
zonecfg:sol8zone> commit
zonecfg:sol8zone> exit
dwailsun:$() # zonecfg -z sol8zone info
zonename: sol8zone
zonepath: /mypool/zones/sol8zone
brand: solaris8
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
fs:
        dir: /mypool/vol1
        special: share/zone/sol8zone
        raw not specified
        type: zfs
        options: []
net:
        address: 192.168.99.100
        physical: bge1
attr:
        name: hostid
        type: string
        value: 8325f14d
dwailsun:$() # zonecfg -z sol8zone info attr
attr:
        name: hostid
        type: string
        value: 8325f14d
dwailsun:$() #

Install the zone

dwailsun:$() # zonecfg -z sol8zone export > /var/tmp/safe/sol8zone.config
dwailsun:$(safe) # zoneadm -z s8-zone install -u -a /mypool/software/sol8p2v/solaris8-image.flar
could not verify fs /mypool/vol1: could not access zfs dataset 'share/zone/s8-zone'
zoneadm: zone s8-zone failed to verify

dwailsun:$(safe) # zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
mypool               3.75G  15.4G  39.3K  /mypool
mypool/software      3.22G  6.78G  3.22G  /mypool/software
mypool/vol1          66.6K  5.00G  34.0K  /mypool/vol1
mypool/vol1/s8-zone  32.6K  5.00G  32.6K  /mypool/vol1/s8-zone
mypool/www            544M  3.47G   544M  /mypool/www
mypool/zones         34.0K  5.00G  34.0K  /mypool/zones
dwailsun:$(safe) # zfs set mountpoint=legacy mypool/vol1/s8-zone
dwailsun:$(safe) # zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
mypool               3.75G  15.4G  39.3K  /mypool
mypool/software      3.22G  6.78G  3.22G  /mypool/software
mypool/vol1          65.3K  5.00G  32.6K  /mypool/vol1
mypool/vol1/s8-zone  32.6K  5.00G  32.6K  legacy
mypool/www            544M  3.47G   544M  /mypool/www
mypool/zones         34.0K  5.00G  34.0K  /mypool/zones
dwailsun:$(safe) # zoneadm -z s8-zone install -u -a /mypool/software/sol8p2v/solaris8-image.flar
      Log File: /var/tmp/s8-zone.install.987.log
        Source: /mypool/software/sol8p2v/solaris8-image.flar
    Installing: This may take several minutes...
Postprocessing: This may take several minutes...

        Result: Installation completed successfully.
      Log File: /mypool/zones/sol8zone/root/var/log/s8-zone.install.987.log

Solaris 8 P2V

Run s8_p2v —

dwailsun:$(safe) # /usr/lib/brand/solaris8/s8_p2v s8-zone
[Fri Dec 28 12:36:01 PST 2007]         S20_apply_patches:  Unpacking patch:  109147-44
[Fri Dec 28 12:36:01 PST 2007]         S20_apply_patches: Installing patch:  109147-44

Checking installed patches...
Patch 109147-44 has already been applied.
See patchadd(1M) for instructions.

Patchadd is terminating.
[Fri Dec 28 12:36:09 PST 2007]         S20_apply_patches:  Unpacking patch:  111023-03
[Fri Dec 28 12:36:09 PST 2007]         S20_apply_patches: Installing patch:  111023-03

Checking installed patches...
Patch 111023-03 has already been applied.
See patchadd(1M) for instructions.

Patchadd is terminating.
[Fri Dec 28 12:36:11 PST 2007]         S20_apply_patches:  Unpacking patch:  111431-01
[Fri Dec 28 12:36:11 PST 2007]         S20_apply_patches: Installing patch:  111431-01

Checking installed patches...
This patch is obsoleted by patch 108993-67 which has already
been applied to this system.

Patchadd is terminating.
[Fri Dec 28 12:36:13 PST 2007]         S20_apply_patches:  Unpacking patch:  112605-04
[Fri Dec 28 12:36:13 PST 2007]         S20_apply_patches: Installing patch:  112605-04

Checking installed patches...
This patch is obsoleted by patch 108993-67 which has already
been applied to this system.

Patchadd is terminating.
[Fri Dec 28 12:36:15 PST 2007]         S20_apply_patches:  Unpacking patch:  112050-04
[Fri Dec 28 12:36:15 PST 2007]         S20_apply_patches: Installing patch:  112050-04

Checking installed patches...
Patch 112050-04 has already been applied.
See patchadd(1M) for instructions.

Patchadd is terminating.
[Fri Dec 28 12:36:17 PST 2007]         S20_apply_patches:  Unpacking patch:  109221-01
[Fri Dec 28 12:36:17 PST 2007]         S20_apply_patches: Installing patch:  109221-01

Checking installed patches...
This patch is obsoleted by patch 109318-39 which has already
been applied to this system.

Patchadd is terminating.
dwailsun:$(safe) #

dwailsun:$(safe) # zoneadm -z s8-zone boot
dwailsun:$(safe) # zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   3 s8-zone          running    /mypool/zones/sol8zone         solaris8 shared
dwailsun:$(safe) # zlogin -C s8-zone
[Connected to zone 's8-zone' console]


You did not enter a selection.
What type of terminal are you using?
 1) ANSI Standard CRT
 2) DEC VT52
 3) DEC VT100
 4) Heathkit 19
 5) Lear Siegler ADM31
 6) PC Console
 7) Sun Command Tool
 8) Sun Workstation
 9) Televideo 910
 10) Televideo 925
 11) Wyse Model 50
 12) X Terminal Emulator (xterms)
 13) Other
Type the number of your choice and press Return: 12
Configuring network interface addresses: bge1.
RPC: Timed out

Then it goes through and does the sysidcfg bit…

System identification is completed.

rebooting system due to change(s) in /etc/default/init

Dec 28 12:41:25 rpcbind: rpcbind terminating on signal.
System identification is completed.


[NOTICE: Zone rebooting]

SunOS Release 5.8 Version Generic_Virtual 64-bit
Copyright 1983-2000 Sun Microsystems, Inc.  All rights reserved

Hostname: sol8virt
The system is coming up.  Please wait.
starting rpc services: rpcbind done.
syslog service starting.
Print services started.
Dec 28 14:41:37 sol8virt sendmail[4102]: My unqualified host name (sol8virt) unknown; sleeping for retry
The system is ready.

sol8virt console login:

# uname -a
SunOS sol8virt 5.8 Generic_Virtual sun4u sparc SUNW,A70
# exit

[Connection to zone 's8-zone' pts/5 closed]
dwailsun:$(safe) # uname -a
SunOS dwailsun 5.10 Generic_127111-05 sun4u sparc SUNW,A70
dwailsun:$(safe) # zlogin s8-zone
[Connected to zone 's8-zone' pts/5]
Last login: Fri Dec 28 14:43:35 on pts/5
Sun Microsystems Inc.   SunOS 5.8       Generic Patch   February 2004
# uname -a
SunOS sol8virt 5.8 Generic_Virtual sun4u sparc SUNW,A70
#
# cat /etc/release
                       Solaris 8 2/04 s28s_hw4wos_05a SPARC
           Copyright 2004 Sun Microsystems, Inc.  All Rights Reserved.
                            Assembled 08 January 2004
#

/!\ Think of an optimal battery of tests that can help us determine whether this virtualized Solaris 8 is a viable platform for servers that cannot be migrated…

  • Adding packages — pkgadd works

# uname -a
SunOS sol8virt 5.8 Generic_Virtual sun4u sparc SUNW,A70
# pkginfo|grep -i smc
application SMCgcc         gcc
application SMCliconv      libiconv
application SMClintl       libintl
application SMCosh471      openssh
application SMCossl        openssl
application SMCzlib        zlib

(!) Set up sshd after adding these packages, complete with start-up scripts and an sshd privsep user ID in the system account files (passwd and shadow).

# /etc/init.d/sshd start
Could not load host key: /usr/local/etc/ssh_host_key
Could not load host key: /usr/local/etc/ssh_host_dsa_key
Disabling protocol version 1. Could not load host key
# ps -ef|grep sshd
    root  5086  4609  0 15:18:13 ?        0:00 /usr/local/sbin/sshd
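The "Could not load host key" warnings above come from missing host key files. A hypothetical sketch of generating them with ssh-keygen (KEYDIR defaults to the /usr/local/etc prefix shown in the warnings):

```shell
#!/bin/sh
# Hypothetical sketch: generate the host keys sshd complained about.
# KEYDIR defaults to the /usr/local/etc prefix from the warnings above.
KEYDIR=${KEYDIR:-/usr/local/etc}

gen_host_keys() {
    # RSA only here; that era's sshd also wanted SSHv1 and DSA keys,
    # which current ssh-keygen no longer generates the same way.
    ssh-keygen -q -t rsa -N "" -f "$KEYDIR/ssh_host_rsa_key"
}
```

With the keys in place, sshd starts without the warnings (protocol 1 stays disabled unless its key is also provided).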

Installing Oracle 8i

Setting up Oracle 8i was a breeze. We simply dumped the two CDs of the Oracle 8i 64-bit installation media onto a filesystem visible to the solaris8 zone and ran runInstaller with all defaults; the demo database (scott/tiger) was created as the final step.

/!\ Make sure to copy the media to local disk when installing inside the zone. The reason: even though the cdrom can be exported from the Global zone to the local zone this way —

add fs
set dir=/mnt
set special=/cdrom
set type=lofs
add options ro
add options nodevices
end

we would have issues ejecting and inserting new cdroms, etc.
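Staging the media onto local disk can be sketched like this. The paths are hypothetical; the point is simply that the zone reads from a plain directory rather than the global zone's cdrom.

```shell
#!/bin/sh
# Hypothetical sketch: copy installation media onto local disk so the
# zone never depends on the global zone's cdrom. Paths are illustrative.
stage_media() {
    src=$1   # e.g. /cdrom/cdrom0
    dst=$2   # e.g. /mypool/software/oracle8i/disk1
    mkdir -p "$dst"
    cp -r "$src/." "$dst"
}

# e.g. stage_media /cdrom/cdrom0 /mypool/software/oracle8i/disk1
```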

dwailsun:$() # ssh oracle@sol8virt
oracle@sol8virt's password:
Last login: Thu Jan  3 11:27:25 2008 from 10.119.10.4
Sun Microsystems Inc.   SunOS 5.8       Generic Patch   February 2004
Sun Microsystems Inc.   SunOS 5.8       Generic Patch   February 2004
$ ps -ef|grep ora
  oracle 22608 22152  0 11:25:48 ?        0:00 ora_reco_brandz
  oracle 22610 22152  0 11:25:48 ?        0:00 ora_snp0_brandz
  oracle 22626 22152  0 11:26:55 ?        0:00 /export/shared/oracle/OraHome1/bin/tnslsnr LISTENER -inherit
  oracle 22614 22152  0 11:25:48 ?        0:00 ora_snp2_brandz
  oracle 22687 22685  0 11:56:04 ?        0:00 /usr/local/sbin/sshd -R
  oracle 22695 22689  0 11:56:09 pts/6    0:00 grep ora
  oracle 22689 22687  0 11:56:04 pts/6    0:00 -ksh
  oracle 22604 22152  4 11:25:48 ?        1:04 ora_ckpt_brandz
  oracle 22600 22152  0 11:25:48 ?        0:00 ora_dbw0_brandz
  oracle 22598 22152  0 11:25:48 ?        0:00 ora_pmon_brandz
  oracle 22620 22152  0 11:25:48 ?        0:00 ora_d000_brandz
  oracle 22602 22152  0 11:25:48 ?        0:02 ora_lgwr_brandz
  oracle 22618 22152  0 11:25:48 ?        0:00 ora_s000_brandz
  oracle 22616 22152  0 11:25:48 ?        0:00 ora_snp3_brandz
  oracle 22612 22152  0 11:25:48 ?        0:00 ora_snp1_brandz
  oracle 22606 22152  0 11:25:48 ?        0:00 ora_smon_brandz
$

 Posted by at 10:07 pm
Aug 222007
 

I am a regular reader of the ZDNet blog by Paul Murphy and thought I’d add to his thoughts on virtualization and all the brouhaha that’s going on these days —

Virtualization? uh huh… by ZDNet’s Paul Murphy — Virtualization is popular because it was popular — and not because there’s a practical reason to do it.

The most interesting thing I discovered in the process of working on a “high-visibility” project (ERP solution) is that most mgt-types don’t understand what Virtualization has to offer. Someone high up (high-up enough I guess) decides that Virtualization is the answer to all evils that haunt a modern datacenter. The claims are that —

  1. Virtualization reduces server sprawl
  2. Virtualization reduces power and cooling footprints
  3. It empowers the IT support organization to be agile (read build more boxes fast) and really support a dynamic business (with lots of development type activities going on)
  4. It is a cure for many problems..blah blah

But when you look at what you’re saving on the standard UNIX platforms (except Sun), the costs amount to something exorbitant. I won’t name the vendor, but it charges for everything from its multi-pathing software to its resource management software to virtualization, and it charges by the core.

Soon you start thinking, does this really buy me the cost savings by reducing server-sprawl?
Then the vendor will say, “Why look at this as a consolidation platform? Why don’t you think about the flexibility you’ll get by using this model? Moving workloads around on the fly, etc?”

The problem with that is that workload management (SLOs, I believe it’s called) calls for very detailed, in-depth recording of metrics (what kinds of loads are generated by applications, categorized by application type, and so on).

So you first identify the right kinds of metrics to track. Then you collect the data for a reasonable period of time (say 3-4 months). Then, only after munging all that data, is it possible to say with any authority that a certain amount of resources is required for a particular workload (and build a system that can manage those resource requirements on the fly).

This entire process might take about a year (from start to finish) before virtualization becomes a viable option. Some shops I’ve been in are better equipped to do this kind of measurement than others, depending on how “modern” the IT organization is — does it REALLY employ standards such as ITIL or not, etc.

I’d say that something like Sun’s container model on the Cool-threads servers would be more appropriate for all the above criteria. Consolidation, Resource management, flexibility, etc.

  • SRM has been free with Solaris since Solaris 9.
  • Solaris 10 has the virtualization pieces completely free.
  • The hardware is cheap(er than the competition’s for sure)

 Posted by at 7:55 pm
Aug 222007
 

Install the VCS Packages after patching the server to appropriate/recommended Patch list.

VCS LICENSE KEY : !@$-@$%-(*&^-$%@-$%%-!

List of VCS Packages:

VRTSappqw VRTSvcs VRTSvcsqw
VRTScscm VRTSvcsag VRTSvcsw
VRTSgab VRTSvcsdc VRTSvlic
VRTSllt VRTSvcsmg VRTSweb
VRTSoraqw VRTSvcsmn VRTSperl VRTSvcsor

edit /etc/llthosts (on both servers – for a 2 node cluster)

0 hostd02
1 hostd03

edit /etc/llttab

set-node hostd03 #here the nodename will change with each host
set-cluster 54 #Set the appropriate cluster ID
link qfe1 /dev/qfe:1 - ether - - #heartbeat 1
link qfe5 /dev/qfe:5 - ether - - #heartbeat 2
link-lowpri qfe0 /dev/qfe:0 - ether - - #Low-pri heartbeat

Edit the /etc/gabtab file with

cat > /etc/gabtab <<EOGAB
gabconfig -c -n 2
EOGAB

#Here the number after "-n" varies with the number of nodes in the cluster

Edit the main.cf (/etc/VRTSvcs/conf/config) to match your requirements

##Only on the first/main server of the Cluster

##Start of main.cf##

include "types.cf"
include "OracleTypes.cf"

cluster OneBill_Prod (
UserNames = { admin = "cDRpdxPmHpzS." }
Administrators = { admin }
CounterInterval = 5
)

system hostd02 (
)

system hostd03 (
)

group network_grp (
SystemList = { hostd02 = 0, hostd03 = 1 }
PrintTree = 0
Parallel = 1
AutoStartList = { hostd02, hostd03 }
)

NIC OneBillv1_nic (
Device = qfe0
NetworkType = ether
)

Phantom OneBillv1_phantom (
)

group oracle_grp (
SystemList = { hostd02 = 0, hostd03 = 1 }
PrintTree = 0
AutoStartList = { hostd02 }
)

DiskGroup orashrdg_dg (
DiskGroup = orashrdg
)

IP OneBillv1_vip (
Device = qfe0
Address = "112.64.90.54"
NetMask = "255.255.255.0"
IfconfigTwice = 1
)

Mount au1_mnt (
MountPoint = "/au1"
BlockDevice = "/dev/vx/dsk/orashrdg/au1"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Mount bu1_mnt (
MountPoint = "/bu1"
BlockDevice = "/dev/vx/dsk/orashrdg/bu1"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Mount u01_mnt (
MountPoint = "/u01"
BlockDevice = "/dev/vx/dsk/orashrdg/u01"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Mount u02_mnt (
MountPoint = "/u02"
BlockDevice = "/dev/vx/dsk/orashrdg/u02"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Mount u03_mnt (
MountPoint = "/u03"
BlockDevice = "/dev/vx/dsk/orashrdg/u03"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Mount u04_mnt (
MountPoint = "/u04"
BlockDevice = "/dev/vx/dsk/orashrdg/u04"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Mount u05_mnt (
MountPoint = "/u05"
BlockDevice = "/dev/vx/dsk/orashrdg/u05"
FSType = vxfs
MountOpt = rw
FsckOpt = "-y"
)

Proxy OneBillv1_proxy (
TargetResName = OneBillv1_nic
)

Volume au1_vol (
Volume = au1
DiskGroup = orashrdg
)

Volume bu1_vol (
Volume = bu1
DiskGroup = orashrdg
)

Volume u01_vol (
Volume = u01
DiskGroup = orashrdg
)

Volume u02_vol (
Volume = u02
DiskGroup = orashrdg
)

Volume u03_vol (
Volume = u03
DiskGroup = orashrdg
)

Volume u04_vol (
Volume = u04
DiskGroup = orashrdg
)

Volume u05_vol (
Volume = u05
DiskGroup = orashrdg
)

OneBillv1_vip requires OneBillv1_proxy
au1_mnt requires au1_vol
au1_vol requires orashrdg_dg
bu1_mnt requires bu1_vol
bu1_vol requires orashrdg_dg
u01_mnt requires u01_vol
u01_vol requires orashrdg_dg
u02_mnt requires u02_vol
u02_vol requires orashrdg_dg
u03_mnt requires u03_vol
u03_vol requires orashrdg_dg
u04_mnt requires u04_vol
u04_vol requires orashrdg_dg
u05_mnt requires u05_vol
u05_vol requires orashrdg_dg

##End of main.cf##

Copy OracleTypes.cf, etc to the config directory

From /etc/VRTSvcs/conf/config run

/opt/VRTSvcs/bin/hacf -verify .

###(Fix errors as you get them)

Setting up GAB and LLT

/sbin/gabconfig -U
/sbin/lltconfig -U
/sbin/lltconfig -c
/sbin/gabconfig -c -n 2
/sbin/lltconfig -a list

##Make sure Filesystems (Shared Filesystems) are commented out of the /etc/vfstab file

#Make sure each node in the cluster has the host/IP information of every other node in its local hosts file#

Reboot the servers, bringing the main server/node up first

On each node of the cluster

  • /sbin/vxlicinst -k <KEY>
  • /opt/VRTSvcs/bin/hastop -local -force
  • /opt/VRTSvcs/bin/hastart

Create Mount points on all nodes for Shared Filesystems

for i in au1 bu1 u01 u02 u03 u04 u05
do
if [ ! -d /$i ]; then
mkdir /$i
fi
done

Test failovers by bringing down resources and checking the failover
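For example, using the standard VCS ha* commands (group and node names taken from the main.cf above), a failover can be exercised with something like this sketch:

```shell
# Switch the Oracle service group to the other node and watch it come up:
/opt/VRTSvcs/bin/hagrp -switch oracle_grp -to hostd03
/opt/VRTSvcs/bin/hastatus -sum

# Switch it back once satisfied:
/opt/VRTSvcs/bin/hagrp -switch oracle_grp -to hostd02
```

Pulling a heartbeat cable or halting a node gives a harsher (but more realistic) test than a graceful hagrp switch.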

 Posted by at 5:54 pm
Jul 182007
 

Displays existing DG resources in the Cluster

scstat -D

Registering VxVM DGs

scconf -a -D type=vxvm,name=<dgname>, \
nodelist=<node1>:<node2>, \
preferenced=true,failback=enabled

  • nodelist should contain only nodes that are physically connected to the disks of that dg.
  • preferenced=true/false affects whether nodelist indicates an order of failover preference. On a two-node cluster, this option is only meaningful if failback is enabled.
  • failback=disabled/enabled affects whether a preferred node “takes back” its device group when it joins the cluster. The default value is disabled. When failback is disabled, preferenced is set to false. If failback is enabled, preferenced also must be set to true.

Moving DGs across nodes of a cluster

When VxVM dgs are registered as Sun Cluster resources, NEVER USE vxdg import/deport commands to change ownership (node-wise) of the dgs. Doing so will cause SC to treat the dg as a failed resource.

Use the following command instead:

# scswitch -z -D <dgname> -h <node_to_switch_to>

Resyncing Device Groups

scconf -c -D name=<dgname>,sync

Changing DG configuration

scconf -c -D name=<dgname>,preferenced=<true|false>,failback=<enabled|disabled>

Maintenance mode

scswitch -m -D <dgname>

NOTE: all volumes in the dg must be unopened or unmounted (not being used) in order to do that.

To come back out of maintenance mode

scswitch -z -D <dgname> -h <new_primary_node>

Repairing DID device database after replacing JBOD disks

  • Make sure you know which disk to update …

scdidadm -l c1t1d0

returns node1:/dev/rdsk/c1t1d0 /dev/did/rdsk/d7

scdidadm -l d7

returns node1:/dev/rdsk/c1t1d0 /dev/did/rdsk/d7

Then use following cmds to update and verify the DID info:

scdidadm -R d7
scdidadm -l -o diskid d7

returns a large string with disk id.

Replacing a failed disk in a A5200 Array (similar concept with other FC disk arrays)

vxdisk list             # get the failed disk name

vxprint -g dgname       # determine state of the volume(s) that might be affected

On the hosting node, replace the failed disk:

luxadm remove enclosure,position
luxadm insert enclosure,position

On either node of the cluster (that hosts the dg):

scdidadm -l c#t#d#
scdidadm -R d#

On the hosting node:

vxdctl enable

vxdiskadm               # replace failed disk in vxvm

vxprint -g <dgname>
vxtask list             # ensure that resyncing is completed

Remove any relocated submirrors/plexes (if hot-relocation had to move something out of the way):

vxunreloc repaired-diskname

Solaris Vol Mgr (SDS) in Sun Clustered Env

The preferred method of using soft partitions is to build mirrors from single whole-disk slices and then create volumes (soft partitions) on top of the mirror (somewhat similar to the VxVM public region on an initialized disk).
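A hedged sketch of that layering with the SVM metainit/metattach commands (the diskset name and DID device names here are illustrative):

```shell
# Two whole-slice concats, mirrored, inside a shared diskset:
metainit -s myset d11 1 1 /dev/did/rdsk/d9s0
metainit -s myset d12 1 1 /dev/did/rdsk/d17s0
metainit -s myset d10 -m d11
metattach -s myset d10 d12

# Soft partitions carved out of the mirror become the actual volumes:
metainit -s myset d100 -p d10 2g
metainit -s myset d101 -p d10 4g
```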

Shared Disksets and Local Disksets

Only disks that are physically located in the multi-ported storage will be members of shared disksets. Disks in the same diskset operate as a unit; they can be used together to build mirrored volumes, and primary ownership of the diskset transfers as a whole from node to node.

Boot disks belong to the local disksets. Having local disksets is a prerequisite for having shared disksets.

Replica management

  • Add local replicas manually.
  • Put local state db replicas on slice 7 of disks (as a convention) in order to maintain uniformity. Shared disksets have to have replicas on slice 7.
  • Spread local replicas evenly across disks and controllers.
  • Support for Shared disksets is provided by Pkg SUNWmdm
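Following the conventions above, adding the local replicas might look like this sketch (controller/target names are illustrative):

```shell
# Three replica copies per disk, on slice 7 by convention,
# spread across two controllers:
metadb -a -c 3 c0t0d0s7
metadb -a -c 3 c1t0d0s7

metadb -i    # verify replica status and flags
```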

Modifying /kernel/drv/md.conf

nmd == max num of volumes (default 128)
md_nsets == max is 32, default 4.

Creating shared disksets and mediators

scdidadm -l c1t3d0

  • returns d17 as the DID device

scdidadm -l d17
metaset -s <disksetname> -a -h <node1> <node2>  # creates metaset
metaset -s <disksetname> -a -m <node1> <node2>  # creates mediator
metaset -s <disksetname> -a /dev/did/rdsk/d9 /dev/did/rdsk/d17
metaset                 # returns values
metadb -s <disksetname>
medstat -s <disksetname>  # reports mediator status

Remaining syntax vis-a-vis Sun Cluster is identical to that for VxVM.

IPMP and sun cluster

IPMP is cluster-unaware. To work around that, Sun Cluster uses a cluster-specific public network management daemon (pnmd) to integrate IPMP into the cluster.

The pnmd daemon has two capabilities:

  • populate CCR with public network adapter status
  • facilitate application failover

When pnmd detects that all members of a local IPMP group have failed, it consults a file called /var/cluster/run/pnm_callbacks. This file contains entries that would have been created by the activation of LogicalHostname and SharedAddress resources. It is the job of hafoip_ipmp_callback to decide whether to migrate resources to another node.

scstat -i       # view IPMP configuration

 Posted by at 8:50 pm
Mar 092007
 

Leveraging Centralized SSH2 based trusts to monitor network interface status on solaris servers

Since SSH2 key-based trusts have been established in this landscape (at root level), automating a variety of tasks becomes easily achievable. The SSH2 key-based trust provides a secure, encrypted transport mechanism (which reinforces a security-oriented approach to system administration). By leveraging tools such as sudo(1m) or PowerBroker, an additional layer of security and auditability can be added.

Using TLRC and ndd_get.sh to collect Network-related information

The following two scripts can be used to make network interface related metrics collections.

tlrc.pl (Test Login Run Command) is a perl script that reads input from a colon-separated text file (of very specific format) or from the command-line and can execute any command on the remote host(s) specified with STDOUT/STDERR logging, etc.

tlrc.pl (test login run command) --

#!/usr/bin/env perl

use Getopt::Std;
use Net::Ping;

my %Args;

getopts( 'l:i:c:o:n:adT:th', \%Args );

if ( $Args{h} ) {
&printUsage && exit 0;
}

my $hlist = $Args{i} || "/path/to/inventory.txt";
my $ssh = "/usr/bin/ssh";

my $rsh = "/usr/bin/rsh";
my $p = Net::Ping->new();
my $lid = $Args{l} || "nobody";
my $outfile = $Args{o} || "tlrc.out";
my @shlcmd = $Args{c};
my $conprot = $Args{T} || "ssh";

if ( $conprot =~ m/^(rsh|remsh|rlogin)$/ ) {
$conprot = "rsh";
}
elsif ( $conprot ne "ssh" ) {
&printUsage && exit 1;
}

open( RHL, "< $hlist" ) or die "Unable to open input file $hlist: $! \n";
@rhl = <RHL>;
close(RHL);
open( WOF, "|tee $outfile" )
or die "Unable to open output file $outfile for writes: $! \n";
open( WHL, ">> hlist.tlrc" );

if ( $Args{c} ) {
die "Can't execute $Args{c} with the \"-t\" switch \n"
if ( ( $Args{t} or $Args{d} ) );
runCmd(@shlcmd);
}

if ( $Args{d} ) {
die "Can't munge dmesg and run login tests at the same time! \n"
if $Args{t};
&dmesgMunger;
}

if ( $Args{t} ) {
&loginTest;
}

sub printUsage {
print
"Usage: $0 [ -l <login> ][ -i <inputfile> ][ -c <cmdstring> ][ -n <hostlist> ]|[ -a ]|[ -t ]|[ -h ] \n";
print
"\t -l -- pass the login name you want to use for this session \n
\t -i -- pass the input file (colon-delimited) with list of hosts and pingability status \n
\t -c -- quoted command you want to run remotely \n
\t -n -- comma-delimited list of hosts to run the remote command specified with \"cmdstring\" on \n
\t -a -- run the remote command specified with \"cmdstring\" on all hosts in the input file \n
\t -T -- specifies the connection type -- ssh or rsh \n
\t -t -- optional switch; runs only the login-testing portion of the script \n
\t -h -- print this message \n";
}

sub runCmd {

my @cmdstring = @_;

if ( $Args{a} ) {
foreach $line (@rhl) {
next if ( $line =~ m/^#/ );
next if ( $line =~ m/^$/ );
my ( $name, $domain, $ip, $pstate, $canlogin, $contype, $serial,
$hid, $usage )
= split( ':', $line );
chomp( $name, $domain, $ip, $pstate, $canlogin, $contype, $serial,
$hid, $usage );
if ( $pstate == 0 ) {
if ( $canlogin == 0 ) {
if ( $contype == 0 ) {
ssh_cmd( $lid, $name, @cmdstring );
}
elsif ( $contype == 1 ) {
rsh_cmd( $lid, $name, @cmdstring );
}
else {
print "Cannot understand connection type! \n";
}
}
else {
print "Cannot log into the server! \n";
}
}
else {
print "$name is unpingable -- can't reach! \n";
}
}
}
elsif ( $Args{n} ) {
$hlist = $Args{n};
@hostlist = split( ',', $hlist );
foreach $name (@hostlist) {
if ( $conprot eq "ssh" ) {
ssh_cmd( $lid, $name, @cmdstring );
}
elsif ( $conprot eq "rsh" ) {
rsh_cmd( $lid, $name, @cmdstring );
}
else {
die "Unknown option with \"-T\" switch! \n";
}
}
}
}

sub ssh_cmd {

my ( $id, $host, @cmd ) = @_;
print "$ssh $id\@$host '@cmd' \n";
@sshout = qx/$ssh $id\@$host '@cmd'/;

#or die "Can't run cmd : $! \n";
print WOF "$host \n";
print WOF "@sshout \n";
}

sub rsh_cmd {

my ( $id, $host, @cmd ) = @_;
print "$rsh -l $id $host '@cmd' \n";

@rshout = qx/$rsh -l $id $host '@cmd' /;

#or die "can't run $rsh -l $id $host '@cmd' : $! \n";

print WOF "$host \n";
print WOF "@rshout \n";
}

sub dmesgMunger {

&getToday;
&runCmd(
"cat /var/adm/messages|grep \"$today\"|egrep -v \"vas|auth|lw8|mail.info|Waiting|Networker savegroup|local1|checked|wrap|Normal\"|egrep -i \"scsi|disk|err|fatal|pers|mem|link|fcp|AFT|ASFR|PSYND|ESYND|full|vx_nospace|vxfs|vxvm\""
);
}

sub getToday {
my ( $sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst ) =
localtime(time);
chomp( $sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst );
$year += 1900;
$mon += 1;

my %months = (
1 => 'Jan',
2 => 'Feb',
3 => 'Mar',
4 => 'Apr',
5 => 'May',
6 => 'Jun',
7 => 'Jul',
8 => 'Aug',
9 => 'Sep',
10 => 'Oct',
11 => 'Nov',
12 => 'Dec',
);

# syslog space-pads single-digit days ("Jan  3"), so pad with a space
if ( $mday < 10 ) { $mday = " $mday"; }
$today = "$months{$mon} $mday";
}

sub loginTest {
foreach $line (@rhl) {
next if ( $line =~ m/^#/ );
next if ( $line =~ m/^$/ );
my ( $name, $domain, $ip, $pstate ) = split( ':', $line );
chomp( $name, $domain, $ip, $pstate );
if ( $pstate == 0 ) {
my $npstate = $p->ping( $name, 1 );
if ($npstate) {
print "Running \"$ssh $lid\@$name\"...\n";
system( "$ssh", "-l", "$lid", "$name", "exit" );
$exitval = $? >> 8;
chomp $exitval;
print WOF
"attempt to log into $name ended with status $exitval \n";
print WHL "$name:$domain:$ip:$pstate:$exitval\n";
}
else {
print WOF "Unable to ping $name \n";
}
}
if ( $pstate == 1 ) {
print
"$hlist says $name is inaccessible.\nBut I will try to ping $name again anyway...\n";
my $npstate = $p->ping( $name, 1 );
if ($npstate) {
print "Running \"$ssh $lid\@$name\"...\n";
system( "$ssh", "-l", "$lid", "$name", "exit" );
$exitval = $? >> 8;
chomp $exitval;
print WOF
"attempt to log into $name ended with status $exitval \n";
print WHL "$name:$domain:$ip:$pstate:$exitval\n";
}
else {
print WOF "Unable to ping $name \n";
}
}
}
close(WOF);
}

inventory.txt (the input file passed to tlrc.pl) --

#HOSTNAME:DOMAIN NAME:IP:PINGABLE(1 == no; 0 == yes):Login(1 == no;0 == yes):Connection(0=ssh:1=telnet/rsh):SERIAL:HOSTID:USAGE(P|NP)


This is a colon-delimited file with fields as listed above. Not all of them are required for running the script, but some can be useful in certain cases (e.g., hostid, serial number).
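As a quick illustration of the field layout (the sample hostnames, IPs, and serials below are made up), the file can be sliced with awk — for example, to list only the hosts that are both pingable and allow login:

```shell
# Build a tiny sample inventory (fields per the header line above).
cat > /tmp/inventory.txt <<'EOF'
#HOSTNAME:DOMAIN NAME:IP:PINGABLE:LOGIN:CONNECTION:SERIAL:HOSTID:USAGE
host1:example.com:10.0.0.1:0:0:0:FX123:80abcd01:P
host2:example.com:10.0.0.2:1:1:0:FX124:80abcd02:NP
host3:example.com:10.0.0.3:0:0:1:FX125:80abcd03:P
EOF

# Hosts that are pingable (field 4 == 0) and allow login (field 5 == 0),
# skipping comment lines.
awk -F: '!/^#/ && $4 == 0 && $5 == 0 { print $1 }' /tmp/inventory.txt
# prints: host1 and host3, one per line
```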

ndd_get.sh is a Korn-shell based script that returns the NIC link-related statistics in a comma-separated output format.

#!/usr/bin/env ksh

NDD=/usr/sbin/ndd
ID=`/usr/xpg4/bin/id -u`
HOSTNAME=`/usr/bin/hostname`

printUsage() {
echo "Usage: $0 [ -a ]|[ -n <adapter> -i <instance> ]|[ -h ] \n";
}

splitter() {
interface=$1
INSTANCE=`echo $interface|awk -F\e '{print $2}'`
BASEDEV=`echo $interface|awk -F\e '{print $1}'`
ADAPTER="$BASEDEV"e
}

macipget() {
IF=$1
IFCONFIG=/usr/sbin/ifconfig
IP=`$IFCONFIG $IF|grep inet|awk '{print $2}'`
MAC=`$IFCONFIG $IF|grep ether|awk '{print $2}'`
}

nddget() {
#set -x
AD=$1
INST=$2
$NDD -set /dev/$AD instance $INST
LSTAT=`$NDD -get /dev/$AD link_status`
LSPEED=`$NDD -get /dev/$AD link_speed`
LMODE=`$NDD -get /dev/$AD link_mode`
IS_100FDX=`$NDD -get /dev/$AD adv_100fdx_cap`
IS_100HDX=`$NDD -get /dev/$AD adv_100hdx_cap`
IS_10FDX=`$NDD -get /dev/$AD adv_10fdx_cap`
IS_10HDX=`$NDD -get /dev/$AD adv_10hdx_cap`
AUTONEG=`$NDD -get /dev/$AD adv_autoneg_cap`
LP_100FDX=`$NDD -get /dev/$AD lp_100fdx_cap`
LP_100HDX=`$NDD -get /dev/$AD lp_100hdx_cap`
LP_10FDX=`$NDD -get /dev/$AD lp_10fdx_cap`
LP_10HDX=`$NDD -get /dev/$AD lp_10hdx_cap`
LP_AUTONEG=`$NDD -get /dev/$AD lp_autoneg_cap`
if [ $LSTAT -eq 0 ]; then
linkstat="down"
else
linkstat="up"
fi
if [ $LSPEED -eq 0 ]; then
linkspeed="10"
else
linkspeed="100"
fi
if [ $LMODE -eq 0 ]; then
linkmode="Half Duplex"
else
linkmode="Full Duplex"
fi
if [ $AUTONEG -eq 0 ]; then
autoneg="Off"
else
autoneg="On"
fi
if [ $LP_AUTONEG -eq 0 ]; then
lp_autoneg="Off"
else
lp_autoneg="On"
fi
IF=$AD$INST
macipget $IF
print "$HOSTNAME,$IF,$IP,$MAC,$linkstat,$linkspeed,$linkmode,$autoneg,$lp_autoneg"
}

kstatget() {
#set -x
AD=$1
INST=$2

linkspeed=`/usr/bin/kstat -p $AD|grep -i link_|\
grep "$AD:$INST"|grep link_speed|awk '{print $2}'`

is_up=`/usr/bin/kstat -p $AD|grep -i link_|\
grep "$AD:$INST"|grep link_up| awk '{print $2}'`
if [ $is_up -eq 1 ]; then
linkstat="UP"
else
linkstat="DOWN"
fi
LINK_MODE=`/usr/bin/kstat -p $AD|grep -i link_|\
grep $AD:$INST|grep link_duplex|awk '{print $2}'`
case $LINK_MODE in
2) linkmode="Full Duplex";;
1) linkmode="Half Duplex";;
*) linkmode="Unknown";;
esac

$NDD -set /dev/$AD instance $INST
AUTONEG=`$NDD -get /dev/$AD adv_autoneg_cap`
LP_AUTONEG=`/usr/bin/kstat -p $AD|\
grep $AD:$INST|grep lp_cap_autoneg|awk '{print $2}'`
if [ $AUTONEG -eq 0 ]; then
autoneg="Off"
else
autoneg="On"
fi
if [ $LP_AUTONEG -eq 0 ]; then
lp_autoneg="Off"
else
lp_autoneg="On"
fi
IF=$AD$INST
macipget $IF
print "$HOSTNAME,$IF,$IP,$MAC,$linkstat,$linkspeed,$linkmode,$autoneg,$lp_autoneg"

}

bgekstatget() {
#set -x
AD=$1
INST=$2

linkspeed=`/usr/bin/kstat -m $AD -i $INST -n parameters|\
grep -i link_| grep link_speed|awk '{print $2}'`

is_up=`/usr/bin/kstat -m $AD -i $INST -n parameters|\
grep -i link_|grep link_status| awk '{print $2}'`
if [ $is_up -eq 1 ]; then
linkstat="UP"
else
linkstat="DOWN"
fi
LINK_MODE=`/usr/bin/kstat -m $AD -i $INST -n parameters|\
grep -i link_|grep link_duplex|awk '{print $2}'`
case $LINK_MODE in
2) linkmode="Full Duplex";;
1) linkmode="Half Duplex";;
*) linkmode="Unknown";;
esac

AUTONEG=`/usr/bin/kstat -m $AD -i $INST -n parameters|\
grep -i link_|grep autoneg|awk '{print $2}'`
LP_AUTONEG=`/usr/bin/kstat -m $AD -i $INST -n parameters|\
grep lp_| grep autoneg |awk '{print $2}'`
if [ $AUTONEG -eq 0 ]; then
autoneg="Off"
else
autoneg="On"
fi
if [ $LP_AUTONEG -eq 0 ]; then
lp_autoneg="Off"
else
lp_autoneg="On"
fi

IF=$AD$INST
macipget $IF
print "$HOSTNAME,$IF,$IP,$MAC,$linkstat,$linkspeed,$linkmode,$autoneg,$lp_autoneg"

}

dmfeget() {

AD=$1
INST=$2
EADAPT=$AD$INST
#$NDD -set /dev/$EADAPT
# Note: the ndd set is not required since dmfe interfaces are directly
# set up as device files (such as /dev/dmfe0, /dev/dmfe1)

LSTAT=`$NDD -get /dev/$EADAPT link_status`
LSPEED=`$NDD -get /dev/$EADAPT link_speed`
LMODE=`$NDD -get /dev/$EADAPT link_mode`
IS_100FDX=`$NDD -get /dev/$EADAPT adv_100fdx_cap`
IS_100HDX=`$NDD -get /dev/$EADAPT adv_100hdx_cap`
IS_10FDX=`$NDD -get /dev/$EADAPT adv_10fdx_cap`
IS_10HDX=`$NDD -get /dev/$EADAPT adv_10hdx_cap`
AUTONEG=`$NDD -get /dev/$EADAPT adv_autoneg_cap`
LP_AUTONEG=`$NDD -get /dev/$EADAPT lp_autoneg_cap`
if [ $LSTAT -eq 0 ]; then
linkstat="down"
else
linkstat="up"
fi
if [ $LSPEED -eq 0 ]; then
linkspeed="10"
else
linkspeed="100"
fi
if [ $LMODE -eq 0 ]; then
linkmode="Half Duplex"
else
linkmode="Full Duplex"
fi
if [ $AUTONEG -eq 0 ]; then
autoneg="Off"
else
autoneg="On"
fi
if [ $LP_AUTONEG -eq 0 ]; then
lp_autoneg="Off"
else
lp_autoneg="On"
fi
macipget $EADAPT

print "$HOSTNAME,$EADAPT,$IP,$MAC,$linkstat,$linkspeed,$linkmode,$autoneg,$lp_autoneg"

}

getParms() {
#set -x
case $ADAPTER in
qfe) nddget $ADAPTER $INSTANCE;;
hme) nddget $ADAPTER $INSTANCE;;
eri) nddget $ADAPTER $INSTANCE;;
ce) kstatget $ADAPTER $INSTANCE;;
bge) bgekstatget $ADAPTER $INSTANCE;;
dmfe) dmfeget $ADAPTER $INSTANCE;;
*) echo "Error: Unknown adapter! \n" && exit 1;;
esac
}

nicStatAll() {
#set -x
/usr/sbin/ifconfig -a|nawk '/UP/{print $1}'|egrep -v "lo0|clprivnet"| \
awk -F: '{print $1}' |sort -nr|uniq > /tmp/iflist;
for interface in `cat /tmp/iflist`
do
if [ $interface = ":*" ]; then
continue
fi
# Deprecated code -- left behind for old time's sake
#count=`echo $interface|wc -m|sed -e"s!^[ /t]!!g"`
#count1=`expr $count - 2`
#count2=`expr $count - 1`
#int=`echo $interface|cut -c 1-${count1}`
#dev=/dev/${int}
#inst=`echo $interface|cut -c ${count2}`
case $interface in
eri*) INSTANCE=`echo $interface|awk -F\i '{print $2}'`
BASEDEV=`echo $interface|awk -F\i '{print $1}'`
ADAPTER="$BASEDEV"i;;
*) splitter $interface;;
esac
getParms
done
}

if [ $ID -ne 0 ]; then
echo "ERROR: You are not root! Only root can run this script!\n";
exit 1;
fi

while getopts an:i:h arg
do
case $arg in
a) nicStatAll && exit 0;;
n) ADAPTER=${OPTARG};;
i) INSTANCE=${OPTARG};;
h) printUsage && exit 0;;
*) printUsage && exit 1;;
esac
done
shift $(($OPTIND - 1))

if [ ! -z ${ADAPTER} ]; then
if [ ! -z ${INSTANCE} ]; then
getParms
else
printUsage && exit 1
fi
else
printUsage && exit 1
fi

On the centralized management host (whose SSH2-based Key is trusted by the monitored hosts) run the following command to perform the inventory:

admin:(dev) $ sudo ./tlrc.pl -l root -a \
-c "/path/to/nddget.sh -a" \
-o ~/logs/ndd_get_today.txt

/usr/bin/ssh root@host1 '/path/to/ndd_get.sh -a'
/usr/bin/ssh root@host2 '/path/to/ndd_get.sh -a'
host1
host1,bge2,IP,MAC,UP,100,Full Duplex,On,On
host1,bge1,IP,MAC,UP,100,Full Duplex,On,On
host1,bge0,IP,MAC,UP,100,Full Duplex,On,On

host2
host2,bge2,10.228.147.62,0:3:ba:49:45:51,UP,100,Full Duplex,On,On
host2,bge1,10.228.143.62,0:3:ba:49:45:50,UP,100,Full Duplex,On,On
host2,bge0,10.228.139.62,0:3:ba:49:45:4f,UP,100,Full Duplex,On,On

/usr/bin/rsh -l root host3 '/path/to/ndd_get.sh -a'
/usr/bin/rsh -l root host4 '/path/to/ndd_get.sh -a'

host3
host3,qfe1,IP,MAC,up,100,Full Duplex,Off,Off
host3,qfe0,IP,MAC,up,100,Full Duplex,Off,Off
host3,ce0,IP,MAC,UP,1000,Full Duplex,On,On

<truncated>

Look at the text output created thus:

admin:(logs) $ more ndd_get_today.txt

host1
host1,bge2,IP,MAC,UP,100,Full Duplex,On,On
host1,bge1,IP,MAC,UP,100,Full Duplex,On,On
host1,bge0,IP,MAC,UP,100,Full Duplex,On,On

host2
host2,bge2,IP,MAC,UP,100,Full Duplex,On,On
host2,bge1,IP,MAC,UP,100,Full Duplex,On,On
host2,bge0,IP,MAC,UP,100,Full Duplex,On,On

<truncated>

Now look at the sudo log file to see if there’s associated logging captured.

admin:(log) $ sudo tail sudo.log
Sep  5 16:32:36 : lahirdx : TTY=pts/27 ; PWD=/export/home/lahirdx/dev ;
    USER=root ; COMMAND=/usr/bin/ssh aesdbc1
Sep  6 09:49:31 : lahirdx : TTY=pts/30 ; PWD=/export/home/lahirdx/dev ;
    USER=root ; COMMAND=./tlrc.pl -a -c /export/patches/Scripts/bin/ndd_get.sh
    -a -o /export/home/lahirdx/logs/ndd_get_9606.txt
Sep  6 09:49:40 : lahirdx : TTY=pts/30 ; PWD=/export/home/lahirdx/dev ;
    USER=root ; COMMAND=./tlrc.pl -l root -a -c
    /export/patches/Scripts/bin/ndd_get.sh -a -o /export/home/lahirdx/logs/ndd_get_9606.txt

NOTE: the full command line, who executed it, and when it ran are all captured in the logs. Also, it is imperative that “/path/to/ndd_get.sh” is the same on all the monitored hosts. This author recommends creating a System V package to deploy commonly used scripts and tools under /opt/tools (or a similar directory structure) to ensure standardization of the environment.
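A minimal sketch of building such a tools package with the stock pkgmk/pkgproto tools (the package name, staging directory, and pkginfo values here are all illustrative):

```shell
# Scripts staged under ./root/opt/tools/bin on the build host.
cat > pkginfo <<'EOF'
PKG=XYZtools
NAME=Common admin scripts
ARCH=sparc
VERSION=1.0
CATEGORY=application
BASEDIR=/
EOF

# Generate a prototype from the staged tree, then build the package:
(echo "i pkginfo"; pkgproto root/opt=/opt) > prototype
pkgmk -o -r `pwd`/root -d /var/spool/pkg

# On each monitored host:
# pkgadd -d /var/spool/pkg XYZtools
```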

 Posted by at 6:49 pm
Jan 022007
 

# more zonecfg-hints.txt
* Use an LOFS mount:

global# newfs /dev/rdsk/c1t0d0s0
global# mount /dev/dsk/c1t0d0s0 /mystuff
global# zonecfg -z my-zone
zonecfg:my-zone> add fs
zonecfg:my-zone:fs> set dir=/usr/mystuff
zonecfg:my-zone:fs> set special=/mystuff
zonecfg:my-zone:fs> set type=lofs
zonecfg:my-zone:fs> end
* Use a UFS mount:

global# newfs /dev/rdsk/c1t0d0s0
global# zonecfg -z my-zone
zonecfg:my-zone> add fs
zonecfg:my-zone:fs> set dir=/usr/mystuff
zonecfg:my-zone:fs> set special=/dev/dsk/c1t0d0s0
zonecfg:my-zone:fs> set raw=/dev/rdsk/c1t0d0s0
zonecfg:my-zone:fs> set type=ufs
zonecfg:my-zone:fs> end
* Export the device node and mount from the non-global zone:

global# zonecfg -z my-zone
zonecfg:my-zone> add device
zonecfg:my-zone:device> set match=/dev/rdsk/c1t0d0s0
zonecfg:my-zone:device> end
zonecfg:my-zone> add device
zonecfg:my-zone:device> set match=/dev/dsk/c1t0d0s0
zonecfg:my-zone:device> end
my-zone# newfs /dev/rdsk/c1t0d0s0
my-zone# mount /dev/dsk/c1t0d0s0 /usr/mystuff
* Mount the FS directly from the Global zone when the non-global zone is running:

global# mount /dev/dsk/c1t0d0s0 /export/zones/zone1/root/mnt
* Using lofiadm

#
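The lofiadm example was cut off in the original post; a hedged sketch of the usual flow (file path and lofi device number illustrative) would be:

```
global# mkfile 1g /export/zone-fs.img
global# lofiadm -a /export/zone-fs.img
/dev/lofi/1
global# newfs /dev/rlofi/1
global# mount /dev/lofi/1 /mystuff
```

From there, /mystuff can be handed to a zone with an lofs fs entry, as in the first example above.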

 Posted by at 8:08 pm