Jul 052013
 

It’s been a while since I’ve posted anything. So, here goes:

Identifying kernel memory leak on myhost-ldom (after P2V and OS upgrade from Solaris 9 to Solaris 10u10):

kmem_track.d sees this:

Tracing...If you see more allocs than frees, there is a potential issue...
Check against the cache name that is suspect

CACHE NAME                       ALLOCS   FREES
kmem_bufctl_audit_cache          0        47775
kmem_alloc_256                   0        87805
streams_dblk_1040                26072    0
kmem_alloc_40                    63752    0
vn_cache                         64883    64484
kmem_alloc_1152                  85338    82191
rctl_val_cache                   98712    99425
anonmap_cache                    105272   106039
kmem_alloc_96                    109072   108992
sfmmu8_cache                     109171   112886
kmem_alloc_32                    134058   135008
zio_cache                        146456   146456
streams_dblk_80                  162260   162274
kmem_alloc_160                   167740   167783
kmem_alloc_80                    187855   188146
sfmmu1_cache                     190247   194525
segvn_cache                      217514   218797
seg_cache                        232548   233831
kmem_alloc_8                     283391   283672
kmem_alloc_64                    286856   286354
streams_mblk                     313263   313281
anon_cache                       330058   336717
Tracing...If you see more allocs than frees, there is a potential issue...
Check against the cache name that is suspect

CACHE NAME                       ALLOCS   FREES
kmem_bufctl_audit_cache          0        47778
kmem_alloc_256                   0        87807
streams_dblk_1040                26216    0
kmem_alloc_40                    63777    0
vn_cache                         64887    64488
kmem_alloc_1152                  85383    82236
rctl_val_cache                   98787    99500
anonmap_cache                    105331   106098
kmem_alloc_96                    109075   108995
sfmmu8_cache                     109226   112967
kmem_alloc_32                    134132   135082
zio_cache                        146468   146468
streams_dblk_80                  162472   162486
kmem_alloc_160                   167875   167918
kmem_alloc_80                    187950   188241
sfmmu1_cache                     190362   194689
segvn_cache                      217628   218911
seg_cache                        232666   233949
kmem_alloc_8                     283452   283733
kmem_alloc_64                    286923   286421
streams_mblk                     313688   313706
anon_cache                       330176   336835
Tracing...If you see more allocs than frees, there is a potential issue...
Check against the cache name that is suspect

The two caches ( streams_dblk_1040 and kmem_alloc_40 are growing rapidly), associated with growth of kernel memory (validated by following output):

-----------------------------------
05-22-13-15-15
-----------------------------------
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                      99310               775   10%
ZFS File Data               31104               243    3%
Anon                        36342               283    4%
Exec and libs                4090                31    0%
Page cache                  20560               160    2%
Free (cachelist)             3905                30    0%
Free (freelist)            833012              6507   81%

Total                     1028323              8033
Physical                  1008210              7876

-----------------------------------
05-22-13-15-30
-----------------------------------
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     150859              1178   15%
ZFS File Data               65092               508    6%
Anon                        68299               533    7%
Exec and libs                9371                73    1%
Page cache                  22618               176    2%
Free (cachelist)            11104                86    1%
Free (freelist)            700980              5476   68%

Total                     1028323              8033
Physical                  1005627              7856
-----------------------------------

The kernel memory utilization has grown 400MB in a space of 15 minutes.

So, running dtrace on streams_dblk_1040 cache shows us this:

$ sudo dtrace -n 'fbt::kmem_cache_alloc:entry /args[0]->cache_name == "streams_dblk_1040"/ \
 { @[uid,pid,ppid,curpsinfo->pr_psargs,execname,stack()] = count(); trunc(@,10);}'
dtrace: description 'fbt::kmem_cache_alloc:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME

        0     2122        1  /usr/local/sbin/snmpd -Lsd -Lf /dev/null -p /var/run/snmpd.pid -a  snmpd
              genunix`allocb+0x94
              ip`snmpcom_req+0x2bc
              ip`ip_wput_nondata+0x7b0
              unix`putnext+0x218
              unix`putnext+0x218
              ip`snmpcom_req+0x37c
              ip`ip_snmpmod_wput+0xe4
              unix`putnext+0x218
              ip`snmpcom_req+0x37c
              ip`ip_snmpmod_wput+0xe4
              unix`putnext+0x218
              genunix`strput+0x1d8
              genunix`strputmsg+0x2d4
              genunix`msgio32+0x354
              genunix`putmsg32+0x98
              unix`syscall_trap32+0xcc
               11
        0     2122        1  /usr/local/sbin/snmpd -Lsd -Lf /dev/null -p /var/run/snmpd.pid -a  snmpd
              genunix`allocb+0x94
              ip`snmp_append_data2+0x70
              ip`tcp_snmp_get+0x5d4
              ip`snmpcom_req+0x350
              ip`ip_snmpmod_wput+0xe4
              unix`putnext+0x218
              ip`snmpcom_req+0x37c
              ip`ip_snmpmod_wput+0xe4
              unix`putnext+0x218
              genunix`strput+0x1d8
              genunix`strputmsg+0x2d4
              genunix`msgio32+0x354
              genunix`putmsg32+0x98
              unix`syscall_trap32+0xcc
               22
        0     2122        1  /usr/local/sbin/snmpd -Lsd -Lf /dev/null -p /var/run/snmpd.pid -a  snmpd
              genunix`allocb+0x94
              ip`snmp_append_data2+0x70
              ip`ip_snmp_get2_v4+0x310
              ip`ire_walk_ill_tables+0x30c
              ip`ire_walk_ipvers+0x64
              ip`ip_snmp_get_mib2_ip_route_media+0x74
              ip`ip_snmp_get+0x298
              ip`snmpcom_req+0x350
              ip`ip_wput_nondata+0x7b0
              unix`putnext+0x218
              unix`putnext+0x218
              ip`snmpcom_req+0x37c
              ip`ip_snmpmod_wput+0xe4
              unix`putnext+0x218
              ip`snmpcom_req+0x37c
              ip`ip_snmpmod_wput+0xe4
              unix`putnext+0x218
              genunix`strput+0x1d8
              genunix`strputmsg+0x2d4
              genunix`msgio32+0x354
               77

This is snmpd. I had already noted that snmpd had an issues (complained during boot up about configuration file entries).

Stopping net-smpd resulted in the kernel memory being released back into the freelist.

The other cache — kmem_alloc_40 shows this:

$ sudo dtrace -n 'fbt::kmem_cache_alloc:entry /args[0]->cache_name == "kmem_alloc_40"/ \
{ @[uid,pid,ppid,curpsinfo->pr_psargs,execname,stack()] = count(); trunc(@,10);}'
dtrace: description 'fbt::kmem_cache_alloc:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
        0      586        1  /opt/quest/sbin/vasd -p /var/opt/quest/vas/vasd/.vasd.pid  vasd
              genunix`kmem_zalloc+0x28
              zfs`dsl_dir_tempreserve_impl+0x1e0
              zfs`dsl_dir_tempreserve_space+0x128
              zfs`dmu_tx_try_assign+0x220
              zfs`dmu_tx_assign+0xc
              zfs`zfs_write+0x4b8
              genunix`fop_write+0x20
              genunix`write+0x268
              unix`syscall_trap32+0xcc
               39
        0      586        1  /opt/quest/sbin/vasd -p /var/opt/quest/vas/vasd/.vasd.pid  vasd
              genunix`kmem_zalloc+0x28
              zfs`dsl_dir_tempreserve_space+0x68
              zfs`dmu_tx_try_assign+0x220
              zfs`dmu_tx_assign+0xc
              zfs`zfs_write+0x4b8
              genunix`fop_write+0x20
              genunix`write+0x268
              unix`syscall_trap32+0xcc
               39
        0      586        1  /opt/quest/sbin/vasd -p /var/opt/quest/vas/vasd/.vasd.pid  vasd
              genunix`kmem_zalloc+0x28
              zfs`dsl_dir_tempreserve_space+0xd4
              zfs`dmu_tx_try_assign+0x220
              zfs`dmu_tx_assign+0xc
              zfs`zfs_write+0x4b8
              genunix`fop_write+0x20
              genunix`write+0x268
              unix`syscall_trap32+0xcc
               39
        0      586        1  /opt/quest/sbin/vasd -p /var/opt/quest/vas/vasd/.vasd.pid  vasd
              genunix`kmem_zalloc+0x28
              zfs`zfs_get_data+0x80
              zfs`zil_lwb_commit+0x188
              zfs`zil_commit_writer+0xbc
              zfs`zil_commit+0x90
              zfs`zfs_fsync+0xfc
              genunix`fop_fsync+0x14
              genunix`fdsync+0x20
              unix`syscall_trap32+0xcc
               40

Looks like vintela is writing to ZFS…

After shutting down snmpd, it seems like kernel stopped growing like crazy…

$ echo "::memstat"|sudo mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     151839              1186   15%
ZFS File Data               65746               513    6%
Anon                        58443               456    6%
Exec and libs                6927                54    1%
Page cache                  22475               175    2%
Free (cachelist)            13836               108    1%
Free (freelist)            709057              5539   69%

Total                     1028323              8033
Physical                  1005627              7856
[$:/var/adm (56)]
$ echo "::memstat"|sudo mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     135583              1059   13%
ZFS File Data               65876               514    6%
Anon                        55253               431    5%
Exec and libs                7202                56    1%
Page cache                  22475               175    2%
Free (cachelist)            13589               106    1%
Free (freelist)            728345              5690   71%

Total                     1028323              8033
Physical                  1005627              7856
[$:/var/adm (57)]
$ echo "::memstat"|sudo mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     135839              1061   13%
ZFS File Data               65941               515    6%
Anon                        55161               430    5%
Exec and libs                6909                53    1%
Page cache                  22477               175    2%
Free (cachelist)            13890               108    1%
Free (freelist)            728106              5688   71%

Total                     1028323              8033
Physical                  1005627              7856

It’s now hovering around 1000MB, down from ~ 1200MB..

05-24-2013 Revisit

anon_cache                       1162987  1161545
Tracing...If you see more allocs than frees, there is a potential issue...
Check against the cache name that is suspect

CACHE NAME                       ALLOCS   FREES
kmem_alloc_16                    0        204696
kmem_alloc_128                   111705   111703
vn_cache                         182607   0
kmem_alloc_256                   186757   186900
file_cache                       217169   217300
kmem_alloc_40                    242674   242645
zio_cache                        279855   279855
kmem_alloc_1152                  295560   295553
rctl_val_cache                   354759   355428
sfmmu8_cache                     358797   360868
kmem_alloc_96                    370087   370046
anonmap_cache                    380465   380542
kmem_alloc_160                   474039   474385
kmem_alloc_32                    476749   476818
kmem_alloc_8                     588250   587960
kmem_alloc_80                    674765   675200
sfmmu1_cache                     683800   684676
segvn_cache                      785621   785685
kmem_alloc_64                    789370   788307
seg_cache                        839840   839904
anon_cache                       1163105  1161663

Only one stands out — vn_cache. Therefore, run following dtrace command:

$ sudo dtrace -qn 'fbt::kmem_cache_alloc:entry /args[0]->cache_name == "vn_cache"/  \
{ @[uid,pid,ppid,curpsinfo->pr_psargs,execname,stack()] = count(); trunc(@,10);} \
profile:::tick-15sec { printa(@); }'

This will show stack trace aggregations of top 10 processes entering kmem_cache_alloc at a 15 second interval. Turns out it is HPOV:

        0     2104        1  /opt/perf/bin/perfd                                 perfd
              genunix`vn_alloc+0xc
              procfs`prgetnode+0x38
              procfs`pr_lookup_procdir+0x94
              procfs`prlookup+0x198
              genunix`fop_lookup+0x28
              genunix`lookuppnvp+0x30c
              genunix`lookuppnat+0x120
              genunix`lookupnameat+0x5c
              genunix`vn_openat+0x168
              genunix`copen+0x260
              unix`syscall_trap32+0xcc
             1087
        0     2104        1  /opt/perf/bin/perfd                                 perfd
              genunix`vn_alloc+0xc
              procfs`prgetnode+0x38
              procfs`prgetnode+0x110
              procfs`pr_lookup_procdir+0x94
              procfs`prlookup+0x198
              genunix`fop_lookup+0x28
              genunix`lookuppnvp+0x30c
              genunix`lookuppnat+0x120
              genunix`lookupnameat+0x5c
              genunix`vn_openat+0x168
              genunix`copen+0x260
              unix`syscall_trap32+0xcc
             1087
        0     2046        1  /opt/perf/bin/scopeux                               scopeux
              genunix`vn_alloc+0xc
              procfs`prgetnode+0x38
              procfs`pr_lookup_procdir+0x94
              procfs`prlookup+0x198
              genunix`fop_lookup+0x28
              genunix`lookuppnvp+0x30c
              genunix`lookuppnat+0x120
              genunix`lookupnameat+0x5c
              genunix`vn_openat+0x168
              genunix`copen+0x260
              unix`syscall_trap32+0xcc
             2168
        0     2046        1  /opt/perf/bin/scopeux                               scopeux
              genunix`vn_alloc+0xc
              procfs`prgetnode+0x38
              procfs`prgetnode+0x110
              procfs`pr_lookup_procdir+0x94
              procfs`prlookup+0x198
              genunix`fop_lookup+0x28
              genunix`lookuppnvp+0x30c
              genunix`lookuppnat+0x120
              genunix`lookupnameat+0x5c
              genunix`vn_openat+0x168
              genunix`copen+0x260
              unix`syscall_trap32+0xcc

Which lines up with the output of the heap/stack growth (hpstckgrow.d) script —

Tracing...Ctrl-C to exit
Tracking processes that are growing their heap size...
aggregation printed at 60s intervals
EXEC     PID      COMMAND                                  COUNT
kstat    23772    /usr/perl5/bin/perl /usr/bin/kstat -n dnlcstats 412
kstat    26262    /usr/perl5/bin/perl /usr/bin/kstat -n dnlcstats 412
kstat    28069    /usr/perl5/bin/perl /usr/bin/kstat -n dnlcstats 412
kstat    647      /usr/perl5/bin/perl /usr/bin/kstat -p unix:::boot_time 1028
kstat    6674     /usr/perl5/bin/perl /usr/bin/kstat -p unix:::boot_time 1028
kstat    24090    /usr/perl5/bin/perl /usr/bin/kstat -p unix:::boot_time 1028
kstat    1410     /usr/perl5/bin/perl /usr/bin/kstat -p unix:::boot_time 1030
facter   395      /opt/puppet/bin/ruby /opt/puppet/bin/facter --puppet --yaml 2106
facter   6561     /opt/puppet/bin/ruby /opt/puppet/bin/facter --puppet --yaml 2106
facter   23793    /opt/puppet/bin/ruby /opt/puppet/bin/facter --puppet --yaml 2107

Tracking processes that are growing their stack size...
aggregation printed at 60s intervals
EXEC     PID      COMMAND                                  COUNT
zstatd   27933    /opt/sun/xvm/lib/zstat/zstatd            1
zstatd   28306    /opt/sun/xvm/lib/zstat/zstatd            1
zstatd   28763    /opt/sun/xvm/lib/zstat/zstatd            1
zstatd   28918    /opt/sun/xvm/lib/zstat/zstatd            1
zstatd   29227    /opt/sun/xvm/lib/zstat/zstatd            1
zstatd   29491    /opt/sun/xvm/lib/zstat/zstatd            1
zstatd   29664    /opt/sun/xvm/lib/zstat/zstatd            1
java     952      /usr/jdk/instances/jdk1.6.0/bin/java -Xmx150m -Xss192k -XX:MinHeapFreeRatio=10  5
scopeux  2046     /opt/perf/bin/scopeux                    126
perfd    2104     /opt/perf/bin/perfd                      1260
Uninstalling and reinstalling the net-snmpd and HP Openview packages ultimately stopped the memory leaks.
Links:
Both the scripts listed above are modifications/adaptations of code from Brendan Gregg’s excellent DTrace book —