Sometimes I ask myself “how did that work again?”, so I decided to write things down every time I get that feeling, with some links to the documentation, easy commands, … you get the picture.
First one today: new customer, new environment. To get a feel for the cells, I used cellsrvstat.
Documentation reference (here). Cellsrvstat is also part of ExaWatcher on the cells.
First, a basic overview of the command. If you log on to the cells as root, it is in your $PATH, but in case you’re looking for it, it’s stored in /opt/oracle/cell<version>/cellsrv/bin/
So basics first, what can it do:
# cellsrvstat -h
LRM-00101: Message 101 not found; No message file for product=ORACORE, facility=LRM
Usage:
  cellsrvstat [-stat_group=<group name>,<group name>,]
              [-offload_group_name=<offload_group_name>,]
              [-database_name=<database_name>,]
              [-stat=<stat name>,<stat name>,]
              [-interval=<interval>] [-count=<count>]
              [-table] [-short] [-list]

  stat                A comma separated list of short strings representing the stats.
                      Default is all. (unless -stat is specified).
                      The -list option displays all stats.
                      Example: -stat=io_nbiorr_hdd,io_nbiowr_hdd
  stat_group          A comma separated list of short strings representing stat groups.
                      Default: all except database (unless -stat_group is specified).
                      The -list option displays all stat groups.
                      The valid groups are: io, mem, exec, net, smartio, flashcache, offload, database.
                      Example: -stat_group=io,mem
  offload_group_name  A comma separated list of short strings representing offload group names.
                      Default: cellsrvstat -stat_group=offload
                      (all offload groups unless -offload_group_name is specified).
                      Example: -offload_group_name=SYS_121111_130502
  database_name       A comma separated list of short strings representing database group names.
                      Default: cellsrvstat -stat_group=database
                      (all databases unless -database_name is specified).
                      Example: -database_name=testdb,proddb
  interval            At what interval the stats should be obtained and printed (in seconds).
                      Default is 1 second.
  count               How many times the stats should be printed. Default is once.
  list                List all metric abbreviations and their descriptions.
                      All other options are ignored.
  table               Use a tabular format for output. This option will be ignored
                      if all metrics specified are not integer based metrics.
  short               Use abbreviated metric name instead of descriptive ones.
  error_out           An output file to print error messages to, mostly for debugging.

  In non-tabular mode, The output has three columns.
  The first column is the name of the metric, the second one is the difference
  between the last and the current value(delta), and the third column is the absolute value.
  In Tabular mode absolute values are printed as is without delta.
  cellsrvstat -list command points out the statistics that are absolute values
[root@dm06celadm01 ~]#
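To make the options a bit more concrete, here is a minimal Python sketch that assembles a cellsrvstat command line from the flags listed in the help text above. The function name `build_cellsrvstat_cmd` and its parameters are my own invention for illustration; only the `-stat_group`, `-stat`, `-interval`, `-count`, `-table` and `-short` flags come from the usage output. It only builds the argument list, so it can be tried anywhere; actually running the command needs a cell.

```python
def build_cellsrvstat_cmd(stat_groups=None, stats=None, interval=None,
                          count=None, table=False, short=False):
    """Build an argument list for cellsrvstat (hypothetical helper).

    Only flags shown in `cellsrvstat -h` are used, in the
    `-flag=value` form from the usage line.
    """
    cmd = ["cellsrvstat"]
    if stat_groups:
        cmd.append("-stat_group=" + ",".join(stat_groups))
    if stats:
        cmd.append("-stat=" + ",".join(stats))
    if interval is not None:
        cmd.append(f"-interval={interval}")
    if count is not None:
        cmd.append(f"-count={count}")
    if table:
        cmd.append("-table")
    if short:
        cmd.append("-short")
    return cmd


# IO stats, every 5 seconds, 3 samples
print(" ".join(build_cellsrvstat_cmd(stat_groups=["io"], interval=5, count=3)))
# prints: cellsrvstat -stat_group=io -interval=5 -count=3
```

On a cell, the resulting list could be handed to `subprocess.run(cmd, capture_output=True, text=True)` to capture the output for later parsing.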
So it can display all kinds of information about your cell status, which helps to see what’s going on. Let’s do the list (warning: an awful lot of info! I’ll cut out some of the rows, but if you execute it yourself, be prepared for a long list):
[root@dm06celadm01 ~]# cellsrvstat -list
Statistic Groups:
  io          Input/Output related stats
  mem         Memory related stats
  exec        Execution related stats
  net         Network related stats
  smartio     SmartIO related stats
  flashcache  FlashCache related stats
  health      Cellsrv health/events related stats
  offload     Offload server related stats
  database    Database related stats
  ffi         FFI related stats
  lio         LinuxBlockIO related stats
  mpp         Reverse Offload related stats
  Sparse      Sparse stats

Statistics:
[ * - Absolute values. Indicates no delta computation in tabular format]
  io_nbiorr_hdd    Number of hard disk block IO read requests
  io_nbiowr_hdd    Number of hard disk block IO write requests
  io_nbiorb_hdd    Hard disk block IO reads (KB)
  io_nbiowb_hdd    Hard disk block IO writes (KB)
  io_nbiorr_flash  Number of flash disk block IO read requests
  io_nbiowr_flash  Number of flash disk block IO write requests
  io_nbiorb_flash  Flash disk block IO reads (KB)
  io_nbiowb_flash  Flash disk block IO writes (KB)
  io_ndioerr       Number of disk IO errors
  io_ltow          Number of latency threshold warnings during job
  io_ltcw          Number of latency threshold warnings by checker
  io_ltsiow        Number of latency threshold warnings for smart IO
  io_ltrlw         Number of latency threshold warnings for redolog writes
  ...
  mpp_nr_blcc      Num of reqs not pushed due to low cell cpu (C)
  mpp_nr_bhcon     Num of reqs not pushed due to high cell outnet (C)
  mpp_nr_bhrnin    Num of reqs not pushed due to high db node innet (C)
  mpp_nincr_mb     Num rate increase by reverse offload info from db (C)
  mpp_ndecr_mb     Num rate decrease by reverse offload info from db (C)
  mpp_nincr_rn     Num rate increases from db node cpu information (C)
  mpp_ndecr_rn     Num rate decreases from db node cpu information (C)
  mpp_ndecr_ccpu   Num rate decreases from low cell cpu utilization (C)
  mpp_ndecr_con    Num rate decreases from high cell outnet util (C)
  mpp_ndecr_rn_in  Num rate decreases from high db node innet util (C)
  sparse_ncb       num buckets compacted by sparse HT background scan
  sparse_ios       num IOs with sparse regions
  sparse_ios_kb    Total sparse IOs (KB)
  sparse_smartio   Total redirected smart ios (KB)
[root@dm06celadm01 ~]#
Let’s say you’re only interested in the IO-related things; then you could use a stat_group:
[root@dm06celadm01 ~]# cellsrvstat -stat_group io
===Current Time===                                      Tue Feb 21 11:29:39 2017

== Input/Output related stats ==
Number of hard disk block IO read requests                    0     2226820445
Number of hard disk block IO write requests                   0     1033312850
Hard disk block IO reads (KB)                                 0  1909110664882
Hard disk block IO writes (KB)                                0   199121447989
Number of flash disk block IO read requests                   0    14301322886
Number of flash disk block IO write requests                  0     1008668696
Flash disk block IO reads (KB)                                0   789129901568
Flash disk block IO writes (KB)                               0    52097067586
Number of disk IO errors                                      0              0
Number of latency threshold warnings during job               0           1081
Number of latency threshold warnings by checker               0              0
Number of latency threshold warnings for smart IO             0              0
Number of latency threshold warnings for redolog writes       0              0
Current read block IO to be issued (KB)                       0              0
Total read block IO to be issued (KB)                         0   599867955384
Current write block IO to be issued (KB)                      0              0
Total write block IO to be issued (KB)                        0   197822797002
Current read blocks in IO (KB)                                0              0
Total read block IO issued (KB)                               0   599867955384
Current write blocks in IO (KB)                               0              0
Total write block IO issued (KB)                              0   197822797002
Current read block IO in network send (KB)                    0              0
Total read block IO in network send (KB)                      0   599867955384
Current write block IO in network send (KB)                   0              0
Total write block IO in network send (KB)                     0   197822797002
Current block IO being populated in flash (KB)                0        2765920
Total block IO KB populated in flash (KB)                     0    32844047616
I/Os queued in IORM for hard disks                            0              0
I/Os queued in IORM for flash disks                           0              0
[root@dm06celadm01 ~]#
The last 2 lines are also very interesting: they tell you whether I/Os are being queued in IORM, in other words whether IORM is kicking in or not. Might be useful in some cases. Just saying.
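If you want to keep an eye on those IORM lines without reading the whole dump, the three-column layout described in the help text (name, delta, absolute value) is easy to pick apart. Below is a small Python sketch; `parse_stats` and `iorm_queued` are hypothetical helper names, and the sample text is a shortened copy of the output above. It assumes the non-tabular format and simply skips every line that does not end in two integers.

```python
SAMPLE = """\
===Current Time===                                      Tue Feb 21 11:29:39 2017

== Input/Output related stats ==
Number of hard disk block IO read requests                    0     2226820445
Number of latency threshold warnings during job               0           1081
I/Os queued in IORM for hard disks                            0              0
I/Os queued in IORM for flash disks                           0              0
"""


def parse_stats(output):
    """Map metric name -> (delta, absolute) for non-tabular cellsrvstat output."""
    stats = {}
    for line in output.splitlines():
        # Split off the last two whitespace-separated fields.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, delta, absolute = parts
        try:
            stats[name.strip()] = (int(delta), int(absolute))
        except ValueError:
            # Headers, timestamps and other non-metric lines end up here.
            continue
    return stats


def iorm_queued(stats):
    """Return only the 'I/Os queued in IORM ...' metrics."""
    return {name: values for name, values in stats.items()
            if name.startswith("I/Os queued in IORM")}


for name, (delta, absolute) in iorm_queued(parse_stats(SAMPLE)).items():
    print(f"{name}: delta={delta} absolute={absolute}")
```

A non-zero value on one of these lines would be the hint that IORM is queuing I/Os; with the sample above both lines are 0.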
The exec group is also nice. Once again I will cut out some rows, but the last lines are very interesting as well:
[root@dm06celadm01 ~]# cellsrvstat -stat_group exec
===Current Time===                                      Tue Feb 21 11:30:17 2017

== Execution related stats ==
Incarnation number                                            0              3
Number of module version failures                             0              0
Number of threads working                                     0              2
Number of threads waiting for network                         0             23
Number of threads waiting for resource                        0              9
Number of threads waiting for a mutex                         0            112
Number of Jobs executed for each job type
  CacheGet                                                    0     3123536972
  CachePut                                                    0     1031998876
  CloseDisk                                                   0       15376502
  OpenDisk                                                    0       20379160
  ProcessIoctl                                                0      304858117
  PredicateDiskRead                                           0        7462707
  PredicateDiskWrite                                          0          36539
  PredicateFilter                                             0       24054836
  PredicateCacheGet                                           0      140219901
  PredicateCachePut                                           0       16917010
  FlashCacheMetadataWrite                                     0              0
  RemoteListenerJob                                           0              0
  CacheBackground                                             0              0
  RemoteCellMgrService                                        0              0
  CopyFromRemote                                              0          30925
  ...
  sparse_bootstrap                                            0              0
  sparse_free_region                                          0              0
  DelegateIO                                                  0          62678
  NetworkPoll                                                 0              0
  CopySIFromRemote                                            0            550
  SIGetJob                                                    0            720
  NetworkDirectoryGC                                          0              0
SQL ids consuming the most CPU
  INT99    dxpwsgys5za27    3
END SQL ids consuming the most CPU
[root@dm06celadm01 ~]#
This tells me which database is using the most CPU, and for which query (SQL id). Might be useful in some cases. Remember… if you do something in an idle environment, then you’re automatically the “top”. But if you suspect something, it’s worth having a look; it might help.
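Pulling that "top SQL" block out of the exec output can be scripted as well. The Python sketch below assumes the layout is exactly as in the output above: a start marker line, one line per entry with three fields, and an END marker line. The function name `top_sql_entries` is hypothetical, and I treat the third field as an opaque value since the output does not label it.

```python
START = "SQL ids consuming the most CPU"
END = "END SQL ids consuming the most CPU"

SAMPLE = """\
  NetworkDirectoryGC                                          0              0
SQL ids consuming the most CPU
  INT99    dxpwsgys5za27    3
END SQL ids consuming the most CPU
"""


def top_sql_entries(output):
    """Return (database, sql_id, value) tuples found between the two markers."""
    entries = []
    inside = False
    for raw in output.splitlines():
        line = raw.strip()
        if line == END:
            inside = False
        elif line == START:
            inside = True
        elif inside and line:
            fields = line.split()
            # Assumption: database name, SQL id, and one unlabeled value.
            if len(fields) == 3:
                entries.append(tuple(fields))
    return entries


for database, sql_id, value in top_sql_entries(SAMPLE):
    print(f"database={database} sql_id={sql_id} value={value}")
# prints: database=INT99 sql_id=dxpwsgys5za27 value=3
```

With the SQL id in hand you can look up the statement on the database side.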
As always, questions, remarks? Find me on Twitter @vanpupi