HP-UX Kernel Tuning and Performance Guide - 2000.3.15
2011.08.30 22:08
Source: http://www.ischo.net -- 조인상 // System Engineer
Writer: http://www.ischo.net -- ischo // System Engineer in Republic of Korea
Any user can execute top and Glance.
HP-UX Kernel Configuration
Configuring Kernel Parameters in 9.X
- cd /etc/conf
- cp dfile dfile.old
- vi dfile
- Modify the dfile to include the kernel parameters and values suggested above.
- config dfile
- make -f config.mk
- mv /hp-ux /hp-ux.old
- mv /etc/conf/hp-ux /hp-ux
- cd / ; shutdown -ry 0
Configuring Kernel Parameters in 10.X
- cd /stand/build
- /usr/lbin/sysadm/system_prep -s system
- vi system
- Either add or modify the entries to match:
- hpux_aes_override 1
- mk_kernel -s system
- mv /stand/system /stand/system.prev
- mv /stand/build/system /stand/system
- mv /stand/vmunix /stand/vmunix.prev
- mv /stand/build/vmunix_test /stand/vmunix
- cd / ; shutdown -ry 0
To configure the remaining kernel parameters with SAM, follow these steps:
- Login to the system as root
- Place the list of kernel parameter values (above) in the file:
- /usr/sam/lib/kc/tuned/stuff.tune
- Start SAM by typing the command: sam
- With the mouse, double-click on Kernel Configuration.
- On the next screen, double-click on Configurable Parameters.
- SAM will display a screen with a list of all configurable parameters and their current and pending values. Click on the Actions selection on the menu bar and select Apply Tuned Parameter Set ... on the pull-down menu. Select STUFF Applications from the list and click on the OK button.
- Click on the Actions selection on the menu bar and select Create A New Kernel. A confirmation window will be displayed warning you that a reboot is required. Click on YES to proceed.
- SAM will build the new kernel and then display a form with two options:
- Move Kernel Into Place and Reboot the System Now
- Exit Without Moving the Kernel Into Place
- If you select the first option and then click on OK, the new kernel will be moved into place and the system will be automatically rebooted.
- If you select the second option, you must manually move the new kernel from the /stand/build directory into place as /stand/vmunix.
Configurable Parameters
bufpages            Sets the number of buffer pages
create_fastlinks    Stores symbolic link data in the inode
fs_async            Sets asynchronous write to disk
hpux_aes_override   Controls directory creation on automounted disk drives
maxdsiz             Limits the size of the data segment
maxfiles            Limits the soft file limit per process
maxfiles_lim        Limits the hard file limit per process
maxssiz             Limits the size of the stack segment
maxswapchunks       Limits the maximum number of swap chunks
maxtsiz             Limits the size of the text (code) segment
maxuprc             Limits the maximum number of user processes
netmemmax           Sets the network dynamic memory limit
nfile               Limits the maximum number of "opens" in the system
ninode              Limits the maximum number of open inodes in memory
nproc               Limits the maximum number of concurrent processes
npty                Sets the maximum number of pseudo ttys
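Two of these limits map directly onto the per-process file-descriptor limits visible from any POSIX shell: maxfiles supplies the default soft limit and maxfiles_lim the hard ceiling. A quick way to see the values in effect for your current session:

```shell
# Show the per-process open-file limits for the current shell session.
# On HP-UX, maxfiles sets the default soft limit and maxfiles_lim the hard ceiling.
ulimit -Sn    # soft limit (maxfiles)
ulimit -Hn    # hard limit (maxfiles_lim)
```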
Kernel Parameters
bufpages
In HP-UX 10.X, it is recommended that bufpages be set to 0. This enables dynamic buffer cache.
- mkdir: cannot create /design/ram: Read-only file system.
- Memory fault(coredump)
NOTE: Never modify the kernel parameter swchunk.
maxtsiz
Maxtsiz defines the maximum size of the text segment of a process. We recommend 1024 MB.
We recommend maxuprc be set to 200. When this limit is reached, commands fail with:
- no more processes
netmemmax
Value  Description
-1     No limit; 100% of memory is available for IP packet reassembly.
0      The netmemmax limit is 10% of real memory.
>0     Specifies that X bytes of memory can be used for IP packet reassembly.
       The minimum is 200 KB, and the value is rounded up to the next
       multiple of the page size (4096 bytes).
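The page rounding applied to a positive netmemmax value can be checked with ordinary shell arithmetic (illustrative only; the kernel performs this rounding itself):

```shell
# Round a byte count up to the next multiple of the 4096-byte page size,
# as the kernel does for positive netmemmax values.
round_to_page() { echo $(( ( ($1 + 4095) / 4096 ) * 4096 )); }

round_to_page 204800    # the 200 KB minimum is already page-aligned -> 204800
round_to_page 250000    # rounds up to the next page boundary -> 253952
```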
- file: table is full
NOTE: On a multi-processor system running HP-UX 10.20, ninode should NOT exceed 4000. This is due to a spinlock contention problem that is fixed in 11.0.
- at console window: proc: table is full
- at user shell window: no more processes
Kernel Parameter Recommendations
The following are the suggested kernel parameter values.
# Parameter        Value                Notes
bufpages           0                    # on HP-UX 10.X
create_fastlinks   1
dbc_max_pct        25
fs_async           1
maxdsiz            2063806464
maxfiles           200
maxfiles_lim       2048
maxssiz            (80*1024*1024)
maxswapchunks      4096
maxtsiz            (1024*1024*1024)
maxuprc            200
maxusers           124
netmemmax          0                    # on desktop systems
                   -1                   # on data servers
nfile              2800
ninode             15000                # 4000 on HP-UX 10.20 multi-processor systems
nproc              1024
npty               512
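The parenthesized values are shell-style arithmetic expressions; a portable check, runnable in any POSIX shell, confirms what they expand to:

```shell
# Expand the arithmetic forms used in the recommendations above.
maxssiz=$(( 80 * 1024 * 1024 ))      # stack segment limit
maxtsiz=$(( 1024 * 1024 * 1024 ))    # text segment limit

echo "maxssiz = $maxssiz bytes"      # 83886080  (80 MB)
echo "maxtsiz = $maxtsiz bytes"      # 1073741824 (1 GB)
```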
Networks
NFS
- NFS configuration. On NFS servers, a good first order approximation is to run two nfsd processes per physical disk (spindle). The default is four total, which is certainly not enough on a server. On 9.x systems, too many nfsd processes can cause context switching bottlenecks, because all the nfsds are awakened any time a request comes in. On 10.x systems, this is not the case and you can safely have extra nfsd processes. Start with 30 or 40 nfsd's. On NFS clients run sixteen biod processes. In general, HP-UX 10.X has much better NFS performance than previous versions of HP-UX.
- Patches - Always install the latest HP-UX NFS patch. HP periodically releases patches that correct problems associated with NFS, many of them performance related. If you are using NFS, make sure the latest patch is installed on both the client and the server. See the PATCHES section for more details. General HP-UX patch information can be found at http://us-support.external.hp.com for the Americas and Asia. For Europe and elsewhere, use http://europe-support.external.hp.com/.
- Local vs. Remote. You will need to determine which things you will NFS mount, and which should be local. For performance, it would be great if nothing was accessed over the network. And "if wishes were horses, dreamers would ride", said John Butcher. Consider doing some of these things and using some of the techniques described near the end of this document, under "NFS".
- Subnetting. In general, it is a bad idea to have too many systems on a single wire. Implementation of a switched ethernet configuration with a multi host server or a server backbone configuration can preserve existing wiring while maximizing performance. If you are doing rewiring, seriously consider using fiber for future upgradability.
- Local paging. When applications are located remotely, set the "sticky bit" on the applications binaries, using the chmod +t command. This tells the system to page the text to the local disk. Otherwise, it is "retrieved" across the network. Of course, this would only apply when there is actual paging occurring. More recently, there is a kernel parameter, page_text_to_local, which when set to 1, will tell the kernel to page all NFS executable text pages to local swap space.
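For example (the copied file stands in for an NFS-mounted binary; the paging behavior itself is HP-UX-specific, but the bit can be set on any POSIX system):

```shell
# Set the sticky bit on an application binary so its text pages are
# paged to local swap instead of being re-fetched over the network.
cd "$(mktemp -d)"
cp /bin/echo myapp        # stand-in for a remotely mounted binary
chmod +t myapp            # set the sticky bit
test -k myapp && echo "sticky bit set on myapp"
```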
- File locking. Make sure the revisions of statd and lockd throughout the network are compatible; if they are out of sync, it can cause mysterious file locking errors. This particularly affects user mail files and Korn shell history files.
- Design the lan configuration to minimize inter segment traffic. To accomplish this you will have to ensure that heavily used network services (NFS, licensing, etc.) are available on the same local segment as the clients being served. Avoid heavy cross segment automounting.
- Maximize the usage of the automounter. It allows you to centralize administration of the network and also allows greater flexibility in configuring the network. Avoid the use of specific machine names which may change over time in your mount scheme; force mount points that make sense. /net ties you to a particular server, which may change over time.
- You can watch the network performance with Glance, the netstat command, and the nfsstat command. There are other tools like NetMetrix or a LAN analyzer to watch lan performance. Additionally, you can use PerfView and MeasureWare/UX to collect data over time and analyze it. You may want to tune the timeo and retrans mount options. For HP systems, small values (4 for retrans and 7 for timeo) are good. The default values for wsize and rsize, 8K, are almost always appropriate. Do NOT use 1024 unless talking to an Apollo system running NFS 2.3 on SR10.3. 8K is appropriate for 10.4 Apollos running NFS 4.1.
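Put together, a client mount using those numbers might look like the following fstab-style entry (illustrative only; nfssrv and the paths are made-up names, and the option syntax should be checked against your system's mount_nfs man page):

```
# /etc/fstab -- illustrative NFS client entry, not from the original document
nfssrv:/export/apps  /apps  nfs  rw,timeo=7,retrans=4,rsize=8192,wsize=8192  0  0
```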
- Investigate the use of dedicated servers for computing, file serving, and licensing. A good scenario has a group of dedicated servers connected with a fast "server backbone", which is then connected to an ethernet switch, which is itself connected to the desktop systems.
Make sure that ninode is at least 15000 on HP-UX 10.X. Remember not to go above 4000 on an HP-UX 10.20 system with multiple processors, as previously stated. Some customers have seen performance degradation on multi-processor systems when ninode is greater than 4000. Check it on your system. The details of this problem are too involved for this document.
NFS file systems should be exported with the async option in /etc/exports.
Some items that can be investigated...
nfsd invocations
- nfsstat -s
UDP buffer size
- netstat -an | grep -e Proto -e 2049
How often the UDP buffer overflows / UDP Socket buffer overflows
- netstat -s | grep overflow
NFS timeouts...are they a result of packet loss? Do they correlate to errors reported by the links? Use lanadmin(1M) or netstat -i to check this.
IP fragment reassembly timeouts?
- netstat -p ip
mounting through routers?
- check to see if routers are dropping packets
check for transport bad checksums
- netstat -s
is server dropping requests as duplicates?
- nfsstat
is client getting duplicate replies? (badxid)
- nfsstat on CLIENT
Some customers have mentioned that they have had serious problems because of too many levels of hierarchy within the netgroup file. It seems that this file is re-read many times, and the more hierarchy, the longer it takes to read.
Patches
- Always load the latest kernel "megapatch", ARPA transport patch, NFS/automounter patches, statd/lockd patches, and SCSI patch. Many performance and reliability improvements are delivered by these patches.
- Load the latest C compiler and linker patches.
- Load HP-VUE or CDE, and X/Motif patches at your discretion. These usually contain bug fixes.
- Load the latest X server patches.
Performance and the PATH Variable
HP-UX 11.0
Points of Interest
This applies to BOTH 10.X and 11.X versions of HP-UX.
lotsfree, desfree and minfree
on a system with up to 2GB of memory:
- lotsfree no larger than 8192 (32MB)
- desfree no larger than 1024 (4MB)
- minfree no larger than 256 (1MB)
on a system with 2GB to 8GB of memory:
- lotsfree no larger than 16384 (64MB)
- desfree no larger than 3072 (12MB)
- minfree no larger than 1280 (5MB)
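These thresholds are counts of 4 KB pages, which is how the megabyte figures in parentheses are derived; a portable shell check:

```shell
# Convert a page count to megabytes, assuming the 4096-byte base page size.
pages_to_mb() { echo $(( $1 * 4096 / 1024 / 1024 )); }

pages_to_mb 8192     # lotsfree ceiling, up to 2GB systems -> 32
pages_to_mb 16384    # lotsfree ceiling, 2GB-8GB systems   -> 64
pages_to_mb 1280     # minfree ceiling, 2GB-8GB systems    -> 5
```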
on a system with a larger amount of memory:
Text, Data and Shared Objects Maximum Values
EXEC_MAGIC
The New Parameters for Text, Data and Stack
Variable Size Pages
The chatr() command
#define EXEC_MAGIC 0x107 /* normal executable */
Whew!...had enough? Get it? OK. Let's move on.
The Benefits
There are NO hardware TLB walkers, so there is a large penalty (in cycle time) to perform TLB miss handling in software. Applications with large data sets will spend a lot of time handling TLB misses. Variable sized pages help because:
- a larger piece of virtual space can be mapped with a single TLB entry
- large reference sets can be mapped with fewer TLB entries
- fewer entries result in fewer TLB misses
How Do I Use Variable Sized Pages?
Valid page sizes range from 4K to 256MB. Take care when using the -L option (Large). This option requires that the user executing the application possess the MLOCK privilege (see setprivgrp(1m)).
New Kernel Parameters
There are three new kernel parameters, tunable by you. They are:
Variable Page Size "Inhibitors"
- physical memory size
- dynamic buffer cache
- start of virtual alignment
- re-running a recently chatr()'d application
- page demotion
- physical size inflation (performance degradation)
Memory Windows
Why Use Memory Windows?
Disadvantages
When Should Memory Windows Be Used?
Memory Window Usage
*LARGE* Memory Windows
Spinlock Pool Parameters
This was "borrowed" from the web page on Configurable Kernel Parameters.
The following parameters, all related to spinlock pools for multi-processor computers, are used similarly and are documented together here. Each parameter allocates the specified number of spinlocks for the corresponding system resource:
bufcache_hash_locks
- Buffer-cache spinlock pool
chanq_hash_locks
- Channel queue spinlock pool
ftable_hash_locks
- File-table spinlock pool
io_ports_hash_locks
- I/O-port spinlock pool
pfdat_hash_locks
- Pfdat spinlock pool
region_hash_locks
- Process-region spinlock pool
sysv_hash_locks
- System V Inter-process-communication spinlock pool
vnode_cd_hash_locks
- Vnode clean/dirty spinlock pool
vnode_hash_locks
- Vnode spinlock pool
These parameters are for use only by advanced users who have a thorough understanding of how spinlocks are used by multiple processors and of how the number of spinlocks needed relates to system size and complexity. Do not change these from their default values unless you understand the consequences. In general, these values should not be altered without the advice of HP support engineers who are thoroughly familiar with their use.
Setting these parameters to inappropriate values can result in severe performance problems in multi-processor systems.
Following is a list of acceptable values. All of these parameters have the same minimum and maximum values. Only the defaults are different as indicated:
Minimum
- 64
Maximum
- 4096
Default
- 64 (ftable_hash_locks, io_ports_hash_locks)
Default
- 128 (bufcache_hash_locks, pfdat_hash_locks, region_hash_locks, sysv_hash_locks, vnode_hash_locks, vnode_cd_hash_locks)
Default
- 256 (chanq_hash_locks)
Specify a value that is an integer power of two. If you specify any other value, SAM or the kernel itself changes the parameter value to the next larger integer power of two (for example, specifying 100 results in a spinlock-pool value of 128).
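The rounding rule can be sketched in portable shell (illustrative only; SAM and the kernel perform this adjustment themselves):

```shell
# Round a requested spinlock-pool size up to the next power of two,
# mirroring the adjustment SAM or the kernel applies automatically.
next_pow2() {
    n=$1; p=1
    while [ "$p" -lt "$n" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}

next_pow2 100    # not a power of two -> 128
next_pow2 256    # already a power of two -> 256
```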
Description
In simple terms, spinlocks are a mechanism used in multiple-processor systems to control the interaction of processors: a processor that needs the result of work being done on another processor is held off until that work finishes and the results can be passed to it. Spinlocks control access to file-system vnodes, I/O ports, buffer cache, and various other resources.
Earlier HP-UX versions allocated a fixed number of spinlocks for all resources, but beginning at HP-UX 11.0, spinlocks can be allocated for each resource type to accommodate very large and complex systems.
In general, if the system is encountering lock contention problems that are associated with one of these hashed pools, first identify the resource spinlock pool that is associated with the contention, then increase the spinlock-pool parameter for that resource. Spinlock pools are always an integer power of two. If you specify a value that is not, the kernel always allocates the next larger integer power of two.
As stated above, these parameters are for use by experienced, knowledgeable system administrators only. They should not be altered unless you are quite certain that what you are doing is the correct thing to do.
(c) Copyright 1998 Hewlett-Packard Company.