This page is a place to begin consolidating Linux-specific performance tuning information for the Power platform. Plenty of general information is available on the web and in performance tuning books, so this page collects the pieces which aren't typically or easily found.
Performance Technologies for Linux
Several areas which can be addressed include:
- The SystemTap community project and a Linux Kernel Event Trace (LKET) tool based on SystemTap
- The libhugetlbfs SourceForge community project, which supports loading portions of an executable in 16MB large pages on Power systems. The project also allows malloc calls to be backed with 16MB large pages on Power systems.
Transparent large pages with libhugetlbfs
A wiki page has been added which describes a working example of tuning stream with libhugetlbfs. The example highlights several common gotchas seen when working to get basic performance improvements and better performance consistency from a simple memory-intensive workload. The example uses IBM compilers, OpenMP binding of threads to processors, and the new libhugetlbfs to place either the .bss segment or malloc allocations in 16MB large pages.
Taking advantage of oprofile
Memory usage - comparing AIX and Linux
Helpful Books on Linux Tuning
- Linux Debugging and Performance Tuning - Tips and Techniques - Steve Best
- System Performance Tuning - Gian-Paolo D. Musumeci and Mike Loukides
- Optimizing Linux Performance - A Hands-On guide to Linux Performance Tools - Phillip G. Ezolt
Each of these provides good insights into the tools and techniques available for Linux. Linux has matured nicely with all of the basics available for performance problem determination.
Compilers
- IBM compilers will generally show better performance than gcc compilers.
- New IBM compiler versions for Linux on Power stay in sync with most AIX updates
- In our experience, SPECcpu2000 and SPEComp2001 application benchmark runs show little to no difference between SLES 9 and RHEL 4
- Components within each of the benchmarks can show specific differences, which can be balanced out in the geometric mean used as the reporting number
- These programs are not sensitive to kernels and are generally considered smaller-domain problem sets which are not representative of large, complex enterprise applications
- IBM compilers have hardware specific enhancements
- If compiled for hardware target and maximum optimization, good performance gains are achieved
- IBM compilers specifically support GPUL and Power 5 hardware
- IBM compilers support profile-directed feedback (PDF)
- New versions
- VisualAge Version 8 and Fortran Version 10.1 have many incremental fixes and improvements, many of which show improved performance
- The IBM compilers are recommended for all benchmarking and high performance work
- These new versions include updates for the OpenMP processing directives which can also result in improved performance
- XL C/C++ Advanced Edition for Linux
- XL Fortran Advanced Edition for Linux
- Plans are underway to enhance the performance of gcc compilers which continue to improve
- Updates to gcc compilers will be available in distro updates
Related products
- ESSL for Linux on POWER V4.2.3 is available and often used in specialized HPC workloads where engineering and scientific performance improvements are needed.
- FDPR-Pro (Feedback Directed Program Restructuring) is also available for improving the performance of executables. More information is available on the FDPR-Pro web site.

Basic tuning tips for Linux
Note: The remainder of the page needs to be reviewed and updated for currency with the latest support available on SLES 9 SP3 and RHEL 4 U3. It will be easier to move the content below to "child" pages from this main page.
Linux has a dynamic kernel that allows you to change many of its parameters without rebooting the system. Most of these parameters are located in the /proc file system. There are two ways to change them: by using cat and echo, or by using the sysctl command. We describe these methods in more detail here.
You can use cat to view a variable and echo to change it.
As an example, IP forwarding can be disabled by writing 0 to /proc/sys/net/ipv4/ip_forward.
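A minimal sketch of this view-and-change cycle, assuming the standard /proc/sys/net/ipv4/ip_forward file. The helper name proc_set is invented here for illustration, and writing under /proc/sys requires root.

```shell
# proc_set FILE VALUE: show a /proc tunable, change it, and show it again.
# (proc_set is a hypothetical helper name, not a standard command.)
proc_set() {
    echo "before: $(cat "$1")"   # cat views the current value
    echo "$2" > "$1"             # echo writes the new value
    echo "after:  $(cat "$1")"
}

# As root, disable IP forwarding (1 = enabled, 0 = disabled):
#   proc_set /proc/sys/net/ipv4/ip_forward 0
```

The change takes effect immediately but does not survive a reboot.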
Alternatively, the sysctl command provides a simple interface to all the tunable parameters under /proc/sys.
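sysctl names correspond to paths under /proc/sys, with the dots replaced by slashes. The sketch below shows that mapping next to the equivalent sysctl invocations; sysctl_to_path is a hypothetical helper used only to illustrate the naming.

```shell
# sysctl_to_path NAME: print the /proc/sys file behind a sysctl name.
# (sysctl_to_path is a hypothetical helper, not part of sysctl itself.)
sysctl_to_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

sysctl_to_path net.ipv4.ip_forward   # prints /proc/sys/net/ipv4/ip_forward

# Equivalent sysctl commands (the write form requires root):
#   sysctl net.ipv4.ip_forward         # view
#   sysctl -w net.ipv4.ip_forward=0    # change
#   sysctl -a                          # list all tunables
```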
Because most of the default kernel parameters for system performance are geared toward workstation workloads rather than file server or large computation workloads, there are tuning parameters that you can use to tune Linux for better performance. The following are some sample tuning parameters that can be applied to Linux. For a complete listing, use sysctl or the YaST2 utility.
File system tuning
In Linux, the bdflush file under /proc/sys/vm (present on 2.4 kernels) governs when the bdflush daemon is activated to flush dirty pages from the cache. On 2.6 kernels, such as those in SLES 9 and RHEL 4, the vm.dirty_* parameters take over this role.
For heavy I/O in a file server or Web server environment, you can also disable atime updates. The atime records the last time a file was accessed, so every read also causes a metadata write. You can add the noatime mount option in /etc/fstab to turn this off for a file system.
Rather than changing the whole file system, you can use the chattr command to mark individual files that do not need to store the access time.
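Both approaches can be sketched as follows. The device, mount point, and file names are placeholders, and has_noatime is a hypothetical helper that checks a mount table in /proc/mounts format to confirm the option took effect.

```shell
# Hypothetical /etc/fstab entry with noatime added to the options field:
#   /dev/sdb1  /srv/www  ext3  defaults,noatime  1 2
# Apply it to a mounted file system without rebooting (as root):
#   mount -o remount,noatime /srv/www
# Or mark a single file so its access time is not updated:
#   chattr +A /srv/www/index.html
#   lsattr /srv/www/index.html     # the A flag should now be listed

# has_noatime MOUNTPOINT [TABLE]: succeed if the mount point has noatime set.
has_noatime() {
    awk -v mp="$1" '
        $2 == mp && $4 ~ /(^|,)noatime(,|$)/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Usage: has_noatime /srv/www && echo "noatime is active"
```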
Network tuning
To reduce the per-packet checking done in the TCP stack, you can disable the relevant option by echoing "0" to its file:
Or you can use the command "sysctl -w <kernel_parameter>=<value>".
Clients often fail to close their TCP connections properly, so the server keeps them open and may wind up with a large number of open connections. The Linux TCP stack probes a connection after a given period of inactivity (two hours by default). You can change this wait time through tcp_keepalive_time, which is specified in seconds.
If needed, you can also change tcp_keepalive_intvl and tcp_keepalive_probes to set the interval between probes and the number of unanswered probes sent before the connection is dropped.
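The three keepalive settings combine as idle time plus interval times probe count. The sketch below works that out for the kernel defaults (7200 s, 75 s, 9 probes) and for a set of tuned values that are purely illustrative; keepalive_total is a hypothetical helper name.

```shell
# keepalive_total TIME INTVL PROBES: seconds until a dead peer is dropped.
keepalive_total() {
    echo $(( $1 + $2 * $3 ))
}

keepalive_total 7200 75 9    # kernel defaults: prints 7875
keepalive_total 1800 30 5    # illustrative tuned values: prints 1950

# Applying the illustrative values (as root):
#   echo 1800 > /proc/sys/net/ipv4/tcp_keepalive_time
#   echo 30   > /proc/sys/net/ipv4/tcp_keepalive_intvl
#   echo 5    > /proc/sys/net/ipv4/tcp_keepalive_probes
```

Shorter values free dead connections sooner at the cost of more probe traffic, so the right numbers depend on the workload.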
Powertweak
The Powertweak tool from SuSE, shown in Figure 4-14, allows you to tweak many of the system parameters, including those mentioned. After you click the option you wish to tweak, the right-hand panel will explain what the option is for. Most tweaking takes effect immediately.
Tip: Any tuning done using the sysctl command lasts only until the next reboot. To make the changes permanent, put the tuning parameters in the file /etc/sysctl.conf. The file is read each time Linux boots.
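A minimal sketch of such a file, written to a scratch location so it can be reviewed before it is installed. The two parameters are the ones discussed above; the keepalive value is an illustrative assumption, not a recommendation.

```shell
# Write a sample file to a scratch location for review before installing it.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Disable IP forwarding
net.ipv4.ip_forward = 0
# Probe idle TCP connections after 30 minutes (illustrative value)
net.ipv4.tcp_keepalive_time = 1800
EOF
cat "$conf"

# After copying the contents to /etc/sysctl.conf, apply them without a reboot (as root):
#   sysctl -p
```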
Tuning Tips
I/O Subsystem
Selectable I/O Schedulers (2.6 feature only)
- workload & FS dependent
- CPU bound workload - use noop I/O Scheduler
- non-CPU bound workload - CFQ and deadline
- JFS works better on CPU bound workload
SLES 9 default is CFQ
- good choice for most workloads
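How the scheduler is selected depends on the kernel level, so the following is only a sketch. It assumes a kernel that exposes /sys/block/<device>/queue/scheduler; where that file is absent, the elevator= boot parameter is the alternative. The device name sda and the helper active_scheduler are placeholders.

```shell
# active_scheduler FILE: print the scheduler shown in [brackets].
active_scheduler() {
    sed 's/.*\[\(.*\)\].*/\1/' "$1"
}

# On kernels that expose the sysfs file:
#   cat /sys/block/sda/queue/scheduler               # e.g. noop anticipatory deadline [cfq]
#   active_scheduler /sys/block/sda/queue/scheduler
#   echo deadline > /sys/block/sda/queue/scheduler   # switch at run time, as root
# Otherwise select the scheduler for all devices with the boot parameter:
#   elevator=deadline
```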
Filesystem Performance
- on large systems ext3 is not the best choice
- jfs and xfs are better on large systems from a performance standpoint
- xfs has edge on very large files
- jfs is better on smaller files
Sequential Read Tuning
- Increase max_readahead size using the hdparm tool
- Read ahead is a function of pagecache
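Read-ahead is set in 512-byte sectors, which is easy to misread as kilobytes. The sketch below does the conversion and lists the related commands; /dev/sda and the value 2048 are illustrative assumptions, and sectors_to_kb is a hypothetical helper.

```shell
# sectors_to_kb SECTORS: convert a read-ahead setting in 512-byte sectors to KB.
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

sectors_to_kb 256     # prints 128
sectors_to_kb 2048    # prints 1024

# Viewing and setting read-ahead (as root):
#   hdparm -a /dev/sda            # show the current read-ahead
#   hdparm -a 2048 /dev/sda       # set it to 2048 sectors
#   blockdev --getra /dev/sda     # the same value via blockdev
```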
I/O Scheduler Tuning
- Increase nr_requests to 1024 (improves on most I/O workloads)
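On 2.6 kernels the queue depth is a per-device sysfs file. A minimal sketch, assuming the device is sda; set_nr_requests is a hypothetical helper, and its optional third argument exists only so the function can be tried against a scratch directory instead of the real /sys.

```shell
# set_nr_requests DEVICE DEPTH [SYSFS_ROOT]: set the request queue depth.
set_nr_requests() {
    q="${3:-/sys}/block/$1/queue/nr_requests"
    echo "$1: nr_requests $(cat "$q") -> $2"
    echo "$2" > "$q"
}

# As root, for a placeholder device sda:
#   set_nr_requests sda 1024
```

The setting is per device and is lost at reboot, so it needs to be reapplied from a boot script.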
NFS Tuning
- Increase the number of NFS daemons on large NFS servers
- larger MTU (9000 bytes on gigabit Ethernet)
- Use NFS over TCP and not UDP on Linux
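These three tips can be sketched as follows. The thread count, server name, export path, and interface are illustrative assumptions, and nfsd_threads is a hypothetical helper that assumes the kernel reports a "th" line in /proc/net/rpc/nfsd.

```shell
# nfsd_threads [FILE]: print the number of nfsd threads from the "th" line.
nfsd_threads() {
    awk '$1 == "th" { print $2 }' "${1:-/proc/net/rpc/nfsd}"
}

# Server (as root): raise the thread count to an illustrative 32
#   rpc.nfsd 32
#   nfsd_threads
# Client (as root): mount over TCP with larger transfer sizes
#   mount -t nfs -o tcp,rsize=32768,wsize=32768 nfsserver:/export /mnt/export
# Both ends, and every switch in between, need the larger MTU:
#   ifconfig eth0 mtu 9000
```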
Network
Gigabit Ethernet NIC
- Large (9000 bytes) MTU
- Full duplex
- TSO - TCP Segmentation Offload
- Not turned "ON" in SLES 9
- interrupt throttling rate
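A rough way to see why the larger MTU helps is to count the frames needed for the same amount of data. The sketch below assumes 40 bytes of IPv4 and TCP headers per frame with no options; frames_needed is a hypothetical helper, eth0 is a placeholder, and the ethtool lines depend on driver support.

```shell
# frames_needed BYTES MTU: TCP segments needed, assuming 40 bytes of
# IPv4 + TCP headers per frame and no header options.
frames_needed() {
    payload=$(( $2 - 40 ))
    echo $(( ($1 + payload - 1) / payload ))   # round up
}

frames_needed 1000000 1500    # prints 685
frames_needed 1000000 9000    # prints 112

# Applying the settings (as root):
#   ifconfig eth0 mtu 9000
#   ethtool eth0                      # check speed and duplex
#   ethtool -K eth0 tso on            # enable TCP segmentation offload
#   ethtool -k eth0                   # show offload settings
#   ethtool -C eth0 rx-usecs 100      # interrupt coalescing; value is illustrative
```

Fewer frames means fewer interrupts and less per-packet work in the stack for the same payload.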
SSL performance - use an xcrypto card
Linux System Tuning
TCP/IP Tuning
- Lots of info available on web
Database
- Use Asynchronous I/O for database page cleaners
- Raw devices (raw I/O) provide performance superior to filesystems
- Using disk controllers that provide write caching can provide significant performance improvements, particularly for database logs in an OLTP environment.
- Be sure to consult Linux sysctl tuning as per database vendor recommendations
- The deadline I/O scheduler has proven to be best for both TPC-C and TPC-H workloads
Java
- Can use either the 32-bit or the 64-bit IBM JVM 1.4.2
- The JVM can exploit large page support provided in the 2.6 kernel
- Enable large page support using -Xlp for the Java heap
- Can improve performance by 6-15%
- Increase the available virtual memory
- Set /proc/<pid>/mapped_base to 0x10000000 (default is 0x40000000)
- Adds approximately three more 256MB segments to the JVM - allows 3.2 GB heap
- Use 32-bit JVM for smaller systems (1-way to 8-way)
- 32-bit JVM can give 10% boost in workloads like SPECjbb
- Consider using 64-bit JVM for larger systems (over 8-way systems)
- For 16-way and greater, the 32-bit JVM has scaling limits which will offset the 10% speed boost
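The mapped_base and large page figures above can be checked with a little arithmetic. The sketch assumes the 16MB large page size of Power systems; the 2048MB heap, the MyApp class name, and the helper large_pages_for are illustrative, and the page count is a lower bound because it covers the heap only.

```shell
# Extra address space gained by lowering mapped_base, in 256MB segments.
seg=$(( 256 * 1024 * 1024 ))
extra=$(( (0x40000000 - 0x10000000) / seg ))
echo "extra 256MB segments: $extra"     # prints extra 256MB segments: 3

# large_pages_for HEAP_MB: 16MB large pages needed to back the heap (rounded up).
large_pages_for() {
    echo $(( ($1 + 15) / 16 ))
}
large_pages_for 2048                     # prints 128

# Illustrative use (the /proc write requires root):
#   echo 128 > /proc/sys/vm/nr_hugepages
#   java -Xlp -Xmx2048m MyApp
```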
Web Servers
- TUX and Zeus perform better than Apache in general
- Apache is the most widely used web server because of its rich feature set
- Benchmarks: SPECweb99, SPECwebSSL, SPECweb2005
- Performance tips
- Tuning of the web servers
- Need large memory for file cache and sendfile()
- Need a lot of file handles, e.g., 300,000
- Tuning of the network stack
- Sizes of RxBuffer and TxBuffer
- Tune the number of buffers in the hot list
- Make the file depth shallow for better directory caching
- Take advantage of the network card capability
- TSO (TCP Segmentation Offload)
- Rx and Tx interrupt delay
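The file handle tip can be sketched as follows. /proc/sys/fs/file-nr holds three fields (allocated, allocated but unused, maximum); file_handles is a hypothetical helper, and 300000 simply follows the figure in the list above.

```shell
# file_handles [FILE]: report allocated and maximum file handles.
file_handles() {
    awk '{ printf "allocated=%s max=%s\n", $1, $3 }' "${1:-/proc/sys/fs/file-nr}"
}

# Checking and raising the limits (as root):
#   file_handles
#   sysctl -w fs.file-max=300000
#   ulimit -n 300000     # per-process limit for the shell that starts the server
```

The system-wide limit and the per-process limit are separate, so a web server can still run out of descriptors if only one of them is raised.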