Detailed content of the training
General
- Plan of action for performance problems:
measure, understand and improve.
- Potential bottlenecks and rules of thumb for measuring.
- The /proc and /sys file systems.
- Reading statistical counters from the kernel. Categories of counters.
Misleading counters.
- Monitoring and modifying kernel settings.
- Process management:
Programs and processes.
Creation and transformation of processes. Zombie processes.
Life cycle of a process. Role of the kernel in a Linux system.
Sleep and wakeup mechanism. Processes and threads.
Analysis of blocked threads. Interrupt handling and stolen time.
- Overview of commonly used measurement tools.
General use of the measuring tools atop and atopsar.
- Control groups (cgroups) in general: limiting the use of hardware components.
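
As an illustration of reading statistical counters from the kernel, the sketch below derives a CPU busy fraction from two samples of the aggregate "cpu" line in /proc/stat. The counters are cumulative, so only the difference between two samples is meaningful. The function names are hypothetical, and the sketch assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal); it is not how atop itself is implemented.

```python
import time


def parse_cpu_ticks(stat_text):
    """Return the first eight cumulative counters of the aggregate 'cpu' line.

    Guest time is left out on purpose: the kernel already includes it in
    the user counter, so adding it again would count it twice.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def busy_fraction(before, after):
    """Fraction of elapsed CPU time that was neither idle nor iowait."""
    deltas = [later - earlier for earlier, later in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait (assumed field order)
    return (total - idle) / total


def measure(interval=1.0, path="/proc/stat"):
    """Sample the real counters twice (Linux only) and return the busy fraction."""
    with open(path) as stat_file:
        before = parse_cpu_ticks(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        after = parse_cpu_ticks(stat_file.read())
    return busy_fraction(before, after)
```

Calling measure() on a Linux system returns a value between 0 and 1 for the whole machine over the chosen interval.
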
Processor
- Hardware overview:
Multi socket, multi core, hyperthreading, NUMA.
Instruction execution, variable clock speed (frequency scaling,
cache sharing, etcetera), instructions per clock cycle.
- Thread scheduling by the kernel:
Time sharing and real time processes. Thread states.
Thread scheduling policies, influence of nice value.
Implementation of the CFS scheduler (classes deadline, realtime and fair).
Task groups and autogroup. Scheduling domains.
- Measuring with atop, atopsar, ltrace, strace, perf and many other tools.
- Improving with nice, chrt, taskset and numactl (NUMA) commands,
and with the cgroups controllers 'cpu' and 'cpuset'.
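
To give a feel for the influence of the nice value under the fair scheduling class, the sketch below estimates how CPU time is divided among competing threads. It rests on stated assumptions: all threads are permanently runnable, share one CPU and one task group, and each nice step changes the weight by a factor of roughly 1.25 relative to a weight of 1024 at nice 0. The kernel uses a precomputed weight table rather than this formula, and the function names are hypothetical.

```python
def approx_weight(nice):
    """Approximate scheduling weight for a nice value (1024 at nice 0)."""
    if not -20 <= nice <= 19:
        raise ValueError("nice value must be between -20 and 19")
    return 1024 / (1.25 ** nice)


def cpu_shares(nice_values):
    """Estimated CPU share of each thread, proportional to its weight."""
    weights = [approx_weight(nice) for nice in nice_values]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # One thread at nice 0 competing with one thread at nice 5.
    for nice, share in zip([0, 5], cpu_shares([0, 5])):
        print(f"nice {nice:3d}: about {share:.0%} of the CPU")
```

Under these assumptions a thread at nice 0 receives roughly three quarters of the CPU when it competes with a single thread at nice 5.
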
Memory
- Hardware overview:
Memory Management Unit (MMU),
virtual versus physical addressing, Translation Lookaside Buffer (TLB),
page directories and page tables.
- Memory management by the kernel:
Demand paging, process sizes (virtual, resident and proportional), memory layout,
swap space, memory leakage, overcommit, OOM killing, huge pages,
samepage merging, nodes and zones, swapout mechanism, NUMA memory policies.
Influence of kernel parameters.
- Measuring with atop, atopsar and many other tools.
- NUMA improvement with migratepages and numactl commands.
Limiting memory with the help of the cgroups controller 'memory'.
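
The three process sizes mentioned above (virtual, resident and proportional) can be read from the /proc file system. The sketch below parses "Name: value kB" lines as they appear in /proc/<pid>/status (VmSize, VmRSS) and /proc/<pid>/smaps_rollup (Pss). The helper name and the sample values are made up for illustration, and smaps_rollup is assumed to be available on the running kernel.

```python
def parse_kb_fields(text):
    """Collect every 'Name:   <number> kB' line into a dict of kilobyte values."""
    sizes = {}
    for line in text.splitlines():
        name, separator, rest = line.partition(":")
        parts = rest.split()
        if separator and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            sizes[name.strip()] = int(parts[0])
    return sizes


def process_sizes(pid="self"):
    """Virtual, resident and proportional size in kB of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        status = parse_kb_fields(status_file.read())
    with open(f"/proc/{pid}/smaps_rollup") as rollup_file:
        rollup = parse_kb_fields(rollup_file.read())
    return {
        "virtual": status.get("VmSize"),
        "resident": status.get("VmRSS"),
        "proportional": rollup.get("Pss"),
    }
```

The proportional size is normally the smallest of the three, because every shared page is divided over the processes that map it.
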
Disk storage
- Hardware overview:
Characteristics of HDD (e.g. zones) and SSD
(wear leveling, page allocation, trim), thin provisioning,
disk controllers.
- Disk handling by the kernel:
Queueing algorithms and I/O scheduling (CFQ, etc.), block layer,
RAID techniques, Logical Volume Management (LVM),
characteristics of ext4 and XFS filesystems, filesystem allocation strategy,
pathname caching, behavior of page cache and cache flushing. Journaling.
Influence of kernel parameters.
- Measuring with atop, atopsar, countcat and many other tools.
- Improving with ionice, defragmentation tools, fstrim (SSD) and
with the help of the cgroups controller 'blkio'.
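
As a sketch of how disk measurements can be derived from kernel counters, the code below computes the busy percentage and read volume of a disk from two samples of /proc/diskstats. It assumes the common field layout in which, after the device name, the third counter is the number of sectors read, the seventh the number of sectors written and the tenth the number of milliseconds spent doing I/O, with sectors counted in units of 512 bytes. The function names and the sample lines are hypothetical.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name to a few cumulative counters from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip lines that lack the assumed layout
        stats[fields[2]] = {
            "sectors_read": int(fields[5]),
            "sectors_written": int(fields[9]),
            "ms_busy": int(fields[12]),
        }
    return stats


def busy_percent(before, after, interval_seconds):
    """Percentage of the interval during which the disk had I/O in flight."""
    busy_ms = after["ms_busy"] - before["ms_busy"]
    return 100.0 * busy_ms / (interval_seconds * 1000.0)


def bytes_read(before, after):
    """Number of bytes read between the two samples."""
    return (after["sectors_read"] - before["sectors_read"]) * SECTOR_BYTES


if __name__ == "__main__":
    first = parse_diskstats("8 0 sda 1000 10 80000 500 2000 20 160000 900 0 1200 1400")
    second = parse_diskstats("8 0 sda 1100 10 88000 550 2100 20 176000 950 0 1700 1500")
    print(busy_percent(first["sda"], second["sda"], 1.0), "% busy")
    print(bytes_read(first["sda"], second["sda"]), "bytes read")
```

A high busy percentage alone does not prove saturation on devices that serve many requests in parallel, such as SSDs, which is one reason such counters can be misleading.
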
Networking
- Hardware overview:
Interface controllers, switches and routers.
- Networking by the kernel:
TCP/IP protocol and Bandwidth Delay Product (BDP), network delays.
- Measuring with atop, atopsar, netstat, traceroute, iperf, attract, iftop, SNMP and many other tools.
Use of the kernel module netatop.
- Improving by partitioning network bandwidth with the help of the
cgroups controller 'net_cls'.
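
The Bandwidth Delay Product mentioned above is the amount of data that must be in flight to keep a connection fully used: bandwidth multiplied by round-trip time. The sketch below works this out, together with the reverse calculation of the throughput ceiling imposed by a fixed window size. The function names and the example figures (1 Gbit/s, 20 ms, a 65535-byte window) are illustrative assumptions.

```python
def bdp_bytes(bandwidth_bits_per_second, rtt_seconds):
    """Bandwidth Delay Product in bytes: data in flight needed to fill the link."""
    return bandwidth_bits_per_second * rtt_seconds / 8


def max_throughput_bits_per_second(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    link = 1_000_000_000  # assumed link speed: 1 Gbit/s
    rtt = 0.020           # assumed round-trip time: 20 ms
    print(f"BDP: {bdp_bytes(link, rtt):,.0f} bytes")
    limit = max_throughput_bits_per_second(65535, rtt)
    print(f"65535-byte window limit: {limit / 1e6:.1f} Mbit/s")
```

With these figures the window would have to be about 2.5 MB to fill the link, while a 65535-byte window caps throughput at roughly 26 Mbit/s regardless of the link speed.
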