After this documentation was released in July 2003, I was approached by Prentice Hall and asked to write a book on the Linux VM under the Bruce Perens' Open Book Series.

The book is available and is called simply "Understanding The Linux Virtual Memory Manager". There is a lot of additional material in the book that is not available here, including details on later 2.4 kernels, introductions to 2.6, a whole new chapter on the shared memory filesystem, coverage of TLB management, a lot more code commentary, countless other additions and clarifications, and a CD with lots of cool stuff on it. This material (although now dated and lacking in comparison to the book) will remain available, although I obviously encourage you to buy the book from your favourite book store :-) . As the book is under the Bruce Perens' Open Book Series, it will be available for download 90 days after appearing on the book shelves, which means the download is not available right now. When it is available, it will be downloadable from http://www.phptr.com/perens so check there for more information.

To be fully clear, this webpage is not the actual book.

Abstract

Linux is developed with a practical emphasis rather than a theoretical one. When new algorithms are suggested or existing implementations questioned, it is common to request code to match the argument. Many of the algorithms used in the Virtual Memory (VM) system were designed by theorists, but the implementations have now diverged considerably from the theory. In part, Linux does follow the traditional development cycle of design to implementation, but it is more common for changes to be made in reaction to how the system behaved in the "real world" and on the basis of intuitive decisions by developers. These intuitive changes can be a hindrance as they are rarely backed by controlled, repeatable experiments. Consequently, some design choices have been made without a strong foundation.

This has led to a situation where the VM is poorly documented except for a small number of web sites with incomplete coverage. The existing books on Linux are comprehensive, but they try to cover the entire kernel and sometimes leave out the details of the VM. As a result, the VM is fully understood by only a small number of core developers. Developers looking for information on how it functions are generally told to read the source, and little or no information is available on the theoretical basis for the implementation. This requires that even a casual observer invest a large amount of time to read the code and study the field of Memory Management. The problem is further compounded by the fact that the code comments, if they exist at all, only indicate what is happening in a very small piece of the code. This makes it difficult to see how the overall system functions and is roughly analogous to using a microscope to identify a piece of furniture.

As Linux gains in popularity, in the business as well as the academic world, more developers are expressing an interest in developing Linux to suit their needs. The lack of detailed documentation is a significant barrier to entry for a new developer or researcher who wishes to study the VM.

The objective of this thesis is to document fully how the 2.4.20 VM works, including its structure, the algorithms used, the implementations thereof and the Linux-specific features. Combined with the companion document "Code Commentary on the Linux Virtual Memory Manager", the documents act as a detailed tour of the code, explaining almost line by line how the VM operates and, where applicable, explaining the theoretical basis for the implementation. This thesis also describes how to approach reading through the kernel source, including tools aimed at making the code easier to read, browse and understand.

It is envisioned that this will drastically reduce the amount of time a developer or researcher needs to invest to understand what is happening inside the Linux VM. This applies even if the VM of interest is a later version, as the time needed to understand changes and extensions is considerably less than the time required to learn the fundamentals of the Linux VM.


Mel 2004-02-15