So I was reading about paging in Operating Systems. One of the biggest advantages of paging as a memory-management scheme (that I've come across) is that it solves the external fragmentation problem (in both main memory and backing storage) and allows a process's memory to be allocated non-contiguously. However, to implement paging we need to maintain and search through a page table, which can have a very large number of entries (millions in some cases), and I imagine there is a big overhead in doing so (both time- and space-wise).

The thing that I don't understand is: why can't we just divide a program into an arbitrary number of segments each time we load it into main memory? We could divide it in such a way that each segment "fills a hole" in main memory if needed, and thereby solve the external fragmentation problem. Obviously the program could still be loaded non-contiguously, and we would only need to store two addresses per segment (upper and lower bound), plus perhaps a small table of segments to keep track of their order.

To quote the book I'm reading (Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne, 9th edition): "Because of its advantages over earlier methods, paging in its various forms is used in most operating systems, from those for mainframes through those for smartphones".

Am I missing something here? How does paging justify its overhead? Do we really need to keep track of each page? Are there other considerations that go into choosing a memory-management method?


Best Answer


Paging is the basis for various "virtual memory tricks", like:

  • instead of having to wait while a file is loaded before it can be used, mark the pages as "belonging to a memory-mapped file" and then pretend that the file was loaded, fetching the pages from disk only when they're actually needed (possibly with some pre-fetching strategy happening in the background; see the sketch after this list). This reduces RAM consumption and can speed up things like executable start-up times.

  • instead of allocating actual physical RAM during things like process start-up (for its ".bss section"/uninitialized data, "not used yet" heap space, etc), and instead of literally copying all pages during fork(); use "copy on write" tricks so that you only need to create/copy the data when it's actually modified. This reduces RAM consumption and speeds everything up (especially when a lot of the memory is never modified).

  • if there isn't enough RAM, then you can just send pages to swap space (e.g. disk) and keep everything running. This is much faster than waiting for something to happen that will never happen (because the process crashed due to not having enough memory).

  • all of the things above that reduce the amount of actual RAM being used mean that you can use that RAM for file system caches instead. Significantly larger file system caches can mean significantly faster file IO (less "file cache miss", less slow disk IO). Of course if data is in file cache you can map the same pages as "copy on write" into multiple different processes too (without copying the data or using more RAM).
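
As a concrete illustration of the first bullet, here is a minimal sketch of the memory-mapped-file trick using POSIX mmap(). The file name is hypothetical and error handling is abbreviated; the point is that mmap() returns immediately, and pages are read from disk only when first touched:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("big_input.dat", O_RDONLY);   /* hypothetical file */
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0)
            return 1;

        /* Returns at once; nothing has been read from disk yet. */
        const char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
            return 1;

        /* Touching a byte triggers a page fault; the kernel then reads
         * just that page (plus any read-ahead it chooses to do). */
        printf("first byte: %d\n", p[0]);

        munmap((void *)p, st.st_size);
        close(fd);
        return 0;
    }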

For overhead: CPU manufacturers put a lot of effort into minimising it (translation look-aside buffers, pre-fetching strategies, out-of-order execution to keep the CPU busy while it waits for a translation to happen, etc.). Operating-system developers do the same (e.g. reducing the number of task switches between processes, trying to keep a process on the same core, etc.). The result is that the overhead is fairly small compared to the many large benefits.

Of course without paging you end up having to deal with external fragmentation, which typically devolves into wasting lots of CPU time copying large amounts of data from one piece of RAM to another to "de-fragment" it. On top of that, you'd need something else to ensure that different processes are isolated and that any permissions (like "this area is not executable") can be enforced; and this "something else" will probably add as much overhead as paging all by itself (unless you want a massive security disaster). With this in mind, even if you ignore the benefits of paging, paging is still likely to have less overhead than not using paging.

The thing that I don't understand is: why can't we just divide a program into an arbitrary number of segments each time we load it into main memory?

Dealing with different segment sizes would be painful: less "shift and mask to find the right index in a table" and more "do a linear search through all segments to find the right one". If you use fixed-size segments it is much faster; but that is exactly what paging is (a "page" is your "fixed-size segment"), as the sketch below illustrates.
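
For concreteness, here is a minimal sketch of that difference in C. The structures, table sizes, and function names are all made up for illustration; no real OS lays out its translation tables exactly like this:

    #include <stdint.h>
    #include <stddef.h>

    #define PAGE_SHIFT 12                       /* 4K pages */
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

    uint64_t page_table[1 << 20];               /* frame base per page */

    struct segment { uint64_t lo, hi, base; };  /* variable-size segments */
    struct segment segs[64];
    size_t nsegs;

    /* Paging: the page number is just the high bits of the address,
     * so translation is one shift, one mask, and one table lookup. */
    uint64_t translate_paged(uint64_t va)
    {
        return page_table[va >> PAGE_SHIFT] + (va & PAGE_MASK);
    }

    /* Segments: nothing in the address itself says which segment it
     * belongs to, so we must search (linearly here; a tree helps, but
     * still loses to one shift and one mask). */
    int translate_segmented(uint64_t va, uint64_t *pa)
    {
        for (size_t i = 0; i < nsegs; i++) {
            if (va >= segs[i].lo && va < segs[i].hi) {
                *pa = segs[i].base + (va - segs[i].lo);
                return 0;
            }
        }
        return -1; /* no matching segment: fault */
    }

With fixed-size pages the "which page?" question is answered by the address bits themselves; with variable-size segments it has to be answered by a search.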

  1. Dividing a program into arbitrary segments:

They can't quite be arbitrary; for example I might like to make a very large vector for some application:

    #define N 10000000
    int v[N];
    ...
    for (i = 0; i < N; i++) {
        v[i] = ...;
    }

The compiler really wants v to appear to occupy successive memory locations, so your segmenter would need to be aware of objects like this. But it gets worse:

    int *v;
    v = malloc(sizeof(*v) * N);
    for (i = 0; i < N; i++) {
        v[i] = ...;
    }
Now you need to find a large chunk of physically contiguous RAM at runtime, and since you have no relocation mechanism, you have no way to move previously allocated chunks around. This is the fragmentation problem, and without a page-style MMU it is very difficult to solve.

  2. Justifying overhead:

You can solve it by turning your compiled languages into pseudo-interpreted languages; but which has more overhead? Compiling:

    a = v[i];

into:

    ld R0, R1+R2*4    ; + overhead of the MMU

or into:

    mov R0, R1+R2*4
    call findseg
    ld R0, R0

In the general case the overhead, in terms of RAM, is on the order of 0.1%; as a concrete example, about 10 bytes of page-table entry per 4K page on an ARM64 or AMD64 architecture (10/4096 ≈ 0.25%).

On my Linux system, libc.so is about 2M of text + 40K of data, and most programs use only a very small amount of it. Thanks to paging, only the bits of libc that are actually used need to occupy memory, and those pages are shared by every process that maps them. On a 64G system with 32 processes, the libc savings alone swamp the page-table overhead.

  3. Keeping track:

There are a number of avenues from which to attack this. One is multiple page sizes, which most architectures support. Another is that the OS doesn't have to expose the granularity of the MMU: for instance, on a system with 4K pages it could insist on handing out memory only in 64K chunks, reducing the management overhead by a factor of 16 while modestly increasing the granularity wastage (internal fragmentation). A sketch of that idea follows.
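
A minimal sketch of that chunking idea, assuming 4K hardware pages; the 64K unit and all the names are illustrative, not from any real kernel:

    #include <stdint.h>

    #define PAGE_SIZE  4096u              /* what the MMU maps      */
    #define CHUNK_SIZE (16u * PAGE_SIZE)  /* what the OS hands out  */

    /* Round a request up to a whole number of 64K chunks; the OS then
     * tracks one allocation unit instead of 16 individual pages. */
    static inline uint64_t round_up_chunk(uint64_t bytes)
    {
        return (bytes + CHUNK_SIZE - 1) & ~(uint64_t)(CHUNK_SIZE - 1);
    }

    /* e.g. round_up_chunk(5000)  == 65536   (one chunk, ~59K wasted)
     *      round_up_chunk(70000) == 131072  (two chunks)            */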