Thursday, October 6, 2016

Identifying memory pressure - Part 1

EVERYTHING THAT FOLLOWS HAS BEEN TESTED ON RED HAT LINUX V 6. MOST OF IT SHOULD BE APPLICABLE TO OTHER UNIX OR LINUX SYSTEMS.
THINGS CAN CHANGE DEPENDING ON MANY FACTORS (NUMA, Huge Pages, etc), BUT WHAT FOLLOWS SHOULD BE EASILY ADAPTABLE.

In the beginning, my idea was to check the amount of physical memory used by Oracle processes in order to do some forecasting and capacity planning.
Unfortunately, I've seen that it is very difficult to calculate the exact amount of memory used by Oracle, for several reasons:

1. Virtual Memory Management (VMM). Unix and other systems place a management layer between the physical memory and the applications. This layer presents each process with sequential memory addresses that are mapped to physical ones. The VMM lets processes map memory that is not actually allocated.

2. Resident Set. The VMM takes advantage of the locality of reference that processes exhibit. That is, in a given time interval, the instructions executed by a process tend to be close together in memory. The same happens with data. So, at any moment, a process needs access to only a small subset of the memory allocated to it: the memory that contains the currently executing code and the data being accessed by that code. This set of memory (pages) is known as the process's working set or resident set. The size of this Resident Set can (and actually does!) change over time.

3. Overcommitting of memory. One of the principles of virtual memory systems is that they allow the available physical memory to be oversubscribed. That is, they allow processes to allocate more memory than actually exists. Point 2 illustrates how.

4. Processes can have memory addresses reserved in the VMM that are not actually allocated in physical memory until the process needs them.
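
The gap between what a process has mapped and what it actually holds in physical memory can be seen per process on Linux. The following is a minimal sketch, not a definitive tool: it assumes the `/proc/<pid>/status` file, whose `VmSize` line reports the virtual size and whose `VmRSS` line reports the resident set. The function names are my own, chosen for illustration.

```python
import os


def parse_status_kb(text):
    """Return the fields of a /proc/<pid>/status dump that are reported in kB."""
    sizes = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Memory lines look like "VmRSS:      1234 kB"; skip everything else.
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            sizes[name.strip()] = int(parts[0])
    return sizes


def read_status_kb(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status_kb(handle.read())


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    sizes = read_status_kb()
    print("VmSize (virtual, mapped):", sizes.get("VmSize"), "kB")
    print("VmRSS (resident set):    ", sizes.get("VmRSS"), "kB")
```

For most processes VmSize is much larger than VmRSS, which is points 1 to 4 in practice. Passing the PID of an Oracle process instead of `"self"` works the same way, given enough privileges to read its `/proc` entry.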

Of course, other factors make calculating memory usage even more difficult (kernel memory, L1 and L2 caches, the SLAB allocator, etc). So, I decided to look for a practical approach to:

1. Roughly estimate the memory used by an Oracle process, to be able to provide a forecast if needed.
2. Find ways to monitor memory, to be able to notice when there is a memory shortage.




HOW DOES VIRTUAL MEMORY WORK?

First, I will take a quick look at how the memory system works. A more detailed explanation (and a wonderful summary!) is given in note 558237.1 of Oracle Support. The next lines are a short summary of that note, enough to understand how it works.

Explaining all the details of how memory works is out of the scope of this article, so here are the essentials.
Virtual memory systems present every process with a linear, contiguous address space. This makes life much easier for developers, support personnel, etc., while at the same time enabling efficient management of the available physical memory. However, the physical memory pages that back some or all of those virtual memory pages are not located in any kind of linear or contiguous arrangement within the system's physical memory space. The physical pages can be dotted all around the physical address space in any order.

The VMM system maintains a mapping between virtual memory pages (e.g. those that make up a process's address space) and the physical memory pages that they currently occupy. One of the most important structures used for this is called the page table.
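
As a toy illustration of that mapping, the sketch below translates a virtual address to a physical one through a dictionary that stands in for the page table. It is a simplification under stated assumptions: real page tables are multi-level structures walked by hardware, the 4096-byte page size is only a common x86 default, and the frame numbers are made up.

```python
PAGE_SIZE = 4096  # bytes; assumed here, a common page size on x86 Linux

# Toy page table: virtual page number -> physical frame number.
# The frames are deliberately out of order: contiguous virtual pages
# do not need contiguous physical pages.
page_table = {0: 7, 1: 2, 2: 9}


def translate(virtual_address):
    """Translate a virtual address to a physical one using the toy page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # No physical page backs this virtual page right now.
        raise LookupError(f"page fault on virtual page {page}")
    return page_table[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    for address in (0, PAGE_SIZE + 10, 2 * PAGE_SIZE):
        print(hex(address), "->", hex(translate(address)))
```

The offset inside the page is preserved; only the page number is replaced by the frame number.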

The funny thing is that, at any moment in time, most of the virtual pages of the process address space do not occupy a page of physical memory. Only the pages in the working set are typically mapped to actual physical memory. The rest of the pages in the process's address space exist only on disk (in swap or in the process binary file).

When an instruction located on a page in the working set is executed and references an address within the process's address space that is not in the current working set (Resident Set), and so is not currently mapped to a page of physical memory, the memory management hardware detects this situation and raises a special hardware interrupt (a page fault). Then, the VMM examines the page table plus other information and:

1. Locates the contents of the required page on disk (binary image file or swap area).
2. Allocates a free page of physical memory for the page.
3. Reads the page off disk into the newly allocated physical memory page.
4. Updates its control information and the page table entry for the page to show that the page is now mapped to that specific real page.
5. Instructs the CPU to restart execution of the instruction that caused the page fault.

This is known as a Page In.

Of course, physical memory is limited, and it is possible that no free page of physical memory is available to allocate. In that case, the VMM:

1. Using various algorithms, determines a page of physical memory that can be reclaimed with likely minimal impact on the process that owns it (e.g. a page that has not been touched for a long time).
2. If the page has not been modified since it was last paged in, goes to step 3. Otherwise, allocates a page of disk space in the swap area and writes the contents of the page to it.
3. Marks the page in the current owning process's page table as not mapped and records the disk address of the swap page holding its contents.
4. Allocates the memory page to the new process and updates its page table entry accordingly.

This is a page out operation.
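
The page in and page out steps above can be sketched as a small simulation. This is a toy model, not how any real kernel is implemented: the `ToyVMM` class and its counters are hypothetical, and plain least-recently-used (LRU) eviction is assumed as a stand-in for the "various algorithms" that real systems use.

```python
from collections import OrderedDict


class ToyVMM:
    """Toy model of page in / page out with LRU eviction (an assumption)."""

    def __init__(self, frames):
        self.frames = frames           # number of physical pages available
        self.resident = OrderedDict()  # page -> dirty flag, least recent first
        self.swap = set()              # pages whose contents were written to swap
        self.page_ins = 0
        self.page_outs = 0
        self.swap_writes = 0

    def access(self, page, write=False):
        if page in self.resident:
            # Page is mapped: no fault, just record that it was touched.
            self.resident.move_to_end(page)
        else:
            # Page fault: free a physical page if none is left, then page in.
            if len(self.resident) >= self.frames:
                self._page_out()
            self.resident[page] = False  # freshly paged in, so not modified
            self.page_ins += 1
        if write:
            self.resident[page] = True   # modified since it was paged in

    def _page_out(self):
        # Reclaim the page that has gone untouched the longest.
        victim, dirty = self.resident.popitem(last=False)
        if dirty:
            # Only modified pages need to be written to the swap area.
            self.swap.add(victim)
            self.swap_writes += 1
        self.page_outs += 1


if __name__ == "__main__":
    vmm = ToyVMM(frames=2)
    vmm.access("A", write=True)
    vmm.access("B")
    vmm.access("C")  # no free page: "A" is paged out and written to swap
    print(vmm.page_ins, vmm.page_outs, vmm.swap_writes)
```

Note that a clean page is paged out without any disk write, which is why the number of page outs and the number of swap writes can differ.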

This is a very short summary of the way virtual memory works. The referenced note gives a longer explanation of the subject (still a summary, actually!).

There are two more concepts that we need to keep in mind when dealing with Oracle: Shared Libraries and Shared Memory.

Shared Libraries are libraries that can be linked to any program at run time. Their code can be loaded anywhere in memory and, once loaded, can be used by any number of programs.

This way, both the size of the programs that use a shared library and their memory footprint can be kept low, since a lot of code is kept in common in the form of a shared library.

Shared memory is memory that may be accessed simultaneously by multiple programs, either to provide communication among them or to avoid redundant copies.

The key point to remember here is that both of them are shared, that is, they are loaded in memory once and used by different processes.
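
This matters when adding up memory per process: summing the resident size of every process counts each shared page once per process that maps it. One hedged way to see the difference on Linux is to compare the `Rss` and `Pss` lines of `/proc/<pid>/smaps`. `Pss` (proportional set size) is not covered in this article; it divides each shared page by the number of processes mapping it. The sketch below assumes a kernel that exposes `Pss` there, and the function name is my own.

```python
import os


def sum_smaps_field(text, field):
    """Add up one kB field (e.g. "Rss" or "Pss") over all mappings in a smaps dump."""
    total = 0
    prefix = field + ":"
    for line in text.splitlines():
        # Field lines look like "Pss:                  20 kB".
        if line.startswith(prefix):
            total += int(line.split()[1])
    return total


if __name__ == "__main__" and os.path.exists("/proc/self/smaps"):
    with open("/proc/self/smaps") as handle:
        smaps = handle.read()
    print("Rss total:", sum_smaps_field(smaps, "Rss"), "kB")
    print("Pss total:", sum_smaps_field(smaps, "Pss"), "kB")
```

For a process that maps a lot of shared libraries or shared memory, the Pss total is noticeably lower than the Rss total, and adding Pss across processes avoids the double counting.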

A very good source of information about memory is Gorman's book, available on kernel.org. It covers the details of memory management in Linux in depth:

https://www.kernel.org/doc/gorman/
