Tuesday, October 11, 2016

Identifying memory pressure - Part 4

Continued from part 3...

MEMORY AVAILABLE
At this point, I want to introduce a new parameter: MemAvailable.
This parameter has been included in /proc/meminfo since kernel version 3.14. In Red Hat, it has been backported to earlier versions, starting from kernel-2.6.32-504.el6: you just need to set the sysctl parameter vm.meminfo_legacy_layout to 0 and then look for the new MemAvailable parameter in /proc/meminfo.
For previous versions, it can be calculated:
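# Sum the "low" watermark (in pages) over all zones, then estimate MemAvailable in kB.
# 12*low converts the watermark to kB (assuming 4 kB pages) and subtracts it three times.
awk -v low="$(grep low /proc/zoneinfo | awk '{k+=$2}END{print k}')" \
 '{a[$1]=$2}
  END{
   print a["MemFree:"]+a["Active(file):"]+a["Inactive(file):"]+a["SReclaimable:"]-(12*low);
  }' /proc/meminfo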

This calculation is based on the kernel's fs/proc/meminfo.c code (https://lkml.org/lkml/2013/11/5/450)
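The two cases above (kernel-provided value versus manual estimate) can be combined in a small helper. This is only a sketch: the function name mem_available_kb and its optional file arguments are my own, added so it can be tried against copies of /proc/meminfo and /proc/zoneinfo, and the fallback keeps the same 4 kB page assumption as the one-liner above.

```shell
# Print an estimate of available memory in kB.
# $1: meminfo file (default /proc/meminfo), $2: zoneinfo file (default /proc/zoneinfo).
# Hypothetical helper name; not part of any standard tool.
mem_available_kb() {
    meminfo=${1:-/proc/meminfo}
    zoneinfo=${2:-/proc/zoneinfo}

    if grep -q '^MemAvailable:' "$meminfo"; then
        # Prefer the value computed by the kernel when the field exists.
        awk '$1 == "MemAvailable:" {print $2}' "$meminfo"
    else
        # Fallback: same approximation as the one-liner (assumes 4 kB pages).
        low=$(awk '$1 == "low" {k += $2} END {print k + 0}' "$zoneinfo")
        awk -v low="$low" '
            {a[$1] = $2}
            END {
                print a["MemFree:"] + a["Active(file):"] + a["Inactive(file):"] + a["SReclaimable:"] - (12 * low)
            }' "$meminfo"
    fi
}

# Usage on a live system: mem_available_kb
```

Called without arguments it reads the live /proc files, so the same command works on both old and new kernels.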
What I have observed: once the system has already swapped, swap can remain allocated even though it is freeable. Also, under memory pressure the system frees memory, so this value can be relatively large at a given point in time even though the system is under pressure. The value can be considered reliable when the system is not under pressure.
That's another reason to keep an eye on more parameters, as previously said.

LET'S PUT SOME ORDER ON THIS STUFF
OK, so, to check the system and do some forecasting if needed, I will assume a few points.
1. We'd like our processes and SGA to be in memory. We do not want our system to swap.
2. No more than 80% of the system's memory should be for Oracle.
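As a quick illustration of point 2, the 80% ceiling can be derived from MemTotal. The function name oracle_mem_limit_kb and its arguments are hypothetical, and the percentage is a parameter so that another rule of thumb can be tried.

```shell
# Print the largest amount of memory (kB) Oracle should use under the 80% rule.
# $1: meminfo file (default /proc/meminfo), $2: percentage (default 80).
# Hypothetical helper name, for illustration only.
oracle_mem_limit_kb() {
    awk -v pct="${2:-80}" '$1 == "MemTotal:" {printf "%d\n", $2 * pct / 100}' "${1:-/proc/meminfo}"
}

# Usage on a live system: oracle_mem_limit_kb
```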
So, let's go:
1. At a single moment in time, unless there is already a memory issue, we cannot know for certain whether the system's memory will lead to problems, nor estimate how much we should add. To estimate the memory use of a system and check whether it is too tight or not, we must observe its behaviour over a workload period.

2. We should monitor these parameters:
1.       pgpgin/s, pgpgout/s, pgscank/s, majflt/s, %vmeff (from sar -B)
2.       Free, BufFree, %SwapFree, %MemFree, MemAvailable (from the script supplied)
3.       Committed_AS
   Committed_AS: the amount of memory presently allocated on the system. The committed memory is the sum of all the memory that has been allocated by processes, even if it has not been "used" by them yet.
   That means that, under the current workload, this is the amount of memory that the system would need in a worst-case scenario.
   majflt/s: major faults per second, that is, faults that required loading a page from disk (synonymous with I/O overhead).
   pgscank/s: pages scanned per second by kswapd; it represents kswapd activity.
   %vmeff: a metric of the efficiency of page reclaim. If it is near 100%, almost every page coming off the tail of the inactive list is being reaped. If it gets too low (e.g. less than 30%), the virtual memory is having some difficulty. This field is displayed as zero if no pages have been scanned during the interval.
   The other fields are quite self-explanatory.
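To make two of these definitions concrete, here is a rough sketch of the arithmetic behind them. %vmeff is the ratio of pages reclaimed (pgsteal/s) to pages scanned (pgscank/s plus pgscand/s), and Committed_AS is easier to judge as a percentage of RAM plus swap. The function names vmeff_pct and commit_pct are mine, and sar already reports both values, so this is only meant to show where the numbers come from.

```shell
# %vmeff: pages reclaimed per pages scanned, as a percentage.
# $1: pgsteal/s, $2: pgscank/s, $3: pgscand/s (default 0).
# Prints 0.00 when nothing was scanned, as sar does.
vmeff_pct() {
    awk -v steal="$1" -v scank="$2" -v scand="${3:-0}" 'BEGIN {
        scan = scank + scand
        if (scan > 0) printf "%.2f\n", 100 * steal / scan
        else          printf "0.00\n"
    }'
}

# Committed_AS as a percentage of RAM plus swap.
# $1: meminfo file (default /proc/meminfo).
commit_pct() {
    awk '{a[$1] = $2}
         END {printf "%.2f\n", 100 * a["Committed_AS:"] / (a["MemTotal:"] + a["SwapTotal:"])}' "${1:-/proc/meminfo}"
}

# Usage on a live system: commit_pct
```

A commit percentage above 100 means that, in the worst case, the processes could ask for more memory than RAM and swap together can provide.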

3. We must observe whether the behaviour of the system is right:
   pgpgin/s, pgpgout/s: they should not both have very high values. If both are very large, it could mean that the system needs new RAM as fast as it can swap out application data. This often means that the application needing the RAM has forced out all of the truly old data and has started to force data in active use out to swap. That "active use" data will be immediately read back in from swap, causing both "swap in" and "swap out" to be elevated and roughly equal. This is known as "thrashing" and is an undesirable condition.
   pgscank/s, majflt/s: together with %vmeff, these are the best indicators of swap activity. If they show activity continuously, the system has to use swap too often, which can be a symptom of memory pressure.
   Free, BufFree, %SwapFree, %MemFree, MemAvailable: if %SwapFree and %MemFree are very low, that indicates memory pressure.
   If the behaviour of the system shows some evidence of memory pressure, we should then try to lower the memory requirements of the system (take a look at the parameters of the instance: SGA, open cursors, PGA, etc., and reduce them if they are oversized). Check that the size of the swap is right, and consider adding memory if all else fails.
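The checks in this step could be applied to each sample with something like the sketch below. The function name pressure_hint is hypothetical, and so is the 5% limit used for "very low" free memory and swap; only the 30% efficiency limit comes from the description of %vmeff. A single "pressure" result means little on its own: what matters is whether it keeps appearing over the workload period.

```shell
# Print "pressure" when one sample looks like memory pressure, "ok" otherwise.
# $1: pgscank/s  $2: majflt/s  $3: %vmeff  $4: %MemFree  $5: %SwapFree
# Hypothetical helper; the 5% limit is an assumption chosen for illustration.
pressure_hint() {
    awk -v scank="$1" -v majflt="$2" -v vmeff="$3" -v memfree="$4" -v swapfree="$5" 'BEGIN {
        reclaiming  = (scank > 0 && majflt > 0)     # kswapd scanning and pages read back from disk
        inefficient = (scank > 0 && vmeff < 30)     # reclaim finds few pages to free
        exhausted   = (memfree < 5 && swapfree < 5) # both free memory and free swap very low
        if ((reclaiming && inefficient) || exhausted) print "pressure"
        else print "ok"
    }'
}

# Usage with values taken from one sar interval:
#   pressure_hint 500 20 15 3 50
```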

   If the behaviour is right, we could then use what I previously described about estimating the size of the processes, together with the MemAvailable parameter, to do some forecasting about how many processes, instances or both we could add.
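As a rough sketch of that forecast: divide the available memory, minus a safety reserve, by the estimated size of one process. The function name extra_processes and the reserve argument are assumptions of mine, and the per-process size has to come from your own measurements.

```shell
# Print how many more processes of a given size would fit in the available memory.
# $1: MemAvailable in kB, $2: estimated size of one process in kB,
# $3: reserve to keep free in kB (default 0; hypothetical safety margin).
extra_processes() {
    awk -v avail="$1" -v per="$2" -v reserve="${3:-0}" 'BEGIN {
        room = avail - reserve
        if (per <= 0 || room <= 0) print 0
        else printf "%d\n", int(room / per)
    }'
}

# Usage: extra_processes 1000000 30000 100000
```

The result is only meaningful if MemAvailable was read while the system was not under pressure, for the reasons given earlier.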

And that's all for now. I hope you have enjoyed these articles and found them useful!
