mikeash.com: just this guy, you know?

Posted at 2009-06-19 13:48
Tags: fridayqna memory performance
Friday Q&A 2009-06-19: Mac OS X Process Memory Statistics
by Mike Ash  

Welcome back to another Friday Q&A. Now that WWDC is behind us, I'm back on track to bring you more juicy highly-technical goodness. Maybe I can even get back to doing one a week.... This week I'm going to take André Pang's suggestion of discussing process memory statistics (the stuff you see in Activity Monitor or top) in Mac OS X.

Memory Structure
Before I can discuss what the stats mean, I first have to discuss just how memory actually works on a modern operating system. If you already know the difference between physical memory and virtual address space, understand how file mapping works, etc., then feel free to skip ahead.

Hardware
At the hardware level, memory is physical chips accessed over a bus. Each byte of memory in those chips has a discrete physical address (although technically modern systems aren't usually byte-addressable, requiring larger chunks to be accessed).

Mediating access to the physical chips is the CPU's MMU (Memory Management Unit). The MMU is what allows for virtual memory. It maps between logical addresses coming from the CPU and physical addresses sitting out in physical RAM.

This gives the CPU a large virtual address space that doesn't necessarily correspond to the physical memory. (This space is 4GB in 32-bit, and a really big number in 64-bit.) Any given section of that address space can either be mapped to an arbitrary section of physical memory, or it can be left unmapped.
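To make the mapping idea a bit more concrete, here's a toy sketch in Python. It is not how any real MMU is implemented; the 4096-byte page size and the contents of `page_table` are assumptions made up for illustration, and `translate` is a hypothetical helper.

```python
PAGE_SIZE = 4096  # assumed page size; real hardware may differ

# Hypothetical page table: virtual page number -> physical page number.
# Pages that don't appear here are simply unmapped.
page_table = {0: 7, 1: 3, 5: 42}

def translate(virtual_address):
    """Return the physical address for virtual_address, or None if unmapped."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    physical_page = page_table.get(page)
    if physical_page is None:
        return None  # real hardware would raise an exception here
    return physical_page * PAGE_SIZE + offset

# Size of a 32-bit virtual address space, in gigabytes.
ADDRESS_SPACE_32_GB = 2**32 // 2**30
```

Note that neighboring virtual pages (0 and 1) land on unrelated physical pages (7 and 3), and that virtual page 2 has no physical memory behind it at all.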

OS
What happens when a program tries to access memory that's unmapped? A hardware exception results, and the OS gets to take over.

A cleverly programmed OS (like, say, any halfway recent UNIX, or even Windows) can use this fact to do some interesting things. It could, say, maintain its own, more complicated mapping behind the scenes which says that a section of memory that's unmapped in hardware is actually mapped to a file on disk. Then when a hardware exception is raised for trying to access that section, the OS can read a chunk of the file into that spot and then let program execution continue. Now you have file mapping and (if you automatically unmap little-used sections of memory and write their contents out to disk) swap.
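From a program's point of view, file mapping looks something like the following rough sketch, which uses Python's standard `mmap` module. The temporary file and its contents are invented for the demonstration, and `read_through_mapping` is a hypothetical helper, not anything the OS provides under that name.

```python
import mmap
import os
import tempfile

def read_through_mapping(path, start, length):
    """Map the file at path and return length bytes starting at start."""
    with open(path, "rb") as f:
        # Length 0 means "map the whole file". Nothing is necessarily read
        # from disk yet; pages get loaded as they are touched.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            return mapping[start:start + length]

# Build a throwaway file to map (made-up contents, for demonstration only).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"A" * 8192 + b"hello from a later page")

snippet = read_through_mapping(demo_path, 8192, 5)
os.remove(demo_path)
```

The program never calls `read` on the file. It just indexes into the mapping, and the OS takes care of getting the right chunk of the file into memory.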

Another clever thing is to map sections of two different processes' address spaces to the same chunk of physical memory. Now you have shared memory!
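Here's a minimal sketch of that, assuming a UNIX-like system where `os.fork` is available. On such systems an anonymous `mmap` in Python is a shared mapping, so the parent and the forked child end up looking at the same memory. The names are mine, purely for illustration.

```python
import mmap
import os

# Anonymous mapping, one page long. On UNIX this is a shared mapping,
# so it stays shared across fork() instead of being copied.
shared = mmap.mmap(-1, mmap.PAGESIZE)

pid = os.fork()
if pid == 0:
    # Child process: write into the shared page and exit immediately.
    shared[:5] = b"hello"
    os._exit(0)

# Parent process: wait for the child, then read what it wrote.
os.waitpid(pid, 0)
seen_by_parent = shared[:5]
shared.close()
```

The child's write shows up in the parent even though the two are separate processes with separate address spaces.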

These techniques can be combined. For example, shared frameworks are typically loaded by mapping them into memory (allowing the OS to load them off of disk lazily). And they're then mapped into multiple processes at once, allowing them to use the same physical RAM for all processes instead of having a bunch of copies.

Definitions
Now that we know roughly how the stuff works, let's define some memory-related terms:

And with that, we can now see what the various fields in top mean, from looking at the man page and using these definitions:

It should also be noted that these numbers are derived from an accounting system which does not always completely correspond to the true numbers, especially when distinguishing between shared and private memory. They're generally close enough to be useful, at least.

Interpretation
By this point you're probably scratching your head and wondering which number you should look at to see how much memory your program is using. Trouble is, there isn't one!

As you've seen, memory usage is highly complicated, and none of these numbers answers that question. In fact, with things like file mapping and shared memory, it's not even a question that really makes sense.

That's not to say that these numbers are useless, though. Even though nothing directly corresponds to what you'd really like to know, there are still some interesting facts you can obtain.

For 32-bit programs, VSIZE can be very important. This is because 32-bit programs have a hard 4GB limit on virtual address space, and in this modern world it's not all that hard to hit that limit. Once you do, memory allocations will begin to fail and your program will probably crash shortly afterwards. If your VSIZE is near the 4GB limit, you're chewing up too much address space on something.
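As a back-of-the-envelope check, you can work out how much of the 4GB a given VSIZE eats up. The sketch below assumes sizes written like `939M`, with a single-letter unit suffix; `parse_size` and `address_space_used` are hypothetical helpers, and the sample value is just an illustration.

```python
LIMIT_32_BIT = 2**32  # 4GB of virtual address space

UNITS = {"B": 1, "K": 2**10, "M": 2**20, "G": 2**30}

def parse_size(text):
    """Convert a size like '939M' into bytes (assumed format)."""
    number, unit = text[:-1], text[-1].upper()
    return int(float(number) * UNITS[unit])

def address_space_used(vsize_text):
    """Fraction of the 32-bit address space taken by the given VSIZE."""
    return parse_size(vsize_text) / LIMIT_32_BIT

fraction = address_space_used("939M")
```

A VSIZE of 939M works out to a bit under a quarter of the address space, which leaves plenty of headroom. Something above 3G would be worth a closer look.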

(For 64-bit programs, the virtual address space is virtually unlimited, and so this column is of little use. For example, garbage-collected 64-bit apps immediately allocate a 64GB chunk of virtual address space just to make the accounting easier. This has no bearing on your actual memory usage and is completely harmless, although it tends to freak out users who go groveling around Activity Monitor.)
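You can see the difference between reserving address space and actually using memory with a small experiment. This sketch sets aside a 64MB anonymous mapping (a much smaller stand-in for the 64GB figure, chosen arbitrarily) and touches only a couple of pages of it. It illustrates the general idea and says nothing about how the garbage collector itself does its reservation.

```python
import mmap

RESERVED = 64 * 2**20  # 64MB, a small stand-in for the 64GB mentioned above

# Anonymous mapping: address space is set aside, but physical pages are
# typically only assigned once they're actually touched.
region = mmap.mmap(-1, RESERVED)

region[0] = 1  # touch a single page out of the whole region
pages_reserved = RESERVED // mmap.PAGESIZE
first_byte = region[0]
last_byte = region[RESERVED - 1]  # touches one more page; it reads as zero
region.close()
```

The mapping adds the full 64MB to the process's virtual size right away, while resident memory should grow only by the handful of pages that were touched.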

RPRVT can be useful as a rough indicator for watching if the total amount of memory your program has allocated is going up or down. This is dangerous to rely on, however. Because this only tracks resident memory, if your program has started to swap then your RPRVT will no longer increase, even though you're still allocating more and more memory. (To detect this, you can watch to see if VPRVT is going up, and the number of pageouts listed at the top of the screen is going up.) Conversely, the memory allocator doesn't always give memory back to the system right away, so this number may not go down if your program is freeing memory.

Overall, be careful not to rely too much on these statistics. For more precise information to track down leaks and excessive memory allocation, tools like the leaks command and the ObjectAlloc instrument are much better.

Conclusion
That brings us to the end of this edition of Friday Q&A. Now you should understand what all those weird numbers mean in top (except, potentially, for all of the ones that aren't related to memory) and how best to use and not use them.

Come back next week (I hope) for another exciting edition. Be sure to send along your ideas for topics to discuss. Without your contributions, Friday Q&A could not exist. Post them in the comments or e-mail them directly to me.

Friday Q&A would like to acknowledge Ed Wynne's important role in providing technical advice for this week's post.


Comments:

Bryce at 2009-06-19 23:31:20:
Very clear and helpful. Thanks!

Nick at 2009-06-20 03:00:47:
A common question is "which apps are making my machine slow or filling up my RAM?" Of course, this is a difficult question, but I think that RPRVT is the right single number to look at.

Scott at 2009-06-20 04:05:40:
A common question is "which apps are making my machine slow or filling up my RAM?" Of course,

the answer is "Firefox"

:-)

Thanks for the information!

Marc at 2009-06-20 04:07:37:
Scott: nice one :-)

Jon at 2009-06-20 05:46:02:
How do these terms relate to the terms used in Activity Monitor? Is private the same as wired?

ssp at 2009-06-20 05:52:22:
One of the nicest explanations of the mythical numbers I have seen, thanks.

Yet - and I have been wondering about this for years - my main question "which number corresponds to the number displayed by Mac OS 9" still remained unanswered, or rather answered as 'none'.

Which is a shame because the OS 9 numbers were actually useful in the sense that they told me which application would be worth quitting when running into swapping hell. In OS X I have to intuitively quit notorious memory hogs like iPhoto or VMWare but it seems to be impossible to collect reasonable information about which application is most worth quitting from these numbers.

Tom Vanderlinden at 2009-06-20 06:06:14:
Could you take one more step and translate the Activity Monitor terms used in the System Memory tab:
Wired
Active
Inactive
Used
Free
Vm size:
Page ins/outs

"Nice" in the CPU tab is also a mystery,
but I suppose that is another subject.

Nick at 2009-06-20 06:45:52:
Tom Vanderlinden: "Nice" in the CPU tab is also a mystery

"nice" is scheduling priority, try 'man nice' in a terminal or http://en.wikipedia.org/wiki/Nice_(Unix)

natevw at 2009-06-20 06:54:36:
If system libraries are shared via shared memory, how is process-specific state kept separate?

Chuck at 2009-06-20 06:57:58:
I just can't see why, by your definition of RPRVT, RSHRD and RSIZE, RSIZE is not equal to RPRVT+RSHRD.
RSIZE measures actual memory, but you said RPRVT is the amount of address space, local to the process, which corresponds to items currently present in physical RAM.

I'm sure I'm just missing something

Miles at 2009-06-20 07:39:46:
Here's what I've always wondered: In Activity Monitor, why does the "Virtual Memory" number for a process drop by about 568MB for a process when you inspect it?

Matthew Kosterman at 2009-06-20 11:33:55:
I'd love to know how to figure out which process(es) is/are "stuck" when this is reported in TOP. It seems as though whenever my machine is behaving erratically, with inexplicable pauses in execution, I can run TOP and I will see one or more "stuck" processes in the header info.

I've searched all over and not come up with much info other than what I can summarize as: it doesn't necessarily mean something bad, it just means a program is waiting for something.

It would be nice to know which program and what it's waiting for...

Jean-Daniel Dupas at 2009-06-20 18:26:10:
If system libraries are shared via shared memory, how is process-specific state kept separate?


Shared libraries are composed of multiple sections: one with the code (read-only), one with the process-specific state (read-write, e.g. global variables), etc.

And only the "read-only" sections are shared.

In the list of useful tools, MallocDebug is worth mentioning too.

mikeash at 2009-06-20 23:39:45:
The Activity Monitor global system memory terms are clearly explained in the help. Please RTFM.

Chuck: Any given chunk of physical RAM could be mapped 0 or more times into the process.

For questions about CPU usage and such, I think you're in the wrong place. This is a programming blog.

Mike Smith at 2009-06-21 00:54:00:
@ssp

It's nice to pine for the "good old days", but the Darwin memory manager is nothing like the 9 memory manager, and Mac OS X apps tend to have more complex behaviours than their Classic ancestors, so the situation overall just isn't as straightforward anymore.

@natevw

As Jean-Daniel explains, shared libraries are internally subdivided based on (amongst other things) their writability. It's worth noting that the parts that might be writable but which have not yet been written remain shared using a technique called copy-on-write (often abbreviated COW).

@chuck

The accounting used to maintain those numbers is complicated, and they are not directly derived one from the others. It may help to know that most shared libraries are kept in what is known as the "shared segment", and the system uses a trick to share portions of the pmap and VM pagetable (data structures used by the MMU) that correspond to this segment between processes. This is a healthy performance optimisation, but it means that updates made to these data structures cannot be trivially accounted for across all processes.

In particular, RSIZE is very difficult to account precisely. If one task causes a page to be made present in a shared library, it's reasonable to charge that task for the page. If another task then uses that page, it might be reasonable to charge it for it as well, but there is no event that tells the system the second task has used the page. Further, when the page is later evicted and recycled, there is no list of all the tasks that used the page. Indeed, once it's made present, it's present in all tasks. Do you account a resident page in the shared segment against every task in the system? That would make the number useless, as one task running lots of code in a framework would make it look like every task in the system was blowing out its resident set.

Instead, RSIZE is managed using several tricks that try to make it 'relevant' at the cost of being 'precise'. You're encouraged to read the source code for the details.

@Miles

The act of "inspecting" a process causes a large chunk of its address space to be shared with the tool inspecting it, and so it goes from being accounted as VSIZE to VSHARED. You'll note that if you quit and restart Activity Monitor, the VSIZE accounting pops back up to where it was.

Andrew at 2009-06-22 21:13:10:
@scott: in fairness, re Firefox's historical memory leaks/allocation schemes...

http://dotnetperls.com/browser-memory and
http://dotnetperls.com/chrome-memory

This matches my experience, since the 3.0 release. FF has cleaned up its act.

e.g., Firefox running for 13 days, presently about 80 tabs open:


  PID COMMAND    %CPU  TIME     #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
  460 firefox-bi 10.7% 31:29:37 26  277   6391   338M  50M   490M  939M


(CPU usage hovers between 10-20%, probably Flash and javascript cruft that just keeps churning in unviewed tabs)

I also find it useful to keep separate profiles (run /Applications/Firefox.app/Contents/MacOS/firefox-bin --profilemanager from Terminal), one profile for regular browsing with just a few plugins enabled, and another for web dev work with firebug, html validators, etc.

juancn at 2009-06-30 21:30:56:
One command that can be very useful finding which process is slowing down your machine is "iotop".

You run it like this:

~>sudo iotop -P

And it displays a % of disk IO each process is using :)

CPU usage is easy to see in activity monitor or by using top. But disk I/O is usually hard to track down.

Levi Figueira at 2009-07-10 10:57:34:
@juancn is SOOO right... Usually when the system is having a major slowdown, it is caused by high disk usage *not* RAM issues...

I keep wondering why a way to monitor per-process disk activity is not integrated with the Activity Monitor (just the totals... not very helpful).

From my experience, the causes of major system slowdowns are apps like Time Machine (local) or Backblaze (online) and other backup apps, or Spotlight and other search/indexing utilities... Those reaaaally bog down my machine and are really hard to track down/stop.

This article is such an enlightening read, but it comes down to: it's very hard to track *real* memory usage, because... there's no such thing! ;) haha It's soo complicated and "abstract"...

Thanks for the tip on iotop, juancn! :)

mikeash at 2009-07-10 11:14:18:
On my system, at least, the most common cause of IO-based slowdown is swap, which in turn is caused by memory exhaustion. I've seen Time Machine and other such things make the machine slow as well, but not as frequently and not as dramatically. Much will depend on your hardware and your individual usage habits, of course.

