OK, as a first step I installed Munin (http://munin.projects.linpro.no/) on a few machines.
It's a graphical trending system somewhat like Cacti, but it uses check plugins on each individual machine, and RUNS OUT OF THE BOX. No config, just drop the plugins in a directory and it checks them.
Took me about an hour to install, but now I have a pretty good picture of resource usage, and more importantly what's using what. I downloaded a plugin that gives CPU usage % by user, for a specified list of users, and also wrote one that does the same for memory.
Hopefully after a few days, I'll start to be able to piece together the picture from all of my machines and figure out what's going to be needed.
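For anyone wondering what a per-user memory plugin involves, here is a rough sketch in Python. It is not the exact plugin described above, just an illustration of the usual Munin plugin contract: when called with "config" the plugin prints graph metadata and field labels, and when called with no argument it prints one "field.value N" line per field. The user list, the "u_" field prefix, the graph title and the use of ps to read RSS are all assumptions made for the example.

```python
#!/usr/bin/env python3
"""Sketch of a Munin plugin reporting resident memory per user (hypothetical)."""
import subprocess
import sys
from collections import defaultdict

# Assumption: hard-coded user list; a real plugin would likely read it
# from its plugin configuration instead.
USERS = ["root", "www-data", "mysql"]


def rss_by_user(ps_output, users):
    """Sum RSS (KiB) per user from lines of the form 'USER RSS'."""
    totals = defaultdict(int)
    for line in ps_output.splitlines():
        parts = line.split()
        # Skip blank lines, headers and anything that is not 'user number'.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        user, rss = parts
        if user in users:
            totals[user] += int(rss)
    # Always report every requested user, even if it owns no processes.
    return {u: totals[u] for u in users}


def field_name(user):
    """Map a user name to a safe field name (letters, digits, underscore)."""
    return "u_" + "".join(c if c.isalnum() else "_" for c in user)


def config_lines(users):
    """Lines printed when Munin calls the plugin with 'config'."""
    lines = [
        "graph_title Memory usage by user",
        "graph_vlabel bytes",
        "graph_category system",
        "graph_args --base 1024 -l 0",
    ]
    lines += [f"{field_name(u)}.label {u}" for u in users]
    return lines


def value_lines(totals):
    """Lines printed on a normal fetch; ps reports KiB, the graph uses bytes."""
    return [f"{field_name(u)}.value {kib * 1024}" for u, kib in totals.items()]


def main(argv):
    if argv and argv[0] == "config":
        print("\n".join(config_lines(USERS)))
        return
    try:
        # Separate -o options with empty headers suppress the ps header line.
        out = subprocess.run(
            ["ps", "-e", "-o", "user=", "-o", "rss="],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError) as exc:
        # Print no values on failure; Munin records the sample as unknown.
        print(f"ps failed: {exc}", file=sys.stderr)
        return
    print("\n".join(value_lines(rss_by_user(out, USERS))))


if __name__ == "__main__":
    main(sys.argv[1:])
```

To try it, make the file executable and run it by hand once with "config" and once with no argument before dropping it into the plugin directory. One caveat: ps may truncate long user names or show a numeric UID instead, so names in the list have to match what ps actually prints on that machine.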