Use the right tool for the job, I think. If you've got a VPS with 256-512 MB, don't run heavy process-forking applications when something else has been written for exactly this situation. I'm a big believer in Apache. I think it's great with the worker or event MPM, but prefork is really inefficient and, in my opinion, a legacy MPM. Unless you truly need Apache (in our case, for custom-written modules), switch to lighttpd or nginx. These are proven to be extremely fast, highly configurable and low in memory usage.
I haven't found an alternative to SpamAssassin, but that thing consumes far more memory and CPU than I'd like. If you find something else, use it. It's been around for a lifetime and someone needs to come out with a new solution.
If you have free or cached memory and think the kernel is swapping out your process anyway, set a value between 0 and 100 in /proc/sys/vm/swappiness. The higher the value, the more likely the kernel is to swap, so lower it to keep your processes in RAM.
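A minimal Python sketch of reading and setting that value, assuming a Linux box where the file lives at the path mentioned above. The function names are made up for illustration, the 0-100 range check simply mirrors the comment, and writing the real file needs root.

```python
from pathlib import Path

# Path taken from the comment above; the helper names are hypothetical.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness value as an integer."""
    return int(Path(path).read_text().strip())


def write_swappiness(value, path=SWAPPINESS_PATH):
    """Write a new swappiness value (requires root for the real file)."""
    # Range check follows the 0-100 range described in the comment.
    if not 0 <= value <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    Path(path).write_text(f"{value}\n")
```

Calling `write_swappiness(10)` as root would make the kernel less eager to swap until the next reboot. A change made this way is not persistent.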
Here's another thing: look at what's running. The OS brings up a lot of unneeded processes, stuff that you'll probably never use. Run ps aux and check it out, or run top and sort by memory usage.
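If you'd rather script that check, here is a rough Python sketch that takes the text printed by ps aux and returns the biggest processes by RSS. It assumes the usual eleven-column ps aux layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND), and the function name is hypothetical.

```python
def top_memory_processes(ps_output, limit=5):
    """Return (rss_kb, pid, command) tuples for the largest processes."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        # Split into at most 11 fields so the command keeps its arguments.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        rss_kb = int(fields[5])  # RSS is the sixth column, in kB
        pid = int(fields[1])
        rows.append((rss_kb, pid, fields[10].rstrip()))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]
```

You could feed it live data with `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`. Keep in mind that RSS counts shared pages too, so it overstates what each process really costs.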
Apache itself isn't that resource intensive. The overhead comes from the modules bundled with the OS packages. Compiling your own Apache with just the necessary modules would bring it closer to other web servers like lighttpd and nginx.
If time is money and you want easy configuration, spend the $200 on a VPS license for LiteSpeed. Since it can use the existing .htaccess and httpd.conf from Apache, it really does take only 5 minutes to "upgrade" in most situations (and their support is second to none).
But LiteSpeed can get expensive for large installs, so I don't blame people for looking for free/open source alternatives. It all depends on how much time you have and whether you are building from scratch or upgrading an existing Apache install.
The great thing is, we have so MANY choices today compared to just 5-6 years ago, it's awesome.
Proprietary vs free software isn't just about money. It's about freedom, control and security.
I've often found myself needing to patch the software I use to get it to work just right. Even when a proprietary software vendor gives you source code, the build system often sucks and the code is not hacker friendly.
Also, the licensing would restrict you from doing all sorts of things you wouldn't have to think twice about with an open source web server (e.g., auto-scaling in a cloud configuration).
Unless you need the backwards compatibility with Apache, don't use LiteSpeed. There are excellent open source alternatives that are just as good and perhaps superior, minus the Apache compatibility.
True, if more hardware is cheaper it's always a better upgrade.
But LiteSpeed will certainly double the capacity of any Apache install, no exaggeration, and its DDoS resistance is second to none. I just wish it wasn't so expensive.
I did something similar with my VPSes (nginx/MySQL/PHP/Django via flup), but at the same time this is not as easy as just disabling the pre-forked processes. The point of those processes is that they are available. Sure, if the services on your VPS are competing for resources, then that would be an issue. However, I would recommend that you use all the available memory, but make sure that each process stays within its memory budget.
For example, I have a number of FastCGI PHP processes running, each consuming up to 20 MB. I tuned it so that if each one takes exactly 20 MB (more or less the max memory limit I set), I still have a sliver of RAM left. That way there is no cost of starting/stopping these processes and there is maximum resource utilization.
20 MB per Apache process? Granted, everyone's implementation is different, but if I had to guess, the author is relying on the results of `top` to determine his Apache memory utilization. This is fundamentally flawed because top doesn't accurately report physical memory usage[1]. It is an estimate at best. What you really want to look at (on Linux) is private dirty RSS, which top doesn't report.
For environments using Passenger, there is a great tool available for analyzing this: passenger-memory-stats. By examining the source code of this tool[2], we can see that it reads the contents of `/proc/#{pid}/smaps`, where pid iterates over the Apache process IDs. Writing a bash/awk script to accomplish the same should be pretty straightforward. But back to the topic of reducing memory usage and pre-forking.
In our deployments, an Apache process uses 0.5 MB of physical memory, which puts the reduction of Apache processes into perspective. At this level, your Apache server would have to be grossly misconfigured for this to have a significant impact. Also, the purpose of spare processes is to hide the overhead of creating new processes. The only reason you'd use this functionality is when you have a load that varies. That is to say, you should set your MaxClients configuration so that you don't spawn processes that your VPS can't support.
Any benefit of reducing pre-fork processes is lost once load increases to the point that you need those processes to serve requests. This cannot save you from an under-provisioned VPS instance. A better solution is to cap the maximum number of worker processes so the box never has to swap heavily (heavy swapping is called thrashing). You'll still encounter a performance ceiling if you've under-provisioned your VPS.
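To make the cap concrete, here is a back-of-the-envelope Python sketch for picking a worker limit (for Apache, a MaxClients value). The function name and the idea of reserving a fixed amount for the OS and other services are my own assumptions, and the numbers in the usage note are placeholders, not measurements.

```python
def max_workers(total_mb, reserved_mb, per_worker_mb):
    """Largest worker count that fits in RAM without touching swap.

    total_mb:      physical memory on the VPS
    reserved_mb:   memory set aside for the OS, database, etc. (assumed)
    per_worker_mb: private dirty RSS of one worker process
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = total_mb - reserved_mb
    # Never return a negative count if the reservation exceeds total RAM.
    return max(0, int(usable_mb // per_worker_mb))
```

For instance, `max_workers(512, 128, 7)` leaves 384 MB for workers and gives a cap of 54.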
The same applies for any service that uses pre-forking. There is no substitute for understanding your service load requirements. You have to answer two questions: How many req/sec are you serving, and how many req/sec can each process/thread serve (depending upon your service worker model). This is accomplished using benchmarking tools.
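Once you have those two numbers from benchmarking, the sizing itself is simple division. A small Python sketch, where the function name and the headroom parameter (extra capacity above the measured peak) are my own additions:

```python
import math


def workers_needed(peak_rps, per_worker_rps, headroom=0.25):
    """Workers required to serve peak load with some spare capacity.

    peak_rps:       requests per second at peak, from your own measurements
    per_worker_rps: requests per second one process/thread can serve
    headroom:       assumed safety margin, e.g. 0.25 for 25% extra
    """
    if per_worker_rps <= 0:
        raise ValueError("per_worker_rps must be positive")
    return math.ceil(peak_rps * (1 + headroom) / per_worker_rps)
```

With a peak of 300 req/sec, 4 req/sec per worker and no headroom, that works out to 75 workers.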
While you should definitely make sure you're running an appropriate number of workers, you'll get more benefit from reducing the amount of memory used by each process. When I started, our Apache processes were around 7 MB of private dirty RSS each. The processes were large because all kinds of Apache modules were loaded that weren't in use: PHP, mod_perl, etc. Each of these contributes to the baseline memory usage of every process.
Let's look at memory usage as s * n, where 's' is the size of each process and 'n' is the number of processes. Our variable 's' is typically a small number (say, 0.5 to 10 MB), while 'n' is typically a whole order of magnitude (or two) greater. In our case, 'n' is 150. Reducing 's' from 7 MB to 0.5 MB saved me 975 MB of real memory! If I had ignored my process size and only reduced the number of workers by 20% (which would still degrade performance at peak loads), I would only have saved 210 MB. The hit in performance cannot be overstated. Running out of Apache workers is NOT good for performance.
In summary, the article offers good advice only insofar as you should know and understand what your service requirements are. I disagree that you should review your process/thread usage and call it a day. There are many other 'low hanging fruit' items to reach out and grab.
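The arithmetic behind those two figures, spelled out as a short Python check using only the numbers from this comment (the variable names are mine):

```python
n = 150          # number of Apache processes
s_before = 7.0   # MB of private dirty RSS per process, before trimming modules
s_after = 0.5    # MB per process, after trimming modules

# Option 1: shrink every process, keep all 150 workers.
saved_by_shrinking = s_before * n - s_after * n

# Option 2: keep 7 MB processes, run 20% fewer of them.
workers_removed = n * 20 // 100
saved_by_fewer_workers = s_before * workers_removed

print(saved_by_shrinking)       # 1050 MB - 75 MB
print(saved_by_fewer_workers)   # 30 workers * 7 MB
```

Shrinking the processes saves 975 MB while keeping full capacity. Cutting workers saves 210 MB and gives up 30 of the 150 workers.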
Yes, since I'm only using Apache to serve static files, and we're running a Rails app, I only need a very limited number of modules. Here's a copy/paste of `passenger-memory-stats` from our staging environment (identical to production):
Reducing the number of pre-forked processes shouldn't matter that much (at least with Apache). Your kernel implements copy-on-write when forking processes, so pages are only duplicated when a process writes to them. With n processes you don't actually use n times the memory of a single process. In my experience, web servers usually don't change much of their memory while serving content.
Spamassassin might do this differently though and I suspect child processes are more likely to use all the reported memory.
Maybe I'm a bit crazy, but I just disable swap altogether. With 2 tornado servers, 2 nodejs servers, nginx, postgres, memcache and monit I end up using 170M of memory. (None of the sites are data heavy, so this may be skewed.)
I may be fundamentally misunderstanding something here, but doesn't turning off swap pretty much guarantee that things will start failing if you have a spike in memory usage? Assuming you have something like 256M of RAM (like many small VPSs) running at 170M means that a relatively small spike could cause things to start having memory errors. Is there some way to mitigate this, other than enabling swap?
That's a lot easier said than done. Unless you use ulimit to strictly enforce process memory limits (tricky), it's nearly impossible to know how much memory a large, complicated program (like MySQL) will allocate.
There was a "Is swap necessary" discussion on lkml and kerneltrap[1] a while ago and iirc, someone in the know had said that the kernel does expect to have some swap to work with. So, I run all my machines with about 10M of swap space.
Also, as you try to maximise your memory usage, you run the risk of triggering the Out of Memory (OOM) Killer, which just kills off processes, and they don't come back until you start them again. It's not nice.
SpamAssassin is a resource hog that shouldn't be used on a VPS with limited memory.
There are far better options, like greylisting or just adding spam blacklist services (Spamhaus, SpamCop) to the "reject" configuration of the mail server.
I'm surprised that greylisting still works, to be honest. But the blacklist services never did, at least if your goal is not to reduce spam per se but to make or keep email useful.
SA is easy to integrate into a mail server, but not "easier" than adding one line to the server configuration. SA lets a lot of spam through, it's a pain in the neck to fine-tune, and when you have a lot of spam coming in it uses a lot of CPU for parsing. With blacklist servers, spam is dropped at the start of the connection, so there's no further parsing.
Basically, I don't know why anyone nowadays would use SpamAssassin (I've used it in the past) when there's greylisting and blacklist servers that work wonderfully with low overhead.