Reducing the load on a small VPS by 80% in 5 minutes (turnkeylinux.org)
53 points by liraz on June 8, 2010 | 35 comments


Summary: Reduce memory by running fewer processes

Use the right tool for the job, I think. If you've got a VPS with 256-512MB, don't run heavy process-forking applications when something else has been written exactly for this situation. I'm a big believer in Apache; I think it's great with the worker or event MPM, but prefork is really inefficient and, in my opinion, a legacy MPM. Unless you truly need Apache (in our case, custom-written modules), switch to lighttpd or nginx. Both are proven to be extremely fast, highly configurable and low in memory usage.

I haven't found an alternative to SpamAssassin, but that thing consumes far more memory and CPU than I'd like. If you find something else, use it. It's been around for a lifetime and someone needs to come out with a new solution.

If you have free or cached memory and think the kernel is swapping out your processes, set a value between 0 and 100 in /proc/sys/vm/swappiness. Higher values make the kernel more likely to swap; 100 swaps aggressively.
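For instance, on a Linux VPS (the value 10 is just an illustration; pick what suits your workload):

```shell
# Check the current value; the kernel default is usually 60.
cat /proc/sys/vm/swappiness

# Lower it so the kernel prefers reclaiming cache over swapping
# (as root):
#   sysctl -w vm.swappiness=10
# or equivalently:
#   echo 10 > /proc/sys/vm/swappiness
```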

Here's another thing: look at what's running. The OS brings up a lot of unneeded processes, stuff you'll probably never use. Run ps aux and check it out, or run top and sort by memory usage.


Or, more than 5 minutes but worth the effort: replace Apache.

Nginx, Lighttpd, Cherokee, Zeus, LiteSpeed, etc. All use far fewer resources.


I've set up nginx as a reverse proxy to Apache, serving static media directly, and it's much more stable while using less memory.
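A minimal sketch of that setup; the port, paths and server name are assumptions, not from the comment:

```nginx
server {
    listen 80;
    server_name example.com;

    # serve static media directly from disk
    location /static/ {
        root /var/www;
    }

    # hand everything else to Apache on a local port
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```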


Has anyone here used Cherokee and would like to share the experience? It caught my eye, but I'm not much of an early adopter.


Apache itself isn't all that resource-intensive; the bloat comes from the modules bundled with OS packages. Compiling your own Apache with just the necessary modules brings it much closer to other web servers like lighttpd and nginx.


+1 for lighttpd. nginx is nice too but its configuration system is much more primitive by comparison.


If time is money and you want easy configuration, spend the $200 on a VPS license for LiteSpeed. Since it can use the existing .htaccess and httpd.conf from Apache, it really does take only 5 minutes to "upgrade" in most situations (and their support is second to none).

But LiteSpeed can get expensive for large installs, so I don't blame people for looking at free/open-source alternatives. It all depends on how much time you have and whether you're building from scratch or upgrading an existing Apache install.

The great thing is, we have so MANY choices today compared to just 5-6 years ago, it's awesome.


Proprietary vs free software isn't just about money. It's about freedom, control and security.

I've often found myself needing to patch the software I use to get it to work just right. Even when a proprietary vendor gives you source code, the build system often sucks and the code is not hacker-friendly.

Also, the licensing can restrict you from doing all sorts of things you wouldn't have to think twice about with an open source web server (e.g., auto-scaling in a cloud configuration).

Unless you need the backwards compatibility with Apache, don't use LiteSpeed. There are excellent open source alternatives that are just as good, perhaps superior, minus the Apache compatibility.


Never used LiteSpeed, but I wonder whether you should just buy more VPS capacity for the $200.


True, if more hardware is cheaper it's always a better upgrade.

But LiteSpeed will certainly double the capacity of any Apache install, no exaggeration, and its DDoS resistance is second to none. I just wish it weren't so expensive.


> +1 for lighttpd. nginx is nice too but its configuration system is much more primitive by comparison.

http://agentzh.org/misc/slides/nginx-conf-scripting/nginx-co... disagrees with your assessment of the nginx configuration system.


-1 for lighttpd: its FastCGI handler has an old, unresolved bug that causes it to crash under very high loads. Use nginx.


I was able to reduce mysqld's memory usage on a 256MB VPS from 5.2% to 2.9% by adding this to /etc/my.cnf:

  # disable unused storage engines
  skip-bdb
  skip-innodb

  # default is 8M
  key_buffer_size=4M


I did something similar with my VPSes (nginx/MySQL/PHP/Django via flup), but at the same time it's not as simple as just disabling the pre-forked processes. The point of those processes is that they're available. Sure, if the services on your VPS are competing for resources, then that's an issue. However, I'd recommend using all the available memory, while making sure each process stays within its memory budget.

For example, I have a number of FastCGI PHP processes running, each consuming up to 20MB. I tuned it so that even if each one takes exactly 20MB (more or less the memory limit I set), I still have a sliver of RAM left. That way there's no cost of starting/stopping these processes, and resource utilization is maximized.


20 MB per Apache process? Granted, everyone's setup is different, but if I had to guess, the author is relying on the output of `top` to determine his Apache memory utilization. This is fundamentally flawed because top doesn't accurately report physical memory usage[1]; it's an estimate at best. What you really want to look at (on Linux) is private dirty RSS, which top doesn't report.

For environments using Passenger, there's a great tool for analyzing this: passenger-memory-stats. By examining the source code of this tool[2], we can see that it reads `/proc/#{pid}/smaps` for each Apache process ID. Writing a bash/awk script to accomplish the same should be pretty straightforward. But back to the topic of reducing memory usage and pre-forking.
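A rough sketch of such a script, assuming a Linux /proc and workers named apache2 (often httpd, depending on the distro); it sums the Private_Dirty fields in each worker's smaps:

```shell
# Sum private dirty RSS (in kB) for each Apache worker.
for pid in $(pgrep apache2); do
    awk -v pid="$pid" '/Private_Dirty/ { sum += $2 }
        END { printf "%s: %d kB private dirty\n", pid, sum }' \
        "/proc/$pid/smaps"
done
```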

In our deployments, an Apache process uses 0.5 MB of physical memory, which puts the reduction of Apache processes into perspective. At this level, your Apache server would have to be grossly misconfigured for this to have a significant impact. Also, the purpose of spare processes is to hide the overhead of creating new ones; the only reason you'd use this functionality is when your load varies. That is to say, you should set your MaxClients configuration so that you don't spawn more processes than your VPS can support.
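For illustration, a prefork section with the worker count capped; the numbers are hypothetical and should come from your own measurements:

```apache
<IfModule mpm_prefork_module>
    StartServers          2
    MinSpareServers       2
    MaxSpareServers       5
    MaxClients           40    # 40 workers * per-worker RSS must fit in RAM
    MaxRequestsPerChild 500    # recycle workers so leaks can't accumulate
</IfModule>
```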

Any benefit of reducing pre-forked processes is lost once load increases to the point that you need those processes to serve requests. This cannot save you from an under-provisioned VPS instance. A better approach is to cap the maximum number of worker processes to limit swapping (and the thrashing that comes with it). You'll still hit a performance ceiling if you've under-provisioned your VPS.

The same applies to any service that uses pre-forking. There is no substitute for understanding your service's load requirements. You have to answer two questions: how many req/sec are you serving, and how many req/sec can each process/thread serve (depending on your worker model)? Both are answered with benchmarking tools.
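Once you have those two numbers, the sizing reduces to simple division; a sketch with made-up figures:

```shell
# Hypothetical figures: measure your own with a benchmarking tool.
peak_rps=120         # peak requests/sec you must serve
per_worker_rps=4     # requests/sec one worker can sustain

# ceiling division: workers needed to cover the peak
echo $(( (peak_rps + per_worker_rps - 1) / per_worker_rps ))   # → 30
```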

While you should definitely make sure you're running an appropriate number of workers, you'll get more benefit from reducing the amount of memory used by each process. When I started, our Apache processes were around 7 MB of private dirty RSS each. The processes were large because all kinds of unused Apache modules were loaded: PHP, mod_perl, etc. Each of these adds to the process's bottom-line memory usage.

Let's look at memory usage as s * n, where 's' is the size of each process and 'n' is the number of processes. Our variable 's' is typically a small number (say, 0.5 to 10 MB), while 'n' is typically a whole order of magnitude (or two) greater. In our case, 'n' is 150. Reducing 's' from 7 MB to 0.5 MB saved me 975 MB of real memory! If I had ignored my process size and only reduced the number of workers by 20% (which would still degrade performance at peak loads), I would only have saved 210 MB. The performance hit cannot be overstated: running out of Apache workers is NOT good for performance.
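As a quick check of those figures (sizes in MB):

```shell
awk 'BEGIN {
    s = 7; n = 150                          # per-process size, worker count
    print "shrink s to 0.5 MB:", (s - 0.5) * n, "MB saved"   # 975 MB
    print "cut n by 20%:      ", s * (n * 0.2), "MB saved"   # 210 MB
}'
```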

In summary, the article offers good advice only insofar as you should know and understand your service requirements. But I disagree that you should review your process/thread count and call it a day. There is plenty of other low-hanging fruit to reach out and grab.

1 - http://www.google.com/search?q=linux+top+vs+private+dirty+rs...

2 - http://github.com/FooBarWidget/passenger/blob/master/bin/pas...


How did you decrease 7 MB to 0.5 MB? Only by removing unneeded modules?


Yes, since I'm only using Apache to serve static files, and we're running a Rails app, I only need a very limited number of modules. Here's a copy/paste of `passenger-memory-stats` from our staging environment (identical to production):

http://pastie.org/996770


Reducing the number of pre-forked processes shouldn't matter that much (at least with Apache). The kernel uses copy-on-write when forking, so memory is only duplicated when a child writes to it; n forked processes don't actually use n times the memory reported for a single process. In my experience, web servers usually don't change much of their memory while serving content.

SpamAssassin might behave differently, though; I suspect its child processes are more likely to touch all of the reported memory.


Thanks for the tip... took me less than 5 minutes, and saved me from having to upgrade my VPS like I had previously planned.


Maybe I'm a bit crazy, but I just disable swap altogether. With 2 Tornado servers, 2 Node.js servers, nginx, Postgres, memcached and monit, I end up using 170M of memory. (None of the sites are data-heavy, so this may be skewed.)


I may be fundamentally misunderstanding something here, but doesn't turning off swap pretty much guarantee that things will start failing if you have a spike in memory usage? Assuming you have something like 256M of RAM (like many small VPSes), running at 170M means that a relatively small spike could cause memory errors. Is there some way to mitigate this, other than enabling swap?


Things will start "failing" from a user's perspective if you have to hit the disk for every transaction anyway (i.e. most requests will time out.)


Size your server properly and know the memory requirements of your application. That way you can guarantee that you will never hit swap.


That's a lot easier said than done. Unless you use ulimit to strictly enforce process memory limits (tricky), it's nearly impossible to know how much memory a large, complicated program (like MySQL) will allocate.


With no swap, the OOM killer will kick in if you run out of RAM, killing a process in order to free up memory.


There was an "Is swap necessary?" discussion on LKML and KernelTrap[1] a while ago, and IIRC someone in the know said the kernel does expect to have some swap to work with. So I run all my machines with about 10M of swap space.

[1] http://kerneltrap.org/node/3202


Also, as you try to maximise your memory usage, you run the risk of the Out of Memory (OOM) killer running, which just kills off processes; they don't come back until you restart them. It's not nice.


SpamAssassin is a resource hog that shouldn't be used on a VPS with limited memory.

There are far better options, like greylisting or just adding spam blacklist services (Spamhaus, SpamCop) to the mail server's reject configuration.
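For example, with Postfix (the MTA and the specific DNSBLs are assumptions; the thread doesn't name either), a couple of lines in main.cf reject blacklisted clients before any content parsing happens:

```
# /etc/postfix/main.cf — reject clients listed on DNS blacklists
# before the message body is ever accepted:
smtpd_recipient_restrictions =
    permit_mynetworks,
    reject_rbl_client zen.spamhaus.org,
    reject_rbl_client bl.spamcop.net,
    permit
```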


I'm surprised that greylisting still works, to be honest. But the blacklist services never did, at least if your goal is not to reduce spam per se but to make or keep email useful.


I disagree. URI blacklists in SA have stopped a ton of spam from reaching me with nary a false positive.


Not true. If you cut down the number of pre-forked processes and use spamd, SpamAssassin works just fine on a VPS with limited memory.

Also, SA leverages blacklist services (among other techniques; it's very configurable) and is easier to integrate into your mail server.


SA is easy to integrate into a mail server, but not "easier" than adding one line to the server configuration. SA lets a lot of spam through, it's a pain in the neck to fine-tune, and when a lot of spam comes in it burns a lot of CPU on parsing. With blacklist servers, spam is rejected at the start of the SMTP conversation; there's no further parsing.

Basically, I don't know why anyone nowadays would use SpamAssassin (I've used it in the past) when greylisting and blacklist servers work wonderfully with low overhead.


People still run Apache on VPSes?

I've started a web server poll: http://news.ycombinator.com/item?id=1414076


Note that even though a pre-forked process shows 30 MB of memory usage, it may be using as little as 1 MB privately; the rest is shared memory.


I've run Tomcat on a 256M VPS with no problems under load. Threading definitely helps, since threads share the same memory space.



