Oct. 27, 2009, 6:47 p.m. memory usage limits

So I decided to write a slightly different post than usual, based more on thoughts than on a tutorial. I would like to take a brief look at today's memory usage for various tasks and scripts. Since a large part of my day involves web servers and their management, I will mainly focus on memory usage of web applications and scripts.

Not so long ago, having a server with 4 GB of working memory was a luxury; today we have certain scripts consuming about 512 MB of memory while running. What changed? The Internet boom, the popularity of web 2.0 applications, ease of development, and a bunch of those “learn programming in 21 days” books and tutorials; that is what happened. In addition, there is a whole bunch of people from the “I want Internet” generation who still don't quite grasp the difference between RAM and disk space, not to mention the inner mechanics of computer systems.

Sir... the white smoke got out of your metal box... you must refill it with white smoke to make it work again
Well, like it or not, those people like to call themselves webmasters and web developers.

I'm not claiming to be an über-pro programmer or that I possess some supernatural insight into programming, but I do like to use common sense when I'm involved in developing something.

Over time I have witnessed some pretty messed up code, ranging from abstract syntax to heavily demented processing logic. In web development the problems mainly lie in those “teach yourself PHP in 3 hours” books. As covered in this fantastic article, learning a programming language takes time. You can't just learn to program in a few days. What you can do is learn some basic syntax and a few useful functions. The first problem I encountered over the past few years with such developers is allow_url_fopen. I covered the pros and cons in my curl vs fopen article, but the point is that a “PHP in 21 days” book will never explain this to you.

The same goes for memory management and script logic. Those books tend to take a simplistic approach and usually don't bother much with optimizing scripts for heavy usage. New programmers who learn PHP that way tend to write unoptimized code.

A few years back I had a “clash” with one of those newbie developers. He developed something on his local machine, tested it and released it to the public. It even worked for some time while he was still in a marketing phase, advertising his new product. At some point his PHP script started to throw “memory exhausted” errors. After a short while he contacted me and said:

Hey... what is wrong with your web server? I still have plenty of disk space available, why is it throwing this error? You must have misconfigured it!
Doh! Well, I'll skip the part where I explained to him the difference between memory and disk space, along with a few other arguments, and get to the point: what was wrong with the script. The script was tested in a lab environment, populated with only basic entries, and not really tested with multiple users requesting multiple data sets or with extensive data growth over time. The script literally massacred the MySQL server with joins on tables without any indexes, each 40-50 MB in size and holding a few million records. Needless to say, almost all the data was put into a PHP array for later processing, thus exceeding the PHP memory limit. While there was less data and fewer users the script worked, since the array stayed within the configured memory limit.

Joins in a MySQL query can often be avoided, and the server will then return the requested data faster. But the root cause of the problem he reported was the PHP memory limit. And the simple answer to this problem is partial data processing:

Get some data from MySQL -> put it into a variable -> process it -> free the memory -> repeat until done
With a proper query you will give the MySQL server some slack, you will lower the memory usage of the script, and so on...
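
The loop above could be sketched roughly like this. To keep the example self-contained, Python and an in-memory SQLite database stand in for PHP and MySQL; the `orders` table, the `sum_in_batches` function and the batch size are all made up for illustration and are not taken from the script in question.

```python
import sqlite3


def sum_in_batches(conn, batch_size=1000):
    """Sum the `amount` column without loading the whole table into memory."""
    cursor = conn.execute("SELECT amount FROM orders")
    total = 0
    batches = 0
    while True:
        rows = cursor.fetchmany(batch_size)          # get some data
        if not rows:
            break                                    # nothing left, we are done
        total += sum(amount for (amount,) in rows)   # process it
        batches += 1
        del rows                                     # free the memory, then repeat
    return total, batches


# Hypothetical demo data: 10,000 rows with amounts 1..10000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(i,) for i in range(1, 10001)],
)

total, batches = sum_in_batches(conn, batch_size=1000)
print(total, batches)  # 50005000 10
```

Only one batch of rows is held in a variable at any time, so memory usage depends on the batch size and not on how big the table has grown. In PHP the same idea would mean processing rows as they are fetched, or using LIMIT-ed queries, rather than collecting everything into one big array first.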

Although “one man band” developers making applications for themselves are a problem, a much larger problem are those kinds of developers hiding behind big and widespread open source web applications like Joomla, Drupal, phpBB etc., and even some paid web applications.

Here we have a problem with people who are not programmers themselves but still own and build new sites based on some of those free open source applications. These applications are used across such a huge number of sites that they have gained a kind of authority status that says “we make good stuff”. The problem is that, since it is open source, any developer (even those newbie ones) can contribute to the source. Not so much to the core of the application, but as a plugin. Plugins are installed separately and usually carry some risk of bad developer practice.

One such plugin that I will take as an example (nothing personal, it's just a fresh memory) is jom_sef. While very useful for SEO of certain parts of a site, this plugin loads every time a visitor loads or reloads your site. It also reads a bunch of records from its MySQL table and puts them into some sort of array. Remember my newbie developer from before? This plugin generally works while the count and volume of data it needs to process stay low. But at some point it stops, and users find themselves in a weird situation.

The first thing they do is contact their hosting provider with the same question:

Hey, this thing worked all this time! What did you do to the server? Make it work.
Some hosts may bother explaining to the customer what just happened, and the customer will accept it for what it is. Some will just increase the memory limit until next time. And some will just fight the never-ending battle with the customer. The customer will complain on public forums, and the community or even the developer of the plugin will advise the customer to change hosting company, since it all works like a charm (yes, but on a low traffic, no data website).

Now, regardless of the battle's outcome, let us review this from the server side. Whether you are on shared hosting or have a dedicated server of your own, you have to consider this problem from the server side.

Your dedicated server will have, for example, 4 GB of RAM. Let's say that the MySQL server is located on the same machine and uses about 600 MB to 1 GB of memory, depending on usage. Let's say we take approx. 300-400 MB of memory for basic system usage, and about the same for the mail system. That leaves us with approx. 2 GB of memory for our web server. So let's enable jom_sef and give it a 64 MB memory limit per PHP child. Since jom_sef loads each time a visitor comes to our site, and each time consumes close to 64 MB of memory, that leaves us with room for 32 concurrent users (2048 MB / 64 MB = 32).
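
As a rough back-of-the-envelope check, the arithmetic can be written down in a few lines of Python. The function name and the exact reserved figures are my own picks (the upper ends of the ranges mentioned above), and the calculation ignores everything a real web server adds on top of the PHP memory limit.

```python
def max_php_children(available_mb, limit_per_child_mb):
    """How many PHP children fit if each one uses its full memory limit."""
    return available_mb // limit_per_child_mb


total_mb = 4 * 1024  # 4 GB dedicated server

# Assumed reservations, taken from the upper ends of the ranges in the text.
reserved_mb = {"mysql": 1024, "system": 400, "mail": 400}
available_mb = total_mb - sum(reserved_mb.values())

print(available_mb)                          # 2272
print(max_php_children(available_mb, 64))    # 35
print(max_php_children(2 * 1024, 64))        # 32, using the rounded 2 GB figure
print(max_php_children(2 * 1024, 128))       # 16, if the limit is doubled
```

Note how doubling the memory limit to "fix" the plugin simply halves the number of visitors the same server can handle at once.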

Well, what's wrong here? Is it the server and its setup... or is it the script?

The fact is that servers are getting bigger and bigger each day. It's not uncommon to find shared hosting servers with 24 GB of RAM or more, which may tolerate your memory consumption up to some point. But when you reach that limit, shared hosting servers will usually cut you off. Increasing memory limits beyond that is a risk. Memory overallocation will most certainly bring your server into an unusable state.

Common sense is very uncommon. ~Horace Greeley
Common sense for one of those people who believe “white smoke” is the power behind computers says that something is terribly wrong with the server. He is paying much more for his dedicated server than his colleague pays for shared hosting, but his script is not working and the server is “dying”!
Well, if you are a “white smoke” believer, let me try it this way. You will pay a great deal of money for your Ferrari, but if you take it to the woods, chop down some trees and hook, let's say, 32 of those trees onto the back of your Ferrari, will it move?
On the other hand, we have some great software and plugins, for example WordPress with wp-supercache, where we can run the entire system, MySQL and our website in just 512 MB of RAM for a bunch of concurrent users, since they will be served pre-generated static pages.

So the conclusion I guess would be:

If you believe in “white smoke”, please listen to the advice of your sysadmin. If, on the other hand, you are just finishing your “Learn PHP in 21 days” book, please don't jump into developing publicly available software. Do as little damage as you can. Keep the planet green, reduce the number of servers needed to run the web.

I hope this helps someone talk some sense into some developers and users. And all of you who are frustrated with the same things, please leave a message after the beep!

--- Beeeep ---
