When building a typical site on the LAMP stack, how do you optimize it for the best possible load times? I am picturing a typical DB-driven site.
This is a high-level look and could probably pull in other questions, so let me break it down by layer of the stack.
L – At the system level (setup and filesystem), what can you do to improve speed? One thing I can think of is image sizes; can compression here help optimize anything?
A – There have to be a ton of settings related to site speed in the web server. This is not my forte. It probably depends a lot on how many sites are running concurrently.
M – In a database-driven site, DB performance is key. Is there a better normalization approach, e.g. using link tables? Web developers often just make simple monolithic tables resembling 1NF, and this can kill performance.
P – Aside from performance-boosting settings like caching, what can the programmer do to affect performance at a high level? I would really like to know whether MVC design approaches hurt performance more than quick-and-dirty code. Other simple tips, such as whether sessions are faster than cookies, would be interesting to know.
Obviously you have to get down and dirty with the details and find what code is slowing you down. I also realize that different sites have different performance characteristics, but let’s assume a typical site that has more reads than writes.
I am just wondering if we can compile a bunch of best practices, and I fully expect people to link other questions so we can effectively work up a checklist.
My goal is to see whether, in addition to the usual performance issues, some oddball things you might not think of crop up to go along with a best-practices summary.
So my question is, if you were starting from scratch, how would you make sure your LAMP site was fast?
I’d recommend starting with http://highscalability.com/
As for your suggestions:
Compression for images: definitely not (formats like JPEG and PNG are already compressed). Filesystem tuning, yes, could have some effect, but it would be minimal. The best option is actually an in-memory reverse proxy or, even better, a CDN.
For Apache, basically load only the modules you need and nothing else. Since with PHP you are limited to the forking (prefork) MPM, it’s important to keep each process slim. As for optimal settings, you have to fine-tune them for your specific application, hardware, and so on. If you have enough CPU, I recommend using mod_deflate: the faster the server can send data to the client, the sooner it can start processing the next request.
I’ve used MySQLTuner for performance analysis on my MySQL servers, and it has given me good insight into further issues to google, as well as making its own recommendations.
A resource you might find helpful is the YDN set of performance rules.
Don’t forget the fact that your users will be thousands of miles away from your server, and downloading dozens of files to render a single page. That latency, and the overhead of rendering the page in their browsers can be larger than the amount of time that you spend collecting the information, and generating the page.
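To make that point concrete, here is a rough back-of-the-envelope sketch in Python. The model (one round trip per file, spread over a fixed number of parallel connections) and all the numbers are assumptions picked purely for illustration, not measurements.

```python
import math

def estimated_fetch_time_ms(num_files, rtt_ms, parallel_connections):
    """Rough lower bound: one round trip per file, spread over parallel connections."""
    rounds = math.ceil(num_files / parallel_connections)
    return rounds * rtt_ms

# Hypothetical numbers, chosen only for illustration.
server_time_ms = 50
network_ms = estimated_fetch_time_ms(num_files=36, rtt_ms=100, parallel_connections=6)
print(network_ms, "ms fetching assets vs", server_time_ms, "ms generating the page")
```

Under these assumed figures the network round trips dwarf the server-side time, which is why the front-end rules are worth a look even when the back end is already fast.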
Here are a few personal must-dos that I always set up in my LAMP applications.
Be careful with .htaccess files! Enabling .htaccess files for directories in your app means that Apache has to scan the filesystem constantly, looking for .htaccess directives. It is far better to put directives inside the main configuration or a vhost configuration, where they are loaded once. Any time you can get rid of a directory-level access file by moving it into a main configuration file, you save disk access time.
Prepare your application’s database layer to utilize a connection manager of some sort (I use a Singleton for most applications). It’s not very hard to do, and reducing the number of database connections your application opens saves resources.
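A minimal sketch of the idea, written in Python for brevity (the same shape works as a PHP class with a static property). The `Database` class name is hypothetical, and an in-memory `sqlite3` connection stands in for the real MySQL connection.

```python
import sqlite3

class Database:
    """Hands out one shared connection instead of opening a new one per call."""
    _connection = None

    @classmethod
    def connection(cls):
        if cls._connection is None:
            # Assumption: sqlite3 in-memory DB as a stand-in for the MySQL connection.
            cls._connection = sqlite3.connect(":memory:")
        return cls._connection

a = Database.connection()
b = Database.connection()
print(a is b)  # both callers share the same connection object
```

Any code that needs the database asks `Database.connection()` rather than connecting itself, so a request opens at most one connection.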
If you think your application will see significant load, memcached can perform miracles. Keep this in mind while you write your code… perhaps one day instead of creating objects on the fly, you will be getting them from memcached. A little foresight will make implementation painless.
Once your app is up and running, set MySQL’s slow query time to a small number and monitor the slow query log diligently. This will show you where your problem queries are coming from, and allow you to optimize your queries and indexes before they become a problem.
For serious performance tweakers, you will want to compile PHP from source. Installing from a package installs a lot of libraries that you may never use. Since the PHP environment is loaded into every Apache thread, even a 5MB memory overhead from extra libraries quickly becomes 250MB of lost memory when there are 50 Apache threads in existence. I keep the standard ./configure line I use when building PHP here, and I find it suits most of my applications. The downside is that if you end up needing a library, you have to recompile PHP to get it. Analyze your code and test it in a development environment to make sure you have everything you need.
Be prepared to move static content, such as images and video, to a non-dynamic web server. Write your code so that any URLs for images and video are easily configured to point to another server in the future. A web server optimized for static content can easily serve tens or even hundreds of times faster than a dynamic content server.
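One way to keep those URLs easy to repoint is to build every asset URL through a single helper, roughly like this Python sketch. The `STATIC_BASE_URL` setting, the `static_url` helper, and the `static.example.com` host are all hypothetical names used for illustration.

```python
# Hypothetical setting; later it could become e.g. "https://static.example.com".
STATIC_BASE_URL = "/static"

def static_url(path, base=None):
    """Build an asset URL from one configurable base instead of hard-coding it."""
    base = STATIC_BASE_URL if base is None else base
    return base.rstrip("/") + "/" + path.lstrip("/")

print(static_url("img/logo.png"))
print(static_url("img/logo.png", base="https://static.example.com/"))
```

Templates then call the helper rather than writing paths by hand, so moving images and video to another server is a one-line configuration change.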
That’s what I can think of off the top of my head. Googling around for PHP best practices will find a lot of tips on how to write faster/better code as well (Such as: echo is faster than print).
Don’t forget to turn off atime for your filesystem!
First, realize that performance is an iterative process. You don’t build a web application in a single pass, launch it, and never work on it again. On the contrary, you start small, and address performance issues as your site grows.
Now, onto specifics:
- Profile. Identify your bottlenecks. This is the most important step. You need to focus your effort where you’ll get the best results. You should have some sort of monitoring solution in place (like Cacti or Munin), giving you visibility into what’s going on on your server(s).
- Cache, cache, cache. You’ll probably find that database access is your biggest bottleneck on the back end, but you should verify this on your own. Fortunately, you’ll probably find that a lot of your traffic is for a small set of resources. You can cache those resources in something like memcached, saving yourself the database hit and getting better back-end performance.
- As others have mentioned above, take a look at the YDN performance rules. Consider picking up the accompanying book. This will help you with front-end performance.
- Install PHP APC, and make sure it’s configured with enough memory to hold all your compiled PHP bytecode. We recently discovered that our APC installation didn’t have nearly enough RAM; giving it enough to work in cut our CPU time in half, and disk activity by 10%.
- Make sure your database tables are properly indexed. This goes hand in hand with monitoring the slow query log.
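To illustrate the indexing point from the list above, here is a small Python sketch. It uses an in-memory SQLite database as a stand-in (MySQL has its own `EXPLAIN` with different output), and the `users` table, its columns, and the index name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"user{i}") for i in range(1000)],
)

def query_plan(connection, sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail text

lookup = "SELECT name FROM users WHERE email = ?"
before = query_plan(conn, lookup, ("user500@example.com",))
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = query_plan(conn, lookup, ("user500@example.com",))

print("before:", before)  # a full table scan
print("after: ", after)   # a search using idx_users_email
```

The same habit applies on MySQL: take a query from the slow query log, run it through `EXPLAIN`, and check whether it scans the whole table or uses an index.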
The above will get you very far. That is to say, even a fairly db-heavy site should be able to survive a frontpage digg on a single modestly-spec’d server if you’ve done the above.
You’ll eventually hit a point where the default Apache config won’t always be able to keep up with incoming requests. When you hit this wall, there are a few things to do:
- As above, profile. Monitor your Apache activity: you should have an idea of how many connections are active at any given time, in addition to the maximum number of active connections when you get sudden bursts of traffic.
- Configure Apache with this in mind. This is the best guide to Apache config I’ve seen: Practical mod_perl chapter 11.
- Take as much load off of Apache as you can. Apache is too heavy-duty to serve static content efficiently. You should be using a lighter-weight reverse proxy (like Squid) or web server (lighttpd or nginx) to serve static content, and to take over the job of spoon-feeding bytes to slow clients. This leaves Apache to do what it does best: execute your code. Again, the mod_perl book does a good job of explaining this.
Once you’ve gotten this far, it’s largely an issue of caching more and keeping an eye on your database. Eventually, you’ll outgrow a single server. First, you’ll probably add more front-end boxes, all backed by a single database server. Then you’re going to have to start spreading your database load around, probably by sharding. For an excellent overview of this growth process, see this LiveJournal presentation.
I’d recommend using Jet Profiler for MySQL to find any bad queries. I’ve successfully used it on a couple of my sites. Really helpful, and much easier to digest than the slow query log.