Developing a scalable Symfony 2 application

Introduction

One year ago, when I joined Motor Presse GmbH & Co. KG, a well-established and renowned publishing house in Germany, most of the magazines' web presences were based on a local CMS. The CMS is quite flexible and was - for the time it was developed, some 4-6 years earlier - a good choice. But its main caching concepts and programming paradigms are quite dated, and there is no real sign of improvements in the future. So my starting point was clear: an old CMS, a really slow (SOAP) interface for decoupling, tons of business logic in templates, and web pages that had grown organically over a long time. Time for a change. One thing was clear: since most editors were quite proficient in creating their content with the established CMS, this was not something we could easily change. But we needed new ways of doing things: version control, continuous integration, testing, different test/stage systems and, most of all, a common development VM with a matching IDE.

When we got the task to build a relaunch for menshealth.de, we made it our mission to completely restructure our team, development processes and systems. And this is how it went.

First of all: We are a small development team of 4 developers and one designer who worked nearly exclusively on the relaunch project, and our experience with more sophisticated frameworks like Symfony2 varies a lot!

First thoughts on infrastructure and design

At first sight the main performance bottlenecks were the (normalized) MySQL database and the unmaintainable template structure and handling. Facing these facts and knowing we could not move to another CMS in time, we decided to split the future application into three parts. First, the CMS as it was, to keep our editors satisfied and the migration manageable - this also gives us the opportunity to keep our existing infrastructure. Second, an API (REST... what else) for data transformation and access, and later also for external data exchange plus user authentication. Third and last, a frontend application to visualize the data from our API.

Frameworks/technologies

Because I am a great fan of Symfony 2, the choice of framework was quite clear. And since we needed fast and structured storage of complex data, we decided to try MongoDB, which I had used in other projects before and was quite satisfied with. By splitting the application we also gained the possibility to process our data transformation tasks from CMS to API asynchronously. RabbitMQ and the AMQP library were logical choices, since they are stable, performant and integrate well into a sf2 project.

Building the application infrastructure and deployment

Having defined our components and after long talks with our server provider, we came up with the following setup: (image) A Varnish proxy/cache with failover in front of two webservers for load balancing, two MongoDB instances with an arbiter, one Memcache instance each for session sharing, one API server and the already existing MySQL cluster with one CMS (backend) server. Images are uploaded via the backend CMS and delivered by a small "micro-image-server" from a mounted NFS. This setup is managed by Puppet.

QA, release & deployment

For release and deployment we decided to go for "we do everything in the master" during the hot phase of the project. As soon as things cooled down, we switched to the more (process-)secure gitflow. For deployments across all systems we use Ansible with a custom role, based on ServerGrove's symfony2 deployment.

Environments:

  • dev -> virtual machine with vagrant and ansible, local for each developer.
  • stage -> separate server.
  • preview -> separate server with production settings.
  • prod -> production servers.

environments overview

Our QA flow now consists of these components:

  • Jira ticket, approved and assigned by project management.
  • Development in feature branch (git flow) on a local VM.
  • develop branch is checked by Jenkins/PHPCI for every commit and then auto-deployed to test server. First acceptance test stage.
  • Release branches are also checked by Jenkins and then deployed to preview system, second acceptance test stage.
  • master branch is checked by Jenkins after every merge, manually deployed to production.

What we did for scalability

We started with a short list of known performance bottlenecks/requirements for our special case:

  • Slow CMS backend due to loads of normalizing in MySQL and bad application architecture php-wise.
  • Frontends need no direct write access to database (security & speed!).
  • CMS and frontend can have slight asynchronicity.
  • Frontend needs to be fast without varnish and therefore allow for a higher content update frequency.
  • Varnish cache times should be manageable through frontend responses.
  • Make it easy to add new frontend servers if load increases (horizontal scaling).
  • Keep backend decoupled but allow for horizontal scaling there, too.

Looking at those requirements, we see: Nothing really new or unfamiliar. This list led to the following architecture:

  • CMS totally decoupled, triggers REST API calls for every CUD action with minimal payload. Noticed the missing "R"? Reads are done directly from DB for performance reasons.
  • API application processes messages with RabbitMQ/AMQP to keep load manageable.
  • API extracts and transforms data from MySQL server to MongoDB documents.
  • Data goes into a MongoDB cluster.
  • Frontends access MongoDB directly, but readonly.
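
The flow above can be illustrated with a small, language-neutral sketch. It is written in Python rather than PHP, uses an in-memory queue and plain dictionaries as stand-ins for RabbitMQ, MySQL and MongoDB, and all table contents and function names are made up for illustration; it is not our actual code.

```python
import queue

# Hypothetical normalized rows, standing in for the CMS's MySQL tables.
ARTICLES = {42: {"id": 42, "title": "Core workout", "author_id": 7}}
AUTHORS = {7: {"id": 7, "name": "Jane Doe"}}
TAGS = {42: ["fitness", "training"]}

# Stand-ins for the message queue and the document store.
events = queue.Queue()
documents = {}


def notify(action, article_id):
    """What the CMS does: enqueue a minimal payload, no article data."""
    events.put({"action": action, "id": article_id})


def build_document(article_id):
    """Join the normalized rows into one denormalized document."""
    row = ARTICLES[article_id]
    return {
        "_id": row["id"],
        "title": row["title"],
        "author": AUTHORS[row["author_id"]]["name"],
        "tags": list(TAGS.get(article_id, [])),
    }


def work_off_queue():
    """What the API worker does: apply every queued event to the store."""
    while not events.empty():
        event = events.get()
        if event["action"] == "delete":
            documents.pop(event["id"], None)
        else:  # "create" and "update" both rebuild the whole document
            documents[event["id"]] = build_document(event["id"])


notify("create", 42)
work_off_queue()
print(documents[42])
```

Because the payload only carries the action and the id, the worker always reads the current state from the source database, so a burst of updates to the same article simply rebuilds the same document several times.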

simplified infrastructure overview

Some more ideas we've successfully put to the test:

  • Symfony controllers are used as services (see Matthias Noback). This allows for easier testing and much better dependency visibility!
  • One-Bundle approach as advertised in best practices, modified to have src/CoreBundle and src/Core for our library classes.
  • gulp/stylus for css/js management.
  • Removed Assetic, since it is not necessary when using gulp.
  • Using Symfony Response object and custom configurations for ETag/cache lifetime.
  • APCu cache for userland caching.
  • All monolog logging is pulled by Graylog via logfile (GELF) -> Awesant -> RabbitMQ.
  • The library is connected to the CoreBundle via services (providers) which are only handlers for private services, collected during compiler passes. This has a huge impact on DIC size and performance.
  • All access to CMS data is hidden behind a facade with interfaces and a provider structure that allows for different API versions and access for a (possible) variety of CMS systems per client.
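
To make the ETag/cache lifetime item a bit more concrete, here is a minimal sketch of the idea, again in Python rather than PHP. The content types, lifetimes and the `build_response` helper are assumptions for illustration and not our real configuration; the point is only that the frontend response itself carries the lifetimes, so a shared cache such as Varnish can take its TTL from `s-maxage`.

```python
import hashlib

# Hypothetical per-content-type lifetimes in seconds.
CACHE_CONFIG = {
    "article": {"s_maxage": 300, "max_age": 60},
    "homepage": {"s_maxage": 60, "max_age": 0},
}


def build_response(content_type, body, if_none_match=None):
    """Return (status, headers, body) with ETag and Cache-Control set."""
    config = CACHE_CONFIG[content_type]
    etag = '"' + hashlib.sha1(body.encode("utf-8")).hexdigest() + '"'
    headers = {
        "ETag": etag,
        "Cache-Control": "public, max-age={max_age}, s-maxage={s_maxage}".format(**config),
    }
    if if_none_match == etag:
        # The client already has this version: answer without a body.
        return 304, headers, ""
    return 200, headers, body


status, headers, _ = build_response("article", "<h1>Hello</h1>")
print(status, headers["Cache-Control"])  # 200 public, max-age=60, s-maxage=300

status, _, _ = build_response("article", "<h1>Hello</h1>", headers["ETag"])
print(status)  # 304
```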

Pitfalls

Along the way we encountered some (in hindsight) funny errors:

  • Using APCu, always check how much memory consumption you have, how quickly the cache builds and flushes. Check hitrates! In our case the cache filled up and flushed so quickly, you couldn't see it on monitoring. But hitrates show ;-)
  • Logging! I can't emphasize enough: Every crucial action that fails should be logged - but with enough meta-information to be conclusive!
  • If you're using the reverse proxy cache kernel, make sure it uses the same strategy (e.g. key) as your varnish does. We had the user agent in the varnish cache key, but not in the Vary headers - so the reverse proxy cache erratically cached desktop/mobile specific stuff and people got a lot of wrong ads. Took me several hours to find :-(
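
The cache key pitfall is easy to reproduce in a few lines. The following Python sketch is a simplification under stated assumptions: `X-Device` is a hypothetical header standing in for the normalized user agent, and `VaryCache` is a toy cache that, like any Vary-driven cache, builds its key from the URL plus the headers the response says it varies on.

```python
def render(headers):
    """Hypothetical backend: the markup depends on the device class."""
    return "page with {} ads".format(headers["X-Device"])


class VaryCache:
    """Tiny cache that keys entries by URL plus the headers listed in Vary."""

    def __init__(self, vary):
        self.vary = vary
        self.store = {}

    def fetch(self, url, headers):
        key = (url,) + tuple(headers.get(name, "") for name in self.vary)
        if key not in self.store:
            self.store[key] = render(headers)
        return self.store[key]


desktop = {"X-Device": "desktop"}
mobile = {"X-Device": "mobile"}

broken = VaryCache(vary=[])            # response did not vary on the device
broken.fetch("/", desktop)
print(broken.fetch("/", mobile))       # page with desktop ads

fixed = VaryCache(vary=["X-Device"])   # same strategy as the outer cache key
fixed.fetch("/", desktop)
print(fixed.fetch("/", mobile))        # page with mobile ads
```

Whichever device class hits the broken cache first decides what everybody else gets until the entry expires, which is why the symptom looks erratic.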

Conclusion

When we first load-tested the application, we were quite surprised by the good overall performance. This continued after the go-live - with some hiccups around the cache keys for varnish. We're quite satisfied with our current architecture, especially since we decoupled most of the code from the framework itself and placed it into library structures. But there is still a lot of work to do: We've got low unit test coverage, some of the library parts should be extracted into separate libraries and repositories, and - as is true for nearly every project - a lot of refactoring has to be done in the templates and some library parts.

And this is what we've built: menshealth.de

Tags: php, symfony2, software architecture

XCache made me lose my nerves!

Another year, another trial... But let's start from the beginning, shall we?

Introduction

Since my last post a lot has changed in terms of my development environment. This is due to my new job @ Motorpresse Stuttgart, where I started as lead developer in November last year. Since the server provider is quite fond of Ubuntu and prefers the "good old apache" stuff, I had to create new virtual machines, ansible roles and tons of configuration to match the servers' state. Most annoying: Apache2 (2.2 and 2.4) with either mod_php5 or php-fpm. So I decided to set up the upcoming refactoring/redesign of several projects with a symfony2 skeleton and API application. One more constraint for the environment is XCache, which I know to be slower than APC and less reliable. But whatever floats the admin's boat...

The application

The main goal of the new application design is to decouple an old CMS with a messy data structure and even worse templating from a - yet to be finished - modern symfony2/gulp/sass driven frontend. The data access is asynchronous with triggered CRUD events, ending in a message queue where they are used to manage the separate frontend data structure. This results in me having two instances of symfony2 running on one machine.

XCache introduces itself

By default the new Zend Opcode Cache module is disabled in php.ini, so no crossfire between XCache and php. Or so I thought. Having the frontend running, I started work on the API, and when I tried to call both applications at once, an "AppKernel cannot be redeclared" error occurred. OK, strange... I only have one AppKernel per project and since the classes are loaded via absolute path...

Ways to solve the problem

An hour later, I had divided my two apps into two php-fpm pools before realizing that all caching takes place in the master process. At that point I deactivated XCache -> everything works fine. OK, XCache variable cache on... Everything still works fine. Conclusion: I need another opcode cache. XCache is able to run with its opcode cache disabled - thank god - and the Zend opcode cache does not crash when enabled at the same time.

The whole process of figuring out why my app crashed in the first place, and then fixing it, took me the better part of a day. Great stuff. Really pissed off by XCache.

Tags: php, symfony2, software architecture, caching, XCache

How to Bundle!

This post is partly a recap of SymfonyLive 2014 in Berlin, where Toni (@havvg) and I did a "Lightning-Rant" on this topic. For some time now, we've been sporadically discussing the placement of models in the "VC-style-framework" symfony2. SFUGSTR did a whole meeting about that back in 2013 with great and controversial discussion :-) And since the new "Best Practices" book for development with symfony has emerged, I think it's time to write a little something myself.

Basics

So far we've seen a lot of developers using bundles to structure their applications, which is no bad thing per se. Especially when starting to develop with symfony2 for the first time, it is not that easy to fully grasp the abstractness of the architecture and structure code accordingly. The basic idea of a bundle is to have a container that puts all framework-related or -coupled stuff in a dedicated place where it can be found and interpreted. That's it! Framework stuff! A lot of people start packing "utilities", models, ... into a "CoreBundle" or "AppBundle" and that's not right (from an architectural point of view). To solve this misery there are a few approaches and I am going to discuss two of them.

Scenario

I assume the following scenario, which I think is the most common one:

  • we have an application we know to most likely stay with symfony2 as framework
  • we are likely to have a lot of business logic
  • the application uses a database and therefore entities and repositories

The "one bundle to rule them all" approach

This is my preferred approach ;-) We're using exactly one bundle, no vendor namespace - I love calling it the "CoreBundle" - and place a second folder into the <project>/src/ folder, called "Core" or whatever you want to name your business logic container. Within this container, I usually place every part of my business logic that I do not get as an external library via packagist. To access this stuff, I create service definitions within the CoreBundle. I also tend to place all controllers within the bundle, since they are tightly coupled to the framework as well.

Structure:

project_root
 |- app
 |  |- ...
 |
 |- src
 |  |- CoreBundle
 |  |- Core
 |    |- subnamespace1
 |    |- subnamespace2
 |    |- ...
 | 
 ...

In my opinion this looks quite concise, and I also love the extra part of the namespace that contains all my business logic (Core).

Benefits:

  • The overall directory structure is easy to grasp for people working with sf2
  • No changes concerning class-loading are needed, everything works out of the box

The "no bundle" approach

Toni (@havvg) is really fond of getting rid of all subfolders and bundles, so he suggests packing all code flat into the src/ directory. You can find a sample project skeleton on GitHub.

Structure:

project_root
 |- app
 |  |- ...
 |
 |- src
 |  |- Controller
 |  |- Entity
 |  |- Repository
 |  |- ....
 | 
 ...

As you can see, this flattens the project structure considerably and also allows for nice DDD. The downside: You have to override e.g. the application kernel (AppKernel) so that it can find controllers and locate all the other stuff. For the people asking themselves "where are the templates and other files?!": They need to be placed in app/Resources/.

What does this mean for my OSS bundles?

I know, there are loads of bundles around in symfony-world. And lots of them have business logic within their directory structure. If you want to de-clutter this mess, try the following approach: Pack all the relevant business logic into a framework-decoupled package (library!), promote it on packagist and then write a bundle to make the library easily accessible via frameworks like symfony. Read this article for a nice approach to make your bundle framework-interoperable!
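
The split between library and bundle can be sketched in a few lines. This is a language-neutral illustration in Python, not symfony code: `SlugGenerator`, the minimal `Container` and the service name `core.slug_generator` are all hypothetical. What matters is that the library part has no framework imports at all and the "bundle" part does nothing but wire it up.

```python
import re


# --- library: no framework imports, could live in its own package ---
class SlugGenerator:
    """Pure business logic: turn a headline into a URL slug."""

    def slugify(self, text):
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


# --- "bundle": the only place that knows about the container ---
class Container:
    """Hypothetical minimal service container standing in for a framework."""

    def __init__(self):
        self._factories = {}
        self._services = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def get(self, name):
        # Services are created lazily and shared afterwards.
        if name not in self._services:
            self._services[name] = self._factories[name]()
        return self._services[name]


def register_services(container):
    """What a bundle's service definitions boil down to."""
    container.register("core.slug_generator", SlugGenerator)


container = Container()
register_services(container)
print(container.get("core.slug_generator").slugify("How to Bundle!"))  # how-to-bundle
```

Swapping the framework then means rewriting `register_services` and nothing else; the library and its tests stay untouched.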

Tags: php, symfony2, software architecture