Jump forward 2 months and we found out we were accepted, followed by 2 months of planning and cramming 18 months of project work into a 30-minute presentation. Before we knew it we were flying out to Vienna from Sydney, Australia for a quick 3-day trip to WordCamp Europe - the largest WordCamp event in the world.
Below is the talk we presented, along with a transcript, on how we scaled WordPress to support over 3 million posts across 22 sites, with over 16 million users a day.
Transcript
00:00
18 months ago, News Corp Australia started its journey migrating all of its websites over to WordPress. Both Juan here and myself were on the software engineering team that was part of that migration project. At the start of this project we often heard that WordPress doesn't scale. Well, our talk today is to prove that wrong, and to demonstrate that WordPress provides a really good base. It's a really solid platform, and if you implement the correct technical solutions to extend WordPress, it can scale really well for large and high volume traffic websites like ours.
00:48
Historically people may have interpreted News Corp Australia to be a little like this, a 90 year old company, potentially focusing a lot on the newspaper side of the business. But over the past 18 months, a few of the people you can see in this photo, plus a lot more in News Corp, migrated 22 of Australia's largest sites including news.com.au, theaustralian.com.au & foxsports.com.au over to WordPress VIP. We've imported over 3 million posts, and are receiving around 16 million users a day.
01:26
Why did we move to WordPress VIP? First of all, what is WordPress VIP? WordPress VIP is an enterprise level hosting option for WordPress sites that is powered by wordpress.com. We needed to escape our old CMS, which was unscalable and very expensive to maintain. WordPress was chosen because it is a very reliable tool, and one that most of our journalists are already familiar with, making training and upskilling across thousands of journalists much easier. We just needed to find a way to scale WordPress to meet the needs of our business.
02:06
Lastly, and possibly the best part, with WordPress VIP we didn't need to worry about this. wordpress.com manages our production infrastructure, so updates to core get deployed directly, and we don't need to worry about updating core or applying server patches. Our main focus is developing the best new features for our customers.
At a high level we have developed 45 WordPress plugins and 22 themes. Apart from our production infrastructure, we run 58 AWS non-production environments, which are a combination of EC2, RDS and SQS, where we try to mimic the WordPress.com environments as closely as technically possible. From a deployment perspective, we run 25 deployments to production each week through our CI/CD platform, and we love our unit tests: we have over 5000 and counting.
03:21
Our journey started in November 2014. When we look back at those 18 months, there are 5 key factors that really challenged us. We wish we had known about these before we started the project. However, encountering these challenges and discovering their solutions was part of our journey, and actually a great team building exercise. I'm not going to lie though, it was a tough, tough journey.
We wanted to share our story for 2 main reasons. First, to demonstrate that WordPress can scale for the enterprise, and secondly, to share the challenges we faced and the solutions we implemented.
4:03
We are going to focus on 5 key areas: site build; front-end; content ingestion; authoring - how we are creating and updating new stories; as well as continuous integration and deployments. We are going to talk about the challenges we faced and the solutions we implemented.
4:24
The first talking point: site build. We really needed the simplest way for our WordPress admins to manage and organise the content on our sites. Customizer was a great start to this, it provided a really good base, but we needed to extend it, and this is the functionality we are going to take you through. Working really closely with the XWP team, who I can see a lot of in the front here, we added a lot of functionality into Customizer, a fair bit of which has been moved into WordPress core. Here is a short video demonstrating just a few of the key features that have been added, some of which may now already exist in WordPress core.
05:02
All of our sites utilize multiple content areas - that is pretty standard stuff. Within each of our content areas, we have this concept of default widgets, also known as global widgets, which will render on every single page. We are currently on the business page, and we are looking at the right hand rail, which is just a list of stories. If we now go to the news section in the navigation, you will notice that the widgets are actually the same widgets we just saw. So we can spin up a lot of pages that have a very similar structure very quickly. Then to extend that, there is a plugin we developed which allows you to override the widgets on a specific page and content area. We are on the homepage now, and checking the 'localise to the current page' checkbox allows you to add a completely different set of widgets to a specific content area for that specific page.
05:54
We call these contextual settings, and they are stored as a JSON data structure in a custom post type in WordPress. The next thing you are going to see is our homepage, and that is the next big challenge we had. We have a lot of widgets that keep scrolling and scrolling, it's maybe a bit too much content, and that's what caused us a few challenges. By default WordPress, as I'm sure a few of us know, stores all of its widget data in sidebars, which are then stored in WordPress options. Because we have a huge, huge homepage, the amount of data we were storing in WordPress options was more than 1 MB. We leverage Memcache quite extensively, and 1 MB is unfortunately the limit in Memcache, so as soon as any key has more than 1 MB of data, it's no longer getting cached. This meant that for every single page request we weren't caching WordPress options, which was a huge performance issue for us.
Another plugin was developed, which we called widgets-plus. Essentially it's a new custom post type that stores all of the widget data. This means that we can store everything relevant to a single widget in isolation, and cache it in isolation; we don't need to worry about caching everything as part of WordPress options. Within our previous CMS it took us hours, sometimes a day, to build a relatively simple page. Within WordPress, and we only showed you a very small part of it, we can build really complex pages in literally a few minutes. It's a huge efficiency boost for our business.
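To make the widgets-plus idea concrete, here is a minimal Python sketch of the caching problem and the fix. It is an illustration only, not the plugin's actual code: the `SizeLimitedCache` class, the key names and the widget sizes are all hypothetical, and the 1 MB figure is simply the per-item limit mentioned in the talk.

```python
import json

CACHE_ITEM_LIMIT = 1024 * 1024  # assumed 1 MB per-item limit, as quoted in the talk


class SizeLimitedCache:
    """Toy in-memory cache that refuses any value larger than the item limit."""

    def __init__(self, limit=CACHE_ITEM_LIMIT):
        self.limit = limit
        self.store = {}

    def set(self, key, value):
        payload = json.dumps(value).encode("utf-8")
        if len(payload) > self.limit:
            return False  # too big: the value is never cached
        self.store[key] = payload
        return True

    def get(self, key):
        payload = self.store.get(key)
        return None if payload is None else json.loads(payload)


def make_widgets(count, body_size):
    """Build hypothetical widget settings, keyed by widget id."""
    return {
        f"widget-{i}": {"title": f"Widget {i}", "body": "x" * body_size}
        for i in range(count)
    }


def cache_as_single_option(cache, widgets):
    """Original approach: every widget lives inside one big options blob."""
    return cache.set("alloptions", widgets)


def cache_per_widget(cache, widgets):
    """widgets-plus style: each widget is its own record and its own cache entry."""
    return all(cache.set(f"widget:{wid}", data) for wid, data in widgets.items())
```

With around 300 widgets of roughly 5 KB each, the single blob exceeds the limit and is rejected, while each individual widget fits comfortably and can be cached and invalidated on its own.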
07:24
After a few months we had a solution for site build. Now the front-end architecture was really key for us. We needed to make sure that when all of the sites moved across into WordPress, we weren't going to end up with a mess of spaghetti front-end code.
07:38
I'm sure all of you here who, like us, are developers (front-end or backend) know that PHP as a templating language isn't necessarily the best approach. It gets really messy really quickly. We needed a templating engine to ensure we decoupled our front-end code from our backend. Twig was our templating engine of choice, and there are 2 plugins that power this implementation. The first one is called vip-twig, and it is responsible for integrating Twig into WordPress core - it is a very standalone plugin. Our next plugin, template-integration, is responsible for exposing all of those core WordPress functions to our Twig templates. It basically bridges the gap between WordPress core and the data our templates need to render posts on the front-end. The good part about this is that if we need to implement a new templating language because Twig is no longer the language to use, we can keep our template-integration plugin as it's all reusable, and simply build a new vip-smarty or vip-ejs plugin to manage the integration for that new templating engine.
08:40
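The split between vip-twig and template-integration can be pictured as a data bridge plus a swappable engine adapter. The Python sketch below is only an analogy under that assumption: it uses two standard library formatting styles in place of Twig and any future engine, and the function names and post fields are hypothetical.

```python
from string import Template


def build_post_context(post):
    """Bridge layer: turn a raw post record into plain template data.

    This plays the role the talk gives to template-integration; the field
    names are hypothetical.
    """
    return {
        "title": post["title"].strip(),
        "author": post.get("author", "Staff writers"),
        "url": f"/{post['section']}/{post['slug']}",
    }


class DollarEngine:
    """Adapter for string.Template ($name placeholders)."""

    def render(self, source, context):
        return Template(source).substitute(context)


class BraceEngine:
    """Adapter for str.format ({name} placeholders)."""

    def render(self, source, context):
        return source.format(**context)


def render_post(engine, source, post):
    """Render a post with whichever engine adapter is passed in."""
    return engine.render(source, build_post_context(post))
```

Swapping `DollarEngine` for `BraceEngine` changes only the template syntax. `build_post_context` is untouched, which is the reuse the talk describes for template-integration.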
We have now built our sites, and we have a templating engine in place. Now we need content. We had been working on content ingestion functionality for around 6 months in parallel, and in May 2015 we were moving into beta testing in a VIP production environment, planning on launching our first site in June 2015. As you can see from our slides, the timeline is nowhere near the far right of the screen - yes, we ran into some problems.
09:14
In our first attempt we ran into 2 major problems. First, our posts are authored in a different platform, not in WordPress, and we had a benchmark of 60 seconds for content to be published to the WordPress front-end from the moment it was saved. Secondly, we were seeing some rare race conditions in content ingestion, which caused content to be duplicated on our WordPress sites.
As well as the above 2 major problems, when we moved into production we needed to do an initial content import of around 300,000 posts per site. We didn't have MySQL access as our sites are hosted on wordpress.com - we couldn't expect them to give us access. It was a challenge to import that amount of information. Lastly, after that we would have a constant stream of data updates, including news stories, videos and photos, coming into our sites at a rate of around 16 updates per second.
10:28
We needed an architecture to support this. Here is a simplified diagram (it is a very simplified diagram) of what we have implemented. It can be read from bottom to top. First, a journalist will create or update a story in one of our editorial tools. This post will then be sent to our API platform, which manages all of our content. It will organise it, categorise it, and make it available for other platforms to consume, WordPress being one of those platforms. On top of that, as soon as a message arrives in the API platform, it will send a notification to our SQS system in WordPress, informing it of the new story and telling it to 'start' ingesting.
11:18
With the help of the WordPress VIP team, we developed an ingestor daemon that we call Turbo. In the beginning ingestion was built using WordPress crons, but wp-cron was not meant to work in that way. Like I said before, we were doing 16 updates a second and wp-cron was not coping with that; it was not fast enough. We needed to create this multi-threaded daemon. This whole process that you are seeing is what we call 'end to end publishing'. With this architecture in place, we have a single entry point for content imports, whether we are doing 300,000 updates in an initial import, or 16 updates a second. With SQS we have also reduced the amount of duplicate content, actually it is now at zero. And finally we can ingest content in less than 60 seconds: we are basically ingesting content in less than 10 seconds, and as soon as a journalist saves a story it appears on our site within about 6 seconds.
12:38
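The shape of that pipeline, a queue of notifications drained by several workers through a single entry point, can be sketched in Python as follows. This is not Turbo itself: an in-process `queue.Queue` stands in for SQS, `PostStore` stands in for the WordPress database, and the upsert keyed on a content id and version is an assumed way of avoiding duplicates, since the talk does not describe the exact mechanism.

```python
import queue
import threading


class PostStore:
    """In-memory stand-in for the WordPress posts table."""

    def __init__(self):
        self._lock = threading.Lock()
        self.posts = {}

    def upsert(self, message):
        """Insert or update by content id, so a repeated id never creates a duplicate."""
        with self._lock:
            current = self.posts.get(message["id"])
            if current is None or message["version"] > current["version"]:
                self.posts[message["id"]] = message


def worker(notifications, store):
    """Consume notifications until a None sentinel arrives."""
    while True:
        message = notifications.get()
        if message is None:
            notifications.task_done()
            break
        store.upsert(message)
        notifications.task_done()


def ingest(messages, worker_count=4):
    """Push messages through one queue that is drained by several worker threads."""
    notifications = queue.Queue()
    store = PostStore()
    threads = [
        threading.Thread(target=worker, args=(notifications, store))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for message in messages:
        notifications.put(message)
    for _ in threads:
        notifications.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return store.posts
```

Because every write goes through `upsert`, the same code path serves a bulk import and a live stream of updates, and a message delivered twice or out of order cannot create a second copy of a story.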
Another challenge that we faced was the number of fields and the amount of data that represent a 'NewsStory', or a WordPress post, in our business. This is stored in our APIs, and this is the object we get. What you can see is a trimmed down version of how our JSON object looks; it's massive, it's a really big object. At the beginning we thought, let's split this into pieces, save everything into post meta fields, and build them up again to send writes back to our APIs. However, we thought, why re-invent the wheel? If we are already using this JSON object across all our other platforms, let's just save it in our WordPress database and read it from there. So we decided to store this JSON object in the post_content field in the posts table. With this architecture, we needed to change how our templates and plugins work with this field. We are no longer reading it as a text field, but as a JSON object.
Saving the JSON object here allowed us to have revisions of an entire post object. Also, when rendering a post on the page, we can do it with a single database call. Finally, we kept a simple rule: we would only use post_meta for searchable parameters.
All this gave us a huge performance benefit, more than halving the number of database calls.
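The storage rule described above, the whole story as one JSON document plus meta rows only for searchable fields, can be sketched with SQLite in Python. The table layout loosely mirrors the WordPress posts and postmeta tables, but the schema, the story fields and the choice of searchable fields are simplified assumptions for illustration.

```python
import json
import sqlite3

SEARCHABLE_FIELDS = ("section", "status")  # hypothetical searchable parameters


def create_schema(db):
    """Create simplified stand-ins for the posts and postmeta tables."""
    db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, post_content TEXT NOT NULL)")
    db.execute("CREATE TABLE postmeta (post_id INTEGER, meta_key TEXT, meta_value TEXT)")


def save_story(db, story):
    """Store the whole story as JSON, plus meta rows for searchable fields only."""
    cursor = db.execute(
        "INSERT INTO posts (post_content) VALUES (?)", (json.dumps(story),)
    )
    post_id = cursor.lastrowid
    for key in SEARCHABLE_FIELDS:
        if key in story:
            db.execute(
                "INSERT INTO postmeta VALUES (?, ?, ?)", (post_id, key, str(story[key]))
            )
    return post_id


def load_story(db, post_id):
    """Rebuild the full story object with a single query."""
    row = db.execute(
        "SELECT post_content FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return None if row is None else json.loads(row[0])


def find_by_meta(db, key, value):
    """Look posts up through the searchable meta rows."""
    rows = db.execute(
        "SELECT post_id FROM postmeta WHERE meta_key = ? AND meta_value = ? ORDER BY post_id",
        (key, value),
    )
    return [row[0] for row in rows]
```

Rendering needs only `load_story`, one query however many fields the story has, while `find_by_meta` covers the lookups that still need an indexed, queryable value.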
14:19
As you can probably guess, June came and passed and we didn't launch our sites. Instead they stayed in beta testing for a few months. Our internal development team, the XWP team and the WordPress VIP team worked for around 2 months to implement solutions to all the problems we just discussed, and one night in September, a very, very late night, we launched our first site on WordPress VIP. It was a night filled with anxiety, way too much pizza, and then a sigh of relief when it switched on and everything just worked.
By now we had implemented a simple site build approach, our front-end was clean, and we had content quickly ingesting into our WordPress sites.
15:21
Now we needed to use WordPress for what it is meant to be used for: editing and creating content. We developed these authoring screens within WordPress that expose the JSON data structure that defines our news story content, so people can edit it in the simplest way. As you can see we leveraged meta boxes quite extensively, and used CSS and JavaScript to provide the look and feel we were after. We tried not to deviate from WordPress core at all, so that any updates that came through to WordPress.com would not affect us.
15:58
Here is another image of one of our authoring screens to demonstrate what we are doing for image management within posts. We are leveraging the media element in WordPress and allowing authors to edit/crop images within the UI.
16:15
Our category management is also a little different. Our stories have to have the URL of the specific category they belong to in the permalink. So we needed to add a primary category (or section, as we call it) that authors have to select when creating a story. You can then put that story in multiple categories if needed.
16:48
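As a small illustration of the primary section rule, the Python sketch below builds a permalink from a required primary section. The URL pattern and the field names are hypothetical; the talk only says that the primary category's URL appears in the permalink.

```python
def build_permalink(story):
    """Build a permalink from the required primary section and the story slug.

    Extra categories may be attached to the story, but only the primary
    section is used in the URL.
    """
    primary = story.get("primary_section")
    if not primary:
        raise ValueError("a primary section is required")
    return f"/{primary}/{story['slug']}/"
```

A story filed under both business and news, with business as its primary section, would get a single canonical URL under business.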
In the first week of February 2016, we launched our last 4 sites into wordpress.com, all in just one week. And everything rolled out very, very smoothly. Then gradually over the next 4 months, until pretty much last week, all the authoring functionality that Juan just showed you rolled out into a production environment.
17:08
So we needed to maintain this complex WordPress platform that we had set up. We also needed to ensure that, at all costs, it was impossible for any cowboy coders to rise up and try to take control of our codebase.
For this it was really critical that we had a solid continuous integration and deployment platform set up. We needed to make sure that a developer can get his or her code from their local environment all the way into production in the most stable and risk free way.
17:31
For us it comes down to feature based branching and deployments. Every feature sits within its own feature branch, branched off master in Git. It's always developed in isolation, it's always tested in isolation, it's always deployed in isolation. Our team chose to never bundle up more than one feature and deploy them at once, as for us this added too many risks and too many dependencies.
Our team then utilizes pull requests to control any code merges into master. This is where code reviews happen, which is one of our favourite pastimes. Every single line of code is reviewed by another developer. No single line of code reaches production without being code reviewed by at least 1, and sometimes 2 or 3, other developers in the team.
18:14
We then utilize Atlassian's Bamboo continuous integration platform to manage all of our automation. This is where, on every code commit, we run PHPCS to make sure all of our coding standards are being met, and we run PHPUnit to make sure all of our unit tests are passing and there are no regression issues we are aware of. Once all of those tests pass we package up the application, running any front-end build tasks in gulp or grunt, we compile our Twig templates, then we deploy that out to our 50+ non-production environments.
This is when testing happens, either manual regression testing, or through our suite of Selenium based automation tests using the Robot Framework. Once all manual and automated tests have passed we deploy this feature through our continuous integration platform all the way into the WordPress VIP environment. We celebrate with a few beers, and then we start that process all over again for the next feature.
The best part is, this isn't the process we started with 18 months ago. This evolved over countless retrospectives, team huddles, and big issues with our previous processes, to what it is today.
19:48
With a solid continuous integration and deployment platform set up, the only other challenge we faced was that with WordPress VIP you cannot control the exact time your code is deployed to production, and if your feature has changes in 3, 4 or 5 different code repositories, you can't guarantee the order in which the code in those repositories is deployed.
19:32
We needed a way to be able to turn features on and off in a production environment. Welcome to the feature toggle. This was developed by one of our developers, Roman, back in Australia, and it is an implementation of the feature toggle technique by Martin Fowler. It abstracts all of the complexities around feature toggling. So with just a few lines of code you can set up a new feature, you can give it a nice descriptive name, and in the last parameter (the boolean) you can decide if it is active or deactivated by default.
20:00
Then all of our themes and plugins can check whether this feature is enabled or not. If it is enabled you can execute a certain chunk of code. Lastly, this plugin exposes an admin page in WordPress that lists all of the features we currently have available to us and their activated or deactivated state, and this is where you can toggle a certain feature on or off. For our really big features, this feature toggle has allowed us to roll them out to all of our sites from a deployment perspective, and then gradually toggle them on one site at a time so we can stagger the regression testing required.
20:32
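The feature toggle described above can be sketched in a few lines of Python. This is not the plugin's real API: the class, the method names and the feature key are hypothetical, and the sketch only mirrors the behaviour the talk describes (register with a description and a default boolean, check from any theme or plugin, list and flip from an admin page).

```python
class FeatureToggles:
    """Registry of named features, one instance per site."""

    def __init__(self):
        self._features = {}

    def register(self, key, description, default_active=False):
        """Declare a feature; the final boolean is its default state."""
        self._features.setdefault(
            key, {"description": description, "active": default_active}
        )

    def is_enabled(self, key):
        """Unknown features count as disabled, so code can deploy in any order."""
        feature = self._features.get(key)
        return bool(feature and feature["active"])

    def set_active(self, key, active):
        """Flip a feature on or off, as the admin page would."""
        if key not in self._features:
            raise KeyError(key)
        self._features[key]["active"] = active

    def listing(self):
        """Rows for an admin page: key, description and current state."""
        return [
            (key, feature["description"], feature["active"])
            for key, feature in sorted(self._features.items())
        ]
```

Guarding new code with `is_enabled` means it can ship to every site switched off, whatever order the repositories deploy in, and then be turned on for one site at a time.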
So we have come to the end, and what have we really learnt? First of all, as Juan pointed out, migration projects are hard, hard work. No matter how much planning you do, you are always going to run into unknowns. For us it was really critical to get our code into a beta version on the WordPress VIP infrastructure, so we could test our end to end solutions in their environment, rather than our non-production setup.
To all those people that said to us that WordPress isn't going to scale for what we are doing: well, if you just download WordPress core, install it on a single server and try to support 16 million users, then no, it isn't going to work. But if you implement the correct solution architecture, with things like Memcache, Elasticsearch and asynchronous content ingestion, it scales really well for sites like ours.
21:21
Lastly, there was our partnership with two really awesome teams. First, XWP, who really put a lot of effort in at the start of our project; their expertise in WordPress was absolutely priceless for us. They were pivotal in our project, especially in the first 12 months, and a lot of us have become really good mates.
Secondly, the WordPress VIP team: they just kept going and going, just like the Energizer Bunny, and they never stopped no matter what challenges and issues we threw their way.