Wednesday, January 29, 2014

Optimization Project Update

E5: Speed Optimizations


  1. Pages are nearly twice as fast, and the ‘feel’ of the site is roughly 3X what it was. 
  2. If you notice certain pages not updating, just bust-a-cache: http://doxxx.com/editors/sass/edit#cache

The Long Story

Dev has spent around 75% of the past two weeks focusing on our Episode 5 project to speed up our new platform, and we have some fantastic results to report. Before we started the project, we looked through our ‘Fancy Tool’ (New Relic reports, if you’re interested) and pulled out some stats to see how we were performing pre-project. Over the course of about a week prior to starting, here’s what it looked like:
  • Average Backend Server Time (over the entire site): 990 milliseconds 
  • Average Load Time (over the entire site) in the browser: 8.18 seconds 
The next step was to identify the easy targets for optimization, again using New Relic drill-downs on the backend and Chrome’s DevTools on the front end. After two weeks of tweaking and fixing, working in between SX 2014 projects and several other bug fixes, we’re super excited about the results. Over a 6-hour period today, our stats look pretty sweet:
  • Average Backend Server Time (over the entire site): 299 milliseconds 
  • Average Load Time (over the entire site) in the browser: 4.51 seconds 
That works out to roughly a 70% drop in backend response time and a 45% drop in browser load time.

Here are some things we implemented over the course of the project and a few more details in case you’re curious about what’s changed:

Caching Layers

We've implemented a multi-layer cache setup, and we plan to grow and refine it as we proceed to the moon. Read on for what these layers are.

The first layer we put in is a '304 Not Modified' layer. This is a high-level cache for our repeat visitors. Essentially, when you hit a page you've seen before, your browser asks our servers whether anything has changed; if this cache is working, the server answers with a tiny '304 Not Modified' response and your browser re-uses its saved copy, letting our app servers rest. This is implemented in tons of places throughout the site, and is easily bypassed by doing a 'hard refresh' (Shift+refresh on Windows, Cmd+Shift+R on Mac).
This cache is set to expire on two conditions:
  1. If the 'page in question' is modified - say it's an event and the event is edited and saved, or it's a venue page and it's edited and saved.
  2. Every evening at switchover time (4am your time).
This means that, for example, if you hit a venue page and look at it, then add events to that venue via Admin, then reload the venue page, you will still get the cached page. Because you have seen the page recently, you would have to 'hard refresh' it to see the new events. Note: we will be adjusting these caches to bust when events are added or removed, and/or tune the cache times.
Every person coming to the page for the first time since the cache was refreshed (on venue update or at the evening switchover) will see the current content.

This is a simple high level cache layer meant to speed up the site for our repeat users.
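For the curious, here's roughly how the '304 Not Modified' dance works on the server side. This is a minimal sketch in TypeScript using an Express-style route - the route path, the lastModifiedFor helper, and the placeholder timestamp are illustrative assumptions, not our actual code.

```typescript
import express, { Request, Response } from "express";

const app = express();

// Hypothetical lookup: when did this venue page last change?
// In reality this comes from the database / cache metadata.
function lastModifiedFor(venueId: string): Date {
  return new Date("2014-01-29T04:00:00Z"); // placeholder value
}

app.get("/venues/:id", (req: Request, res: Response) => {
  const lastModified = lastModifiedFor(req.params.id);

  // The browser sends back the timestamp we stamped on the page last time.
  const ifModifiedSince = req.get("If-Modified-Since");
  if (ifModifiedSince && new Date(ifModifiedSince) >= lastModified) {
    // Nothing has changed: send an empty 304 and let the browser
    // re-use its cached copy. The app server does almost no work.
    res.status(304).end();
    return;
  }

  // Otherwise render the full page and stamp it so the next
  // request can be answered with a 304.
  res.set("Last-Modified", lastModified.toUTCString());
  res.send("<html>...full venue page...</html>");
});
```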

Layer 2 is a bit more involved. This is a lower-level cache on things like the header, footer, layout, specific event data, etc.
These caches are cleared automatically when a related item is edited in Admin. For example, if you add an “Update” in Admin, the header and footer caches will automatically clear, ensuring the latest posts get out instantly.
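As a rough illustration of how this layer behaves (a sketch with made-up key names and an in-memory Map standing in for the real cache store): fragments like the header are stored under a key, and saving a related item in Admin simply deletes that key so the next page view rebuilds it fresh.

```typescript
// Minimal in-memory stand-in for a fragment cache.
const fragmentCache = new Map<string, string>();

// Read-through helper: return the cached fragment, or build and store it.
function fetchFragment(key: string, build: () => string): string {
  const cached = fragmentCache.get(key);
  if (cached !== undefined) return cached;
  const html = build();
  fragmentCache.set(key, html);
  return html;
}

// Rendering the layout pulls the header fragment from the cache.
function renderHeader(): string {
  return fetchFragment("fragment:header", () => {
    // Expensive work: query the latest Updates, render the header partial, etc.
    return "<header>...latest updates...</header>";
  });
}

// Saving an "Update" in Admin clears the related fragments, so the
// very next page view rebuilds them with the new content.
function afterUpdateSaved(): void {
  fragmentCache.delete("fragment:header");
  fragmentCache.delete("fragment:footer");
}
```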

If for whatever reason these caches are giving you grief, please try the following:
  1. 'Hard' refresh the page (Shift and refresh on Windows).
  2. Check out the brand new tab in your site's SASS editor called 'Cache Busters'. Here you can click a button and bust the cache for various parts of the site.
  3. Email us!
We're currently adjusting the timings on both layers of caching, and we will be adding more and more cache busters to the editor. Since we launched the new cache layers, site performance has improved dramatically, so we're hoping we can find a balance between caching data and keeping things fresh where they need to be.
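Behind each of those buttons is essentially a 'delete these cache keys' request. The sketch below is hypothetical - the endpoint path and key prefixes are made up - but it's the general shape of what a Cache Busters button triggers.

```typescript
// Hypothetical call behind a Cache Busters button: ask the server to
// drop every cached fragment whose key starts with the given prefix.
async function bustCache(prefix: string): Promise<void> {
  await fetch(`/admin/cache/bust?prefix=${encodeURIComponent(prefix)}`, {
    method: "POST",
  });
}

// e.g. a "Header & Footer" button might be wired to:
// bustCache("fragment:header"); bustCache("fragment:footer");
```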

Optimizing Specific Heavily Used Calls

According to our stats, the Featured Events, Featured Venues, and User pages (and a few others) were all being hit heavily and not performing well. We spent a few days tweaking the way these work and increased their speed dramatically. You might have noticed last week, when we pushed out the Featured Event and Featured Venue changes, that everything sped up! Not only did we add cache layers (bustable via the instructions above), but we trimmed the results down by removing some unneeded JSON, and we made it so your Featured Venue widget only loads one venue, and only when it’s needed (previously, we loaded all Featured Venues with a single call even though a visitor only sees one at a time).
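The Featured Venue change boils down to this pattern (sketched below with a made-up endpoint and field names): instead of fetching every featured venue up front, the widget asks for a single venue only when it's about to be shown, and the response carries just the fields the widget actually renders.

```typescript
// Hypothetical trimmed-down response: only the fields the widget displays.
interface FeaturedVenue {
  id: number;
  name: string;
  imageUrl: string;
}

// Old approach (for contrast): one call returned *all* featured venues,
// even though a visitor only ever sees one at a time.
// const all = await fetch("/api/featured_venues").then(r => r.json());

// New approach: ask the server to pick one venue, and only do it when
// the widget is actually about to render.
async function loadFeaturedVenue(): Promise<FeaturedVenue> {
  const res = await fetch("/api/featured_venues/one"); // illustrative URL
  return res.json() as Promise<FeaturedVenue>;
}
```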

User pages were a bit harder to fix, but some SQL (database) optimization, coupled with some caching, has us down to an average response time of 500ms, compared to 1500-2000ms two weeks ago.
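For flavor, the user-page work was along these lines - a sketch with hypothetical table, column, and key names, not our actual schema: add an index so the hot query stops scanning the whole table, and cache the rendered page for a short window.

```typescript
// Hypothetical migration: the hot user-page query filters by user_id and
// sorts by date, so a composite index lets the database skip the full scan.
const addIndexSql = `
  CREATE INDEX index_events_on_user_id_and_starts_at
  ON events (user_id, starts_at DESC);
`;

// On top of the faster query, the rendered page is cached for a short
// window so repeat hits skip the database entirely.
const userPageCache = new Map<string, { html: string; expiresAt: number }>();

function getCachedUserPage(userId: number): string | undefined {
  const entry = userPageCache.get(`user-page:${userId}`);
  if (entry && entry.expiresAt > Date.now()) return entry.html;
  return undefined;
}

function putCachedUserPage(userId: number, html: string, ttlMs = 5 * 60_000): void {
  userPageCache.set(`user-page:${userId}`, { html, expiresAt: Date.now() + ttlMs });
}
```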

Updating the Load Order of the JavaScript

In addition to removing and optimizing our JavaScript, we tweaked the load order of all the page bits for what we think is the best user experience. Our new design relies heavily on asynchronous data loading - the core of the page loads and renders as fast as possible, then other data is loaded ‘after’ the page loads - which ensures our users see what they want to see quickly, and the whole site feels more responsive. Even though our overall page load averages 4.5 seconds, users will see the page much sooner!

The first priority is to display the page, then the ‘current user’ data is fetched and applied, and finally the brand ads, featured events, and featured venues are loaded. This means people can start viewing and interacting with the site much sooner than the 4.5-second average, while the extra data loads in the background.
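In browser terms, the load order looks something like the sketch below. The endpoints and render functions are illustrative stand-ins, not our real API; the point is that none of this blocks the initial render.

```typescript
// Stubs standing in for the real rendering functions.
declare function applyCurrentUser(user: unknown): void;
declare function renderAds(ads: unknown): void;
declare function renderFeaturedEvents(events: unknown): void;
declare function renderFeaturedVenue(venue: unknown): void;

// Runs after the core page has rendered; nothing here delays first paint.
window.addEventListener("load", async () => {
  // 1. Current-user data (login state, personalization) first...
  const user = await fetch("/api/current_user").then((r) => r.json());
  applyCurrentUser(user);

  // 2. ...then the lower-priority extras, fetched in parallel.
  await Promise.all([
    fetch("/api/brand_ads").then((r) => r.json()).then(renderAds),
    fetch("/api/featured_events").then((r) => r.json()).then(renderFeaturedEvents),
    fetch("/api/featured_venues/one").then((r) => r.json()).then(renderFeaturedVenue),
  ]);
});
```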

Optimizing Images and Ensuring Multiple CDN Loads

This is something we did on the old site but hadn’t implemented on the new site. When a browser hits a page, there is a limit - unique to each browser - on the number of things it will load from a specific domain name at the same time.

Let’s say a page on the site has 30 images, and they’re all linked via the same domain. Your browser will grab them 5 at a time, which means it asks for 5 images, waits until they load, then asks for 5 more, and so on until they finish.

What we did was implement code that spreads the image URLs (we’re using Cloudinary for our images) across a randomish set of 5 domains. This means that when the browser sees 30 images, they’re spread over 5 domains, and it can load 25 at the same time instead of 5. The result is faster page completion times.
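The domain splitting itself is just a deterministic mapping from an image path to one of five hostnames, roughly like this sketch (the res-N hostnames are placeholders, not our real CDN domains). Hashing on the path, rather than picking a domain at random per request, keeps each image on the same domain so browser caching still works.

```typescript
// Five placeholder CDN hostnames; in practice they all serve the same images.
const CDN_HOSTS = [
  "res-1.cloudinary-cdn.example",
  "res-2.cloudinary-cdn.example",
  "res-3.cloudinary-cdn.example",
  "res-4.cloudinary-cdn.example",
  "res-5.cloudinary-cdn.example",
];

// Cheap string hash so the same image path always maps to the same host.
function hashPath(path: string): number {
  let h = 0;
  for (let i = 0; i < path.length; i++) {
    h = (h * 31 + path.charCodeAt(i)) >>> 0;
  }
  return h;
}

// e.g. "/images/venue-42.jpg" always resolves to the same one of the five hosts.
function cdnUrl(path: string): string {
  const host = CDN_HOSTS[hashPath(path) % CDN_HOSTS.length];
  return `https://${host}${path}`;
}
```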

Blocking Bad Bots

We love Google and Bing, and we’re OK with the old Yahoo bot, but beyond that we determined that other bots provide little value, and take up too much time and bandwidth. We’ve implemented a bot whitelist for the folks we like, and we’re telling the rest to leave us alone. This keeps some of the questionable bots from scraping page after page of the site, eating up our server and bandwidth performance.
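Conceptually, the whitelist is a user-agent check at the front door, something like this simplified sketch (real bot verification also checks things like reverse DNS, which is left out here):

```typescript
import { Request, Response, NextFunction } from "express";

// Crawlers we welcome; anything else that self-identifies as a bot is refused.
const ALLOWED_BOTS = [/Googlebot/i, /bingbot/i, /Slurp/i]; // Slurp = Yahoo's bot
const LOOKS_LIKE_BOT = /bot|crawler|spider|scraper/i;

export function botFilter(req: Request, res: Response, next: NextFunction): void {
  const ua = req.get("User-Agent") ?? "";

  if (LOOKS_LIKE_BOT.test(ua) && !ALLOWED_BOTS.some((re) => re.test(ua))) {
    // Not on the whitelist: refuse before any real page work happens.
    res.status(403).send("Forbidden");
    return;
  }
  next();
}
```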

In Summation

We are in a fantastic place, speed-wise, after this project. We will continue to optimize here and there, but as for our Official Project for the Episode, I think we can declare it a success, close it out, and move on to the Next Big Thing.