

  • Caching for Noobs: A Real-World Example of Strategic Caching in ASP.NET MVC. A Breezy Overview of how Karma.fm Approaches Performance Tuning

    updated 3 weeks, 3 days ago 0 Member · 1 Post
  • Maverick

    Member
    October 27, 2019 at 4:48 pm

Overview

The purpose of this post is to show non-technical readers how rewarding it can be to solve technical problems in a real-world situation.

Background

When you load a Karma webpage, your browser sends a signal to the Karma App Server that says, “Hey App Server! Build me this page and send it back so I can display it to the user!” Karma’s App Server then sends a signal to the Karma database that says, “Hey, database, give me all of the data that this browser is asking for!” The Database Server then queries something that looks like an Excel spreadsheet to fetch the information that you’re requesting. Here’s a screenshot of some of Karma’s data:

[image: a table of Karma’s data]

The database collects the data and then responds to the App Server with a signal that ideally says, “Here’s the raw data, App Server!” The App Server then takes the plain old data and materializes it into a meaningful structure, sending the information that the browser needs in order to show you what you’re looking for. You can see this data in Chrome if you open up Developer Tools (F12 on Windows), navigate to the Network tab, and refresh. You’ll be able to observe the conversation that the browser has with Karma’s App Server as you use the application:

[image: Chrome Developer Tools, Network tab]

When the browser receives this information from the App Server, it renders it into the interface that you’re reading right now. Here’s what it looks like underneath the text you’re reading:

[image: the page’s HTML, viewed in Chrome Developer Tools]

You can view this from Chrome’s Developer Tools, too. It’s under the Elements tab. The request to display Karma’s content looks kind of like this:

[image: a numbered diagram of the request pipeline, from browser to App Server to Database Server and back]

Each of these numbers represents a potential performance bottleneck.

Performance Strategy

Technical readers: Karma sits on strategically denormalized SQL, though the thinking that follows applies to NoSQL as well.

As I develop Karma, my priority is agility.
I don’t know what I’m building or how it will evolve; I want to remain nimble and responsive to member feedback. As a developer, I have the ability to maximize productivity while sacrificing some other things. Performance is one of those things that I can sacrifice in order to remain highly productive.

For example: I’m relying on a class of tools called ORMs (Object Relational Mappers). These tools allow developers to ask the database for data without having to speak the database’s language. Let’s look at the Tribes page. One option for step 2 in the above image is to write the following code in a database language that requires database thinking:

[image: a hand-written SQL query]

Another option is to write the following code using an ORM that requires application thinking:

[image: the equivalent ORM query]

The second option is more human-readable and debuggable, which keeps me moving quickly. By abstracting the database thinking into application thinking, I can code in a language that more closely resembles plain English and depend on the ORM to generate the database language on my behalf. This abstraction allows me to trade performance for productivity; abstraction does tend to slow things down when you’re dealing with complex scenarios. Sometimes the database code that the ORM generates is less efficient than database code that a human can write.

When a particular Karma feature is slow, it’s usually because of a Step 2 bottleneck. There are many ways to address Step 2 bottlenecks. My general strategy is to rank the potential solutions by ROI and work my way up the list, targeting the solutions that require low effort and yield significant improvements, all while attempting to remain developer-friendly.
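For technical readers, the two options above can be sketched roughly like this. The original screenshots are not preserved, so the Tribe entity and its fields are hypothetical stand-ins, and an in-memory list plays the role of the database table so the snippet runs on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative entity -- not Karma's actual schema.
class Tribe
{
    public string Name = "";
    public int MemberCount;
}

class Demo
{
    static void Main()
    {
        // Option 1 (database thinking) would be hand-written SQL, e.g.:
        //   SELECT TOP 3 Name FROM Tribes ORDER BY MemberCount DESC;
        //
        // Option 2 (application thinking): the same query expressed in
        // LINQ. With an ORM like Entity Framework, "tribes" would be a
        // table-backed IQueryable instead of this in-memory list, but
        // the query reads the same either way.
        var tribes = new List<Tribe>
        {
            new Tribe { Name = "Writers", MemberCount = 12 },
            new Tribe { Name = "Coders",  MemberCount = 40 },
            new Tribe { Name = "Gamers",  MemberCount = 25 },
        };

        var top = tribes
            .OrderByDescending(t => t.MemberCount)
            .Take(3)
            .Select(t => t.Name)
            .ToList();

        Console.WriteLine(string.Join(", ", top)); // Coders, Gamers, Writers
    }
}
```

The ORM translates the LINQ expression into SQL on my behalf; whether that generated SQL is as efficient as the hand-written version is exactly the trade-off discussed above.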
For technical readers, this order of operations typically looks like this:

1. Index appropriately so I don’t have to change any application code
2. Write more efficient ORM queries in application code
3. Implement property caching to minimize impact on application code
4. Rearchitect the database model to favor high-performance ORM queries
5. Use a Micro-ORM that allows me to remain in application code while speaking database language
6. Write stored procedures that require me to write in database code

This is moving the needle rightwards on this spectrum:

[image: a spectrum running from natural “application thinking” to precise “machine thinking”]

Web application developers have to continuously calibrate along this scale as much as other software developers. Generally speaking, the more natural something feels, the more we’re outsourcing the complexity to the computer, and that comes at a cost. As our developer tools improve, we can increasingly depend on natural, nimble methods. I’d like to think that in the future, our developer tools will be so natural and efficient that we won’t ever need to write a line of application code. Until then, we have to strategically and surgically crawl our bottlenecks back towards more precise machine thinking.

Karma’s Current State

Karma is mostly on the far left side of the above spectrum right now. The vast majority of traffic is serving read-only content from the site (like this post), so the majority of my performance-enhancing efforts have been directed towards read-only queries, like the Karma homepage. This is why Commenting and Publishing are still a bit slow.

Let’s see how we’re doing with a Load Test of the karma.fm production homepage!

Pro-tip: Don’t do this at home. I’m load-testing a production application endpoint right now, which means that current users may experience performance degradation. The professional approach is to create a separate environment dedicated to load-testing; however, I’m working under time and cost constraints.
[image: load-test results for 20 simultaneous users]

This tells us that with a simulated load of 20 simultaneous users spanning three minutes, the average response time is half a second with a 100% success rate. This load test approximates the upper bounds of Karma’s current traffic profile. But… let’s assume we get a spike of traffic and see what happens with a 5x multiplier on the User Load dimension over 3 minutes:

[image: load-test results for the 5x user load]

The average response time more than tripled. Yikes. Half a second may seem decent, and 1.5 seconds may seem sort of acceptable… though 50 milliseconds is the gold standard of .NET web application development. That’s 10x faster than our fast case! This standard is set by the StackOverflow team, who build under an engineering culture that says “performance is a feature”. I fully endorse and admire this culture, but I also have to balance my performance aspirations against the fact that Karma is being developed by a single person on nights and weekends with a shoestring budget. So I need to try to milk as much performance as possible out of the tools I have, while remaining agile, keeping costs low, and keeping complexity low.

So, what’s the highest-ROI solution here? These load tests are testing the full pipeline:

[image: the full request pipeline, steps 1 through 4]

Every request that gets sent is triggering steps one through four on the homepage, which means that we’re asking the database to give us 30 posts, 70x per second. This is nothing compared to the traffic that mainstream social media websites have to contend with, but it’s a lot for Karma’s little machines, which I want to avoid upgrading for as long as possible. I’ve tuned this particular endpoint with Dapper (a Micro-ORM created by the StackOverflow team). This means the homepage is pretty far right on the human-machine spectrum:

[image: the homepage’s position near the machine end of the spectrum]

How can we painlessly improve the performance? Free caching to the rescue!
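(Aside, for technical readers: the Dapper tuning mentioned above might look roughly like the sketch below. This is not Karma’s actual code; the connection, table, and column names are assumptions, and the snippet needs a real SQL Server database behind it, so it is illustration only.)

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Data.SqlClient;   // SQL Server ADO.NET provider
using Dapper;                  // the Micro-ORM (NuGet package: Dapper)

// Illustrative model -- not Karma's actual schema.
public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime CreatedUtc { get; set; }
}

public class HomepageRepository
{
    private readonly string _connectionString;

    public HomepageRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public List<Post> GetLatestPosts()
    {
        using (var conn = new SqlConnection(_connectionString))
        {
            // With a Micro-ORM you stay in C#, but you write the SQL
            // yourself, so the database does exactly what you ask --
            // there is no ORM-generated query to second-guess.
            return conn.Query<Post>(
                @"SELECT TOP 30 Id, Title, CreatedUtc
                  FROM Posts
                  ORDER BY CreatedUtc DESC").ToList();
        }
    }
}
```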
Naive Caching

Caching allows the App Server to skip right over steps 2 and 3 by telling the App Server, “Hey App Server! Once you fetch the data from the Database Server, save it in your memory so that you can reuse it the next time someone asks for that same data!”:

[image: the pipeline with steps 2 and 3 short-circuited by a cache]

This is super effective when the content you’re loading doesn’t change much. We can test this with a single line of code:

[image: the one-line caching attribute]

This line of code does what we just described, and more. It tells the App Server to store the results in its memory for 20 seconds. It also advises the client (your browser) to store the results in the client’s memory for 20 seconds. The client doesn’t have to do this, but it may, and most modern browsers probably will. If your browser honors this advice, then repeatedly clicking the “refresh” button on your browser over a span of 20 seconds will only send a single request to the App Server. That’s a massive improvement, as it eliminates repeated requests (steps 1 and 2 above) for the same resource. Even if your browser doesn’t honor the advice, the App Server will fetch the data from the database once every 20 seconds and then return the same results to any client that requests it. This combination minimizes requests sent to the Database Server from the App Server, and it minimizes requests sent to the App Server from the browser.

Imagine that you have amnesia. The decision to cache instead of improving the query performance is sort of like asking someone what their name is and then writing it down on a piece of paper that eventually evaporates, instead of asking them over and over again while trying to improve your memory.

Let’s see what kind of impact this has on our load tests. Note to technical users: I’m using a staging slot on the same app instance for this, which is still dumb, but not as dumb as testing this in the production slot… right?

[image: load-test results with caching enabled]

Wow!!
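For technical readers: in ASP.NET MVC, the “single line of code” described above is typically an OutputCache attribute on the controller action. The controller below is an illustrative sketch, not Karma’s actual code:

```csharp
using System.Web.Mvc;
using System.Web.UI;   // for OutputCacheLocation

public class HomeController : Controller
{
    // Cache the rendered page for 20 seconds on the server, and emit a
    // Cache-Control response header advising clients and proxies to
    // cache it for the same 20 seconds.
    [OutputCache(Duration = 20, VaryByParam = "none",
                 Location = OutputCacheLocation.Any)]
    public ActionResult Index()
    {
        // Imagine the homepage query (steps 2 and 3) happening here;
        // with the attribute above, it only actually runs once every
        // 20 seconds.
        return View();
    }
}
```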
We’ve achieved the gold standard at a small scale! A 10x improvement with a single line of code! Let’s see how our slow scenario performs now:

[image: the spike-load test with caching enabled]

Another 10x improvement! Talk about high ROI. So, we can just slap this one-liner on ALL THE THINGS and call it a day, right?

[image: “cache all the things” meme]

Not so fast.

Thoughtful Caching

The previous section was titled Naive Caching because it neglects some important details. We have to think about what’s happening under the hood, and what the user experience looks like. The Karma.fm homepage shows you a list of posts, like this:

[image: the Karma.fm homepage]

When we look at the data-flow diagram again, we might notice a problem:

[image: the cached pipeline, with per-user header data highlighted]

When I request the karma.fm homepage while logged in, I expect to receive a page with header information that contains data unique to my account: my username, notification indicator, and stats. If this is being cached by the App Server, then other users who visit the homepage will receive the same header information until and unless their request triggers a database query that fetches their personal information! This is a problem. Back to the amnesia metaphor: what we’re doing now is like calling everyone by the name of whoever we asked yesterday.

A high-ROI solution is to simply use a different cache per user. If an anonymous user visits Karma, they get the anonymous-user cache. If a registered user visits Karma, they get their own cache. Technical readers: this can be configured easily enough for .NET frameworks; searching for “output cache per user [target framework]” will lead you to the appropriate implementation.

Note that this is a very easy solution, but it’s not ideal. Suppose a user visits this post’s page and sees a notification. They click on the notification icon, and the icon reverts once they visit the notifications page, reflecting the fact that there are no longer any new notifications.
But then upon revisiting this post’s page, they will see the notification icon light up again, because the header was cached along with the page content. The header is displaying stale data to the user. This can be addressed with a technique called doughnut caching, which adds another dimension to the caching mechanism: page subcomponent. With user + page subcomponent, we can tell the App Server to cache some things while fetching others, ensuring that users see what they ought to see. I’m not implementing this now, because I had a time budget of three hours to handle caching + documenting. For now, I’m pulling the cache time back to 2 seconds, since any performance-degrading stimulus is likely to come from a spike of anonymous traffic. Since anonymous users are treated like a single user from the cache’s perspective, a 2-second per-user cache will give us a natural user experience while mitigating 99% of the risk that we’re facing with sudden spikes of anonymous traffic.

Now we may cache all the things.
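For technical readers, the per-user output cache described above is commonly implemented in ASP.NET MVC with VaryByCustom. This is a sketch of that pattern, not Karma’s actual code:

```csharp
using System.Web;
using System.Web.Mvc;

// In Global.asax.cs: teach the framework what "vary by user" means.
public class MvcApplication : HttpApplication
{
    public override string GetVaryByCustomString(HttpContext context, string custom)
    {
        if (custom == "user")
        {
            // All anonymous visitors share one cache entry;
            // each signed-in user gets their own.
            return context.User.Identity.IsAuthenticated
                ? context.User.Identity.Name
                : "anonymous";
        }
        return base.GetVaryByCustomString(context, custom);
    }
}

public class HomeController : Controller
{
    // A short per-user cache: 2 seconds, as described above.
    [OutputCache(Duration = 2, VaryByCustom = "user")]
    public ActionResult Index()
    {
        return View();
    }
}
```

Doughnut caching adds the page-subcomponent dimension on top of this; in ASP.NET MVC it is usually done with a third-party package (e.g. the MvcDonutCaching NuGet package) rather than anything built into the framework.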
