April 8th, 2024 × #caching #webdev #performance
Cache Ruins Everything Around Me
Discussion about cache invalidation issues when caching user-specific data, solutions like using different URLs, partial caching, edge functions, and drawbacks like flash of unstyled content.
- Talking about how cache ruins everything
- Experiencing crazy cache issues on Syntax site
- Issue was intermittently popping up
- Wrong theme being served sometimes on refresh
- Solutions for caching dynamic content issues
- Using query params for different content versions
- Not caching page but cache parts like database query
- Having cache key variants for each theme
- Edge functions to modify cached content
- Flash of unstyled content as client-side solution
Experiencing crazy cache issues on Syntax site
Wes Bos
Hey.
Wes Bos
Not too much. I was in a crazy rabbit hole with this cache stuff yesterday, and I know you've gone down the rabbit hole a couple times as well. And we finally figured out what's going on, but there's the joke that the hardest problem in computer science is cache invalidation.
Wes Bos
And I believe it, because there are so many places where you can cache things. And if you get the wrong data in the cache, it's very, very frustrating. So we thought we would explain the problem and the possible solutions that you can reach for with this type of stuff. Specifically, this is gonna be about CDNs, CDN caching, not the browser cache and not a memory cache.
Wrong theme being served sometimes on refresh
Wes Bos
Yeah. So it's really interesting, because the way that the CDN cache works is that we server render the entire page. And part of that page is it detects if you have a cookie set for what your theme is. And if there is a cookie set, it renders it out, and it says theme-level-up, and then it makes it purple. Right? And maybe not for us, but this can possibly be a security issue for people listening, because when you put user-specific data in a cache, there's a possibility that that user-specific data is going to be served up to somebody else. Basically, say we visit a single page on the Syntax website.
Wes Bos
The CDN says, oh, we don't have this page, so we better go render it. So it just renders it for whoever is trying to visit.
Wes Bos
And if that person has a theme set (it's not very common, because not a lot of people have a theme set), but if somebody goes to that specific page with a theme set, it renders out the HTML, caches that HTML with that user's settings, their theme, and then sticks it in the CDN. And then anybody else who visits that page for the next, I don't know what it is, like, 10 minutes or so, will get that cached HTML.
Wes Bos
And, again, that could be a real issue if you're putting sensitive user data in there, but ours, luckily, is just a theme thing. And I remember this popped up, like, 6 months ago, and I was like, I think it's because we're trying to cache. You can't have your cake and eat it too. You can't have user-specific dynamic rendering and caching, because we want to be able to serve it up from the cache. Right? So I thought, let's go through some of the possible solutions to this type of thing because, yeah, it can be really frustrating.
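The failure described above can be reproduced in a few lines. This is a minimal sketch, not the actual Syntax setup: `renderPage`, `NaiveCdn`, and the `/shows` path are hypothetical names, and the only assumption is a cache that keys on the URL alone.

```typescript
// Hypothetical model of the bug: a cache keyed only by URL.
type Req = { url: string; cookies: { [name: string]: string | undefined } };

// Stand-in for the server render: bakes the visitor's theme into the HTML.
function renderPage(req: Req): string {
  const theme = req.cookies["theme"] ?? "light";
  return `<html class="theme-${theme}"><body>${req.url}</body></html>`;
}

// A naive CDN: the cache key is the URL and nothing else.
class NaiveCdn {
  private cache = new Map<string, string>();

  handle(req: Req): string {
    const hit = this.cache.get(req.url);
    if (hit !== undefined) return hit; // served without looking at cookies
    const html = renderPage(req);
    this.cache.set(req.url, html);
    return html;
  }
}

const cdn = new NaiveCdn();
// The first visitor has the level-up theme, so that render is what gets cached.
const first = cdn.handle({ url: "/shows", cookies: { theme: "level-up" } });
// The second visitor has no theme cookie but still receives the first visitor's page.
const second = cdn.handle({ url: "/shows", cookies: {} });
```

With a theme the leak is cosmetic; with a name, email, or account data in the rendered HTML, the same mechanism would hand one user's data to another.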
Solutions for caching dynamic content issues
Wes Bos
It doesn't pop up locally because we're not running a CDN cache locally. We're just rerendering on every page reload, or not even every page reload, every single save. So, where else might this pop up? A lot of the time, you'll hit this if you have A/B testing, and it's gonna be cookie based. Right? So somebody visits a landing page for your website, and half the people get "the best podcast ever", and the other half get "become a better web developer". So you want to cache the page, but you also want to serve different people different content on the same URL. So A/B testing, user-selected features; themes is probably the biggest one.
Wes Bos
Geo-based items: based on their language or where they're coming from, do you wanna show euros or Canadian dollars? Do you wanna show French or English? It really depends. Images: I found that images are a really good use case for this, because you visit the same URL for an image, scott.jpeg, and a lot of the time, things like Cloudinary and imgix are going to vary the output based on the request. They can see the user agent coming in, and if that user agent supports WebP, they're gonna convert it to WebP and serve it up. Even though it has a .jpeg extension, they can still serve it up with a content type of WebP. JSON or HTML: I've run into this many times with the icanhazdadjoke URL.
Wes Bos
The endpoint is literally just icanhazdadjoke.com, and you visit it in the browser, and it gives you the HTML page. You ping it with a fetch request with the header Accept: application/json, and it's gonna give you JSON. Same URL, different content output, but both of those things should be cached.
Wes Bos
The JSON output and the HTML that's rendered out. Right? Different encodings are another one. Something supports gzip; something else doesn't support gzip, so you have to send it over without it. So this whole idea of sending different content via the same URL is called content negotiation.
Wes Bos
And I'll link up the MDN docs on what that is. And it can be a bit of a pain when you start running into caching at a CDN level, because if you need to cache more than one thing, you want it to be dynamic, but not too dynamic.
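A minimal sketch of content negotiation on the Accept header, assuming a hypothetical `respondWithJoke` handler that returns a plain object rather than a real framework response. It ignores q-values for brevity and sets `Vary: Accept` so a cache that honors the header knows the body depends on it.

```typescript
type NegotiatedResponse = {
  status: number;
  headers: { [name: string]: string };
  body: string;
};

// Hypothetical handler: same URL, JSON or HTML depending on the Accept header.
// Assumes `joke` is trusted text; real code would escape it before putting it in HTML.
function respondWithJoke(acceptHeader: string | undefined, joke: string): NegotiatedResponse {
  const wantsJson = (acceptHeader ?? "").toLowerCase().includes("application/json");
  if (wantsJson) {
    return {
      status: 200,
      headers: { "Content-Type": "application/json", "Vary": "Accept" },
      body: JSON.stringify({ joke }),
    };
  }
  return {
    status: 200,
    headers: { "Content-Type": "text/html; charset=utf-8", "Vary": "Accept" },
    body: `<p>${joke}</p>`,
  };
}

const asJson = respondWithJoke("application/json", "I used to hate facial hair, but then it grew on me.");
const asHtml = respondWithJoke("text/html,application/xhtml+xml", "I used to hate facial hair, but then it grew on me.");
```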
Using query params for different content versions
Wes Bos
Yeah. It's way easier, especially for languages. If you just put the language code, en-CA or fr-CA for French Canadian, in the URL, then it's so much easier to cache that. It's so much easier just to redirect somebody to that specific page based on their language. When you request a page, there's a header, Accept-Language, that says what your accepted languages are. And based on the language that's set in your browser, the server is able to just redirect you to different pages, and putting it in the URL bar is better. Also, you can share URLs that have your language in them, so it makes it easier. So in the example of JSON versus HTML, what most people do is they don't use the same URL. They set a query param, or they add .json to the end, or they add something to the URL. So it's an entirely different URL. You don't have to cache it.
Wes Bos
So that's probably the most ideal.
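The redirect-by-language idea can be sketched as a pure function. Everything here is an assumption for illustration: the `pickLocale` and `localizedPath` helpers, the supported locale list, and the `/fr-ca/...` URL scheme. The q-value parsing is deliberately simplified.

```typescript
// Hypothetical helper: pick the best supported locale from an Accept-Language value.
// `supported` entries are expected in lowercase, e.g. "en-ca".
function pickLocale(
  acceptLanguage: string | undefined,
  supported: readonly string[],
  fallback: string,
): string {
  if (!acceptLanguage) return fallback;
  const ranked = acceptLanguage
    .split(",")
    .map((part) => {
      const pieces = part.trim().split(";");
      const tag = (pieces[0] ?? "").trim().toLowerCase();
      const qPiece = pieces.slice(1).map((p) => p.trim()).find((p) => p.startsWith("q="));
      const q = qPiece === undefined ? 1 : Number(qPiece.slice(2));
      return { tag, q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.tag !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q); // highest preference first

  for (const { tag } of ranked) {
    const exact = supported.find((s) => s === tag);
    if (exact !== undefined) return exact;
    // Fall back to any supported locale with the same primary language ("fr" matches "fr-ca").
    const primary = tag.split("-")[0];
    const sameLanguage = supported.find((s) => s.split("-")[0] === primary);
    if (sameLanguage !== undefined) return sameLanguage;
  }
  return fallback;
}

// Builds the language-specific URL to redirect to, so each language caches under its own URL.
function localizedPath(path: string, locale: string): string {
  return `/${locale}${path.startsWith("/") ? path : `/${path}`}`;
}

const target = localizedPath("/shows", pickLocale("fr-CA,fr;q=0.9,en;q=0.8", ["en-ca", "fr-ca"], "en-ca"));
```

Because the language ends up in the URL, the CDN needs no special configuration: each localized page is simply a different cache entry.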
Wes Bos
And I asked on Twitter, what are people's strategies for content negotiation? And I said, like, what if it's a theme, or what if it's a language? And most people were like, well, if it's a language, just use different URLs. But that doesn't work for themes, because you're not gonna tell somebody to go to syntax.fm?theme=dark. No one's gonna type that in. Right? You need to just be able to visit the URL directly.
Not caching page but cache parts like database query
Wes Bos
So the next solution that you have here is: just don't cache the page.
Wes Bos
You can cache other parts of it. So if you think about it, let's take syntax.fm. What could possibly make that slow? Right. The initial request to the server could be slow. The database query where you pull the latest shows and come back could be slow. The actual rendering, taking the data and rendering it out via Svelte, could be slow.
Wes Bos
So one of the biggest parts of that is the database query, which could be slow, and you can simply take that out of the equation and say, okay, I'm not gonna cache the actual rendering of the page, but I am going to cache the database query, so that whole round trip to the database and back with all the data is not needed. And Scott, you built this, right? You stick the data in Redis if it's more than, I don't know,
Wes Bos
Yeah. Other things: lots of key-value stores are really handy for this.
Wes Bos
Deno has a key-value store, which I used recently. Cloudflare has a key-value store.
Wes Bos
And those things are really easy. What I like about Redis is that you can set an expiry on it directly, so you don't ever have to worry about deleting it or managing it. You just say, this expires after some amount of time.
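The "cache the query, not the page" idea with an expiry can be sketched without any particular store. This is a minimal in-memory stand-in, not the Redis setup described on the show: `TtlCache` and `cachedQuery` are hypothetical names, and a `Map` plays the role of the key-value store.

```typescript
// In-memory stand-in for a key-value store with per-entry expiry.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: T, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
}

// Runs the (slow) query only on a cache miss; the page itself is still rendered per request.
async function cachedQuery<T>(
  cache: TtlCache<T>,
  key: string,
  ttlMs: number,
  runQuery: () => Promise<T>,
): Promise<T> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const fresh = await runQuery();
  cache.set(key, fresh, ttlMs);
  return fresh;
}
```

Since only the data is shared between visitors, the per-user theme is applied at render time and never ends up in someone else's response.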
Wes Bos
And Redis also has, like, cache tags as well, right, where you could tag it? One kinda cool thing I really like about both TanStack Query and now Next.js, which has added it, is that you can tag your queries with something like show or show 125, and then you can say, alright, let me invalidate all of the things that have a tag of show 125, or let me invalidate everything that has a tag of show. So you can add multiple tags to it, from very broad to very specific.
Wes Bos
And that's really nice as well when you just want to say, alright, anything that has this tag, let me nuke it out.
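Tag-based invalidation can be sketched generically. This is not the API of Redis, TanStack Query, or Next.js; `TaggedCache` is a hypothetical in-memory class that only illustrates the broad-to-specific tagging described above.

```typescript
// Hypothetical in-memory cache where each entry carries one or more tags.
class TaggedCache<T> {
  private values = new Map<string, T>();
  private keysByTag = new Map<string, Set<string>>();

  set(key: string, value: T, tags: string[]): void {
    this.values.set(key, value);
    for (const tag of tags) {
      let keys = this.keysByTag.get(tag);
      if (keys === undefined) {
        keys = new Set<string>();
        this.keysByTag.set(tag, keys);
      }
      keys.add(key);
    }
  }

  get(key: string): T | undefined {
    return this.values.get(key);
  }

  // Removes every entry carrying the tag and returns how many were actually deleted.
  invalidateTag(tag: string): number {
    const keys = this.keysByTag.get(tag);
    if (keys === undefined) return 0;
    let removed = 0;
    for (const key of keys) {
      if (this.values.delete(key)) removed += 1;
    }
    this.keysByTag.delete(tag);
    return removed;
  }
}

const shows = new TaggedCache<string>();
shows.set("show:125", "Episode 125 data", ["show", "show-125"]); // broad tag plus specific tag
shows.set("show:126", "Episode 126 data", ["show", "show-126"]);
shows.set("latest", "Latest shows list", ["show"]);

const removedSpecific = shows.invalidateTag("show-125"); // nukes only episode 125
const removedBroad = shows.invalidateTag("show"); // nukes everything still tagged "show"
```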
Wes Bos
Oh, Brooklyn.
Wes Bos
Alright. The next one we have here is this idea of a cache key, which is kind of what I was looking for when we ran into this, because we have, I don't know, maybe 6 different themes.
Having cache key variants for each theme
Wes Bos
And I was like, okay, well, it's not a big deal to render out the home page for light mode, and then render it out for dark mode, and then render it out for the Level Up theme. And then you just have 3 versions of that cache, or 3 variants of that cache. And then depending on the request that comes in, you serve up one of those 3 versions. So that will significantly increase your cache size and significantly decrease your cache hits, the more options that you have.
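A cache key variant boils down to folding the theme into the key. A minimal sketch, assuming hypothetical `readCookie` and `variantCacheKey` helpers and an illustrative theme list; how a given CDN exposes this is vendor specific.

```typescript
// Illustrative theme list; the real site may have different names.
const KNOWN_THEMES = ["light", "dark", "level-up"];

// Reads one cookie value out of a raw Cookie header such as "session=abc; theme=dark".
function readCookie(cookieHeader: string | undefined, name: string): string | undefined {
  if (!cookieHeader) return undefined;
  for (const pair of cookieHeader.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue;
    if (pair.slice(0, index).trim() === name) return pair.slice(index + 1).trim();
  }
  return undefined;
}

// Builds a cache key with one variant per known theme.
// Unknown or missing values collapse to "default" so visitors cannot create unbounded variants.
function variantCacheKey(url: string, cookieHeader: string | undefined): string {
  const theme = readCookie(cookieHeader, "theme");
  const variant = theme !== undefined && KNOWN_THEMES.includes(theme) ? theme : "default";
  return `${url}::theme=${variant}`;
}

const darkKey = variantCacheKey("/shows", "session=abc; theme=dark");
const anonymousKey = variantCacheKey("/shows", undefined);
```

Normalizing the cookie against a fixed list matters: keying on the raw cookie value would let any arbitrary string create a new cache entry, which is the cache size and hit rate cost mentioned above taken to the extreme.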
Wes Bos
But it is often something that is needed. And I looked into this, and it seems it's not a standard, so it has to be implemented by each CDN individually.
Wes Bos
So there is a standard called the Vary header, but it doesn't seem very well supported.
Wes Bos
Specifically, Cloudflare and Vercel CDNs do not support this.
Edge functions to modify cached content
Wes Bos
And then Netlify has a really good blog post about this. They rolled out their own header called Netlify-Vary.
Wes Bos
And you can specifically say, alright.
Wes Bos
Here are the different variants. They must be based on things that are coming in with the request, because the browser has to send them automatically when somebody visits a URL. So what are things that are automatically sent by the browser? Well, the cookies are, the Accept-Language header is, and often the CDN can resolve the user's country code. So Netlify has this thing called the Netlify-Vary header, where you can say, alright, based on device type or based on this specific cookie, which is what we want. We say, based on the theme cookie, serve up one of these 6 different variants for this specific page.
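As a rough sketch of what "vary on the theme cookie" could look like in response headers: the `themedPageHeaders` helper is hypothetical, and the `cookie=theme` value assumes the `cookie=<name>` form described in Netlify's documentation, so check the current docs before relying on it. The 600 seconds mirrors the "10 minutes or so" mentioned earlier and is not a recommendation.

```typescript
// Hypothetical helper returning headers for a themed, CDN-cacheable page.
function themedPageHeaders(maxAgeSeconds: number): { [name: string]: string } {
  return {
    "Content-Type": "text/html; charset=utf-8",
    // Let shared caches keep the page for a limited time.
    "Cache-Control": `public, s-maxage=${maxAgeSeconds}`,
    // Assumed syntax: ask Netlify's CDN to keep a separate variant per value of the `theme` cookie.
    "Netlify-Vary": "cookie=theme",
  };
}

const pageHeaders = themedPageHeaders(600);
```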
Wes Bos
Fastly also has their own version of it, sort of like a hash for that specific page.
Wes Bos
Cloudflare does have cache keys, but they're enterprise only.
Wes Bos
And it's not even a pro plan. It's not a business plan. It's enterprise, and you gotta talk to sales. Yeah. I don't want to go to sales. And there's no chance that's gonna be cheap. Does anybody ever wanna talk to sales? I you know,
Wes Bos
So the next option is to use an edge function. I think this is a really clear example of a good use case for an edge function: maybe you don't wanna make 6 different variations of every single page. Maybe you only wanna make 1, but you still want it to be dynamic. So an edge function is a function that sits in front of your actual server, and it will intercept the request coming in. Cloudflare Workers is a good example of this. So what you could do is jump in the middle, go and fetch the regular cached page, and then parse the HTML and change the bits that you want. So we could write a little Cloudflare Worker, and I actually started doing this myself. It will simply go and fetch the page.
Wes Bos
It'll check if the theme cookie is set, and if it is set, it will just find that div and swap out the actual class name of theme-light with theme-dark or theme-century, whatever all the other themes are. And part of me is like, that's a cool use case. But part of me also is like, I don't want to add a whole layer of abstraction on top of this type of thing just to work with the cache. So that is kind of an interesting use case, and it is a pretty common use case for using an edge function or a worker in front of your application.
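The core of such an edge function is a small transformation on the cached HTML. This is a sketch under assumptions, not the worker written for the Syntax site: `applyTheme`, `readThemeCookie`, and the theme list are hypothetical, and a regex stands in for a real HTML rewriter.

```typescript
// Illustrative allowlist; only these values are ever written into the HTML.
const ALLOWED_THEMES = ["light", "dark", "level-up"];

// Pulls the theme value out of a raw Cookie header such as "session=abc; theme=dark".
function readThemeCookie(cookieHeader: string | undefined): string | undefined {
  if (!cookieHeader) return undefined;
  for (const pair of cookieHeader.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue;
    if (pair.slice(0, index).trim() === "theme") return pair.slice(index + 1).trim();
  }
  return undefined;
}

// Takes the single cached page and swaps the first theme-* class for the visitor's theme.
// Missing or unknown themes leave the cached HTML untouched.
function applyTheme(cachedHtml: string, cookieHeader: string | undefined): string {
  const theme = readThemeCookie(cookieHeader);
  if (theme === undefined || !ALLOWED_THEMES.includes(theme)) return cachedHtml;
  return cachedHtml.replace(/(class="[^"]*?)\btheme-[\w-]+/, `$1theme-${theme}`);
}

const cachedPage = '<div class="wrap theme-light">Latest shows</div>';
const themedPage = applyTheme(cachedPage, "session=abc; theme=dark");
```

In an actual worker, this function would run between fetching the cached response and returning it. Checking the cookie against an allowlist keeps arbitrary cookie content out of the markup. Cloudflare Workers also provide an HTMLRewriter API, which would be a sturdier choice than a regex for real pages.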
Flash of unstyled content as client-side solution
Wes Bos
Yeah. I think that was the part that a lot of people misunderstood: if you only switch it client side, yeah, you get this quick little flash of light mode or flash of dark mode. And maybe at some point, we'll have, like, a header that is sent along.
Wes Bos
I know CSS has light and dark mode, and it's also not that we're loading different CSS files based on that. The themes are just a bunch of variables. Right? You can load every single theme in next to nothing.
Wes Bos
But the problem is that you want the class on the HTML element so that the very first render that comes through with CSS has the correct theme, and it's not picking it up on the 2nd render, and you're getting that initial flash. It's not that big of a deal, because once you're on the page, you click the toggle, and it goes. And, actually, I was trying to see how bad the flash would be, and I had to turn off JavaScript to get the non-Svelte version, just to see it. It was so fast that I wasn't even seeing the flash. Interesting. That's as far as I've got right now. So I'm at the spot now where I've written the worker, and I'm trying to finish up just doing a client-side one. And then we'll see, because, oh, sorry, I should say, because we're using Cloudflare, we don't have the ability to do the cache key thing, which I think would probably be ideal. And I don't really care that much to switch any of this infrastructure. You know? So I think we'll try both, and we'll see which one is worth it.
Wes Bos
Guess what? Yeah. I just turned off SSR. Hey. You can do that, and it works really well. Go. Yep. Well, I should also say that it will also not flash for light and dark mode defaults.
Wes Bos
It's only when you explicitly go in and override your theme. Say, I am light mode on my computer, but I'm turning on dark mode, or I am dark mode and I'm turning on the Level Up theme. So because CSS has native support for light and dark or system, it won't flicker. It's only when you explicitly go override it, and that is a bit frustrating.
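The client-side decision can be sketched as a pure function: return a class only when the visitor has explicitly overridden the theme, and return nothing otherwise so the CSS system light/dark default applies without any flash. `themeClassOverride` and the theme list are hypothetical names for illustration.

```typescript
// Illustrative list of themes a visitor can explicitly pick.
const OVERRIDE_THEMES = ["light", "dark", "level-up"];

// Given the raw cookie string, returns the class to put on the root element,
// or null when there is no explicit override and the system default should win.
function themeClassOverride(cookieString: string): string | null {
  for (const pair of cookieString.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue;
    if (pair.slice(0, index).trim() !== "theme") continue;
    const value = pair.slice(index + 1).trim();
    return OVERRIDE_THEMES.includes(value) ? `theme-${value}` : null;
  }
  return null;
}

const overridden = themeClassOverride("session=abc; theme=level-up");
const systemDefault = themeClassOverride("session=abc");
```

In the browser, this would be called with `document.cookie` from a small script early in the page, adding the returned class to the root element. The earlier that script runs, the shorter the window in which the cached page's theme is visible.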
Wes Bos
Beautiful. Alright. Thanks for tuning in. Catch you later. Peace.