Caveat: there's no announcement of any changes in this, it's all just fascinating info if you're into that sort of thing.



Layering, Sharding, let's dive in.



So, in what we call Mainline (modern game, latest expansion) we've used tech called Sharding ever since Warlords of Draenor. Once upon a time, a realm ran one copy of each zone locally on its own hardware. If you ran after someone and they crossed into another zone, you followed them; it was all running in the same place. You could kite mobs across zones, and escort quests had NPCs pathing across zone boundaries.



Problem: as hardware improved and performance improvements were made, expansion after expansion we raised the cap on the number of concurrent players we allowed logged in. If everyone wants to play on Tich, and we have the technical capacity to lower queue times, why wouldn't you? Ah, but the realm still only ran one copy of the world. Eventually the realm has so many people on it that if even a small percentage of them congregate in the same place, it's going to be way more than the CPU core that's running that zone can handle, and in comes the lag. You can multithread to your heart's content, but eventually you're going to have a single-thread bottleneck that will limit what you can do.



This all came to a head in Warlords of Draenor when, in the face of a very successful launch, the starting zones in WoD were completely unable to handle the load they faced. The sadly departed @kurtismcc (not dead, just left Blizzard...) worked for three days and nights (we don't condone that sort of behavior anymore!) and created the first version of sharding. This new tech finally allowed us to run multiple copies of Draenor and split people, and the load, between them.



With Legion this tech was made into a fully fleshed out feature and each individual zone could now run limitless copies: once a threshold of players in the zone was reached, we spun up another copy, repeat as necessary. It made the game feel less cohesive, but it meant the realm cap and zone capacity were now disconnected entirely. It also gave us the ability to have specific values for specific zones (not all zones are equal in terms of CPU cost).
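To make the "spin up another copy at a threshold" idea concrete, here's a toy sketch in Python. It is not the actual server code: the `ZoneShards` class, the fill-the-first-copy-with-room rule, and the thresholds of 1000 and 500 are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneShards:
    """All running copies (shards) of one zone. Hypothetical model."""
    threshold: int                                    # per-zone player threshold (made-up value)
    shards: list[int] = field(default_factory=list)   # player count per copy

    def add_player(self) -> int:
        """Place a player and return the index of the copy they land on."""
        for i, count in enumerate(self.shards):
            if count < self.threshold:
                self.shards[i] += 1
                return i
        # Every existing copy is at its threshold: spin up another one.
        self.shards.append(1)
        return len(self.shards) - 1


# Zones get their own thresholds because they don't cost the same CPU.
zones = {
    "Orgrimmar": ZoneShards(threshold=1000),
    "Ashenvale": ZoneShards(threshold=500),
}

for _ in range(1200):
    zones["Orgrimmar"].add_player()
    zones["Ashenvale"].add_player()

print(len(zones["Orgrimmar"].shards))  # 2 copies
print(len(zones["Ashenvale"].shards))  # 3 copies
```

The same 1200 players need two copies of one zone and three of the other, which is the "realm cap and zone capacity are disconnected" point in miniature.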



Good times. Now, along comes Classic and we have problems again.



Classic launch was heavily targeting #NoChanges, they wanted the game to feel as identical to original launch as possible. Sharding was clearly out, but there was also the strong belief that launch was going to be big (but nowhere near as big as it turned out ;)) and interest would drop off fairly quickly as people got their nostalgia hit and moved on. In that world we didn't want to have to launch hundreds of realms at the tiny realm caps we used to use and then have lots of dead realms a month later. So we had to find a way to use modern-sized realm caps but not crush the zones with players. There were also two other notable tech problems with Sharding that needed addressing to support Classic behavior. First, mobs can't path across zone boundaries in a Sharded world: each shard (copy of a zone) can be running in a completely different datacenter, so while it's technically solvable it's very expensive and complicated to try and make this work. Second, in PvP situations it was greatly desired that crossing a zone boundary shouldn't be an escape. Both people should keep chasing each other into the new zone, and in a Sharded world they could end up on different destination shards.



In steps the legendary Omar, he took sharding and started stitching the shards together into something that became known as Layers. Yes, a Layer is actually just sharding with a lot of sticky tape. By gluing these shards together it became possible to fly messages around and handle mobs moving between zones. Players were assigned a specific Layer, so if two people were in Westfall and ran into Duskwood, they would remain on the same layer and so end up in the same copy of Duskwood as each other. Hurrah, great success! This tech was required to have any hope in hell of managing the AQ gate opening, eugh, shudder. That's a whole other story.
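Here's a tiny Python sketch of why a per-player layer assignment keeps a chase going across a zone line. It's a toy model only: `pick_shard` assumes shards are chosen by "emptiest copy wins", which is a guess for illustration, and the populations are made up.

```python
def pick_shard(populations: list[int]) -> int:
    """Sharding-style placement (assumed rule): the emptiest copy of the destination zone."""
    return populations.index(min(populations))


def pick_layer_copy(assigned_layer: int) -> int:
    """Layering-style placement: the copy that belongs to the player's layer."""
    return assigned_layer


# Two players fighting in Westfall both run into Duskwood.
duskwood = [10, 10]  # population of two Duskwood copies (made-up numbers)

sharded = []
for _ in range(2):
    copy = pick_shard(duskwood)
    duskwood[copy] += 1
    sharded.append(copy)

layered = [pick_layer_copy(1), pick_layer_copy(1)]  # both assigned to layer 1

print(sharded)  # [0, 1] -> split up, the zone line became an escape
print(layered)  # [1, 1] -> same Duskwood, the chase continues
```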



Through Classic's original launch we slowly kept lowering the max number of layers, as originally we had promised Layers were temporary, but (un)fortunately you didn't stop playing Classic, it wasn't the flash in the pan some had predicted. You really, really liked Classic. The servers still had tons of players on, so fulfilling this promise was tough and it was only towards the very end of Classic we were able to run single-layer, and even then, that was a very spicy Layer.



From TBC onwards we had to embrace layering as an evergreen thing, the realms were just too big and demand was too high to think about single-layering anymore.



Now remember what Layering was designed for, trying to make a semi-permanent copy of the world that you would live on for the duration of that week? Well, people didn't play the game that way. The Asmonlayer was born, and meta-gaming meant that if the Alliance were blocking Blackrock on Layer 1, well then the Horde would all try and get to Layer 2 instead. The game evolved around Layers' existence, and while Layers were once imagined to be largely invisible to regular day-to-day play, they became very much part of the gamesmanship of Classic.



Ok, history over, what's going on now and why is Layering suddenly a hot topic in SoD?



Layering is more rigid than Sharding. We dynamically spin up layers based on load, and historically we didn't collapse layers when load fell, it was felt important once upon a time that the world feel full at peak and empty off peak, just like it was back in the day. But, like a lot of our early assumptions, that didn't match what people really wanted. They wanted to see other people (shocking in an MMO I know) they wanted to be able to form random groups with strangers, they wanted enough people for the world to feel populated regardless of the time. With SoD we for the first time enabled the tech that collapses layers as load falls so it tries to keep every layer significantly populated, that change has been *mostly* successful, it does have some behavior we'd like to change (moving you multiple times, potentially splitting up groups etc). But as a result you're now seeing yourselves "layering" more often, normally this was an opt-in thing, if you join a party with someone on another layer you expect to move, now for the first time in Classic you could be moved randomly without warning.
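A quick back-of-the-envelope sketch of what collapsing buys you, in Python. The `layers_needed` helper, the target of 2000 players per layer, and the 9000 / 2500 population figures are all invented for the example; the real system is certainly more involved than a single division.

```python
import math


def layers_needed(total_players: int, target_per_layer: int) -> int:
    """Fewest layers that keep each layer at or under its target population."""
    return max(1, math.ceil(total_players / target_per_layer))


TARGET = 2000  # hypothetical players per layer

peak_layers = layers_needed(9000, TARGET)  # layers spun up at peak
offpeak_players = 2500

# Old behaviour: layers never collapse, so the off-peak crowd is spread thin.
print(peak_layers, offpeak_players // peak_layers)  # 5 layers, 500 players each

# With collapsing: drop to what the current load needs.
collapsed = layers_needed(offpeak_players, TARGET)
print(collapsed, offpeak_players // collapsed)  # 2 layers, 1250 players each
```

Going from 5 layers to 2 means people on the three retired layers get moved, which is exactly the "layered without warning" experience described above.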



The biggest problem we are slamming into at the moment is that some early code assumptions for layering are now very problematic: all zones in a layer have the same number of players that can be in the zone. We can't, for example, say 500 people in Ashenvale and 1000 people in Orgrimmar, they all use the same value. This is a nightmare when zones perform very differently based on what players are doing in that zone (idling is less CPU than a giant PvP slugfest). Layering is also very expensive from a hardware perspective. Remember, if the entire server goes to STV during Bloodmoon, we spin up entire layers to handle the extra STV, so we may be running 16 empty copies of every other zone just because we need 16 STVs, gross. There's also a host of features that zones can't have in a Layered world that Mainline Sharding can do: Mainline can do Warmode zones that balance faction 50/50 for example, and we can have different targets for population based on PvE vs PvP.
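To put rough numbers on the hardware cost, here's a small Python comparison of how many zone copies each approach runs when only one zone is busy. The world size of 40 zones is a made-up figure, and both functions are simplified assumptions (a layer is treated as one full copy of every zone, a shard as one extra copy of a single zone).

```python
def layered_zone_copies(zones_in_world: int, copies_needed: dict[str, int]) -> int:
    """Layering: each layer is a whole world, so the busiest zone decides the count."""
    layers = max(copies_needed.values(), default=1)
    return zones_in_world * max(layers, 1)


def sharded_zone_copies(zones_in_world: int, copies_needed: dict[str, int]) -> int:
    """Sharding: each zone scales alone; zones not listed run a single copy."""
    extra = sum(n - 1 for n in copies_needed.values())
    return zones_in_world + extra


ZONES_IN_WORLD = 40            # hypothetical zone count
bloodmoon = {"STV": 16}        # the event needs 16 copies of STV

print(layered_zone_copies(ZONES_IN_WORLD, bloodmoon))  # 640 zone copies
print(sharded_zone_copies(ZONES_IN_WORLD, bloodmoon))  # 55 zone copies
```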



So right now, world PvP and layering is proving problematic and hard to balance. As always, we're working on it and good things will come. Come Monday we're going to be diving into reports of people getting split up when they were already on a Layer together (that's not expected, layering should occur on party changes only), and the Classic team already mentioned they're working on a change to keep parties together in the event of a layer change due to excessive load.



The baby just woke up so I'm going to stop jabbering on here and get back to #DadThings, happy Sunday!



Layering Part 2: Design & Engineering Clash!



Dramatic title, not actually any conflict here. So what's the deal with the increase in grumbling around layering in Season of Discovery? Because we've not actually changed anything; Classic, TBC and Wrath all ran the same tech.



The Classic team is trying to add content to Classic that "feels" like it belongs in Classic. One of those things is World PvP. Tarren Mill vs Southshore for example is iconic, it spent much of Vanilla being effectively a permanent outdoor PvP battlefield. So they created the Ashenvale event, and now the STV Bloodmoon event, to try and add some extra spice and carrot to this behaviour. Overall, when it works folks enjoy it, good stuff. But why has it been so problematic from a layering standpoint?



Three issues players have faced -

1. Lag.

2. Inability to group up on the same layer.

3. Getting removed from a layer you were on mid-fight.



Let's hit each of these, let's start with lag. The lag we're talking about here is lag in the simulation itself, i.e. you're hitting buttons but the game is taking a long time to acknowledge those pushes. This lag is caused primarily by a lack of available CPU. There are some single threads that can only run on a single core, and if that core is at 100% it can't go any faster. If there's more work than it can handle in one server tick, that normally 50ms tick is going to take 500ms and you're going to start noticing delays on things.
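The tick arithmetic, sketched in Python. The 50ms budget comes from the text above; the idea that a tick simply stretches to fit its work is a simplifying assumption, and `work_ms` is a hypothetical stand-in for everything the zone's thread has to compute in one tick.

```python
TICK_BUDGET_MS = 50.0  # the "normal" tick length


def tick_duration_ms(work_ms: float) -> float:
    """A tick can't finish before its work is done, nor run faster than the budget."""
    return max(TICK_BUDGET_MS, work_ms)


def ticks_per_second(work_ms: float) -> float:
    """How often the simulation gets to react to your button presses."""
    return 1000.0 / tick_duration_ms(work_ms)


for work in (20.0, 50.0, 500.0):
    print(work, tick_duration_ms(work), ticks_per_second(work))
# 20.0 50.0 20.0
# 50.0 50.0 20.0
# 500.0 500.0 2.0
```

Once the work goes past the budget there is no slack left on that core, so the simulation drops from 20 updates a second to 2 and every action feels delayed.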



All sorts of things contribute to this lag, players moving around, players casting abilities, mobs spawning etc, these all take CPU time to compute, the main thing we can do to control lag is usually balance the number of players in a zone because players are by far the most expensive thing for the server to handle. The worst thing for the server is lots of players on top of each other fighting, because then every Arcane Explosion has to get calculated against hundreds of players, are they in line of sight, do any of them have resists, shields, how much damage do they take, are they in range of the AoE... everything the server has to figure out.
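A rough Python illustration of why a clump of players is so much worse than the same players spread out. It assumes, purely for the sake of the example, that everyone in the clump casts one AoE per tick and that each caster-to-target check (range, line of sight, resists, shields, damage) counts as one unit of work.

```python
def aoe_checks(players_in_clump: int) -> int:
    """Caster-to-target checks if every player AoEs everyone else in the clump once."""
    return players_in_clump * (players_in_clump - 1)


for n in (10, 100, 500):
    print(n, aoe_checks(n))
# 10 90
# 100 9900
# 500 249500
```

Five times the players is roughly twenty-five times the work, which is why the player count per zone is the main dial for controlling lag.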



So, PvE fighting in a clump is expensive, PvP is really expensive. But we just made content that does exactly the thing that the server is worst at! Argh! This isn't a failure of design. Blizzard isn't driven by engineering when it comes to gameplay, generally speaking it's engineering's job to allow Design to do whatever makes the game the most fun (there are limits ofc, resources aren't infinite). So, when Ashenvale first started running into lag, the first port of call was to reduce the player targets for the zones (remember in Layering, zones all share the same cap, so this meant we were having fewer players in *all* zones). Fewer players, less work for the server, less lag. Throughout phase 1 we were able to dial in these numbers to find one that got as many players as we could into the zone without it falling over (some lag in mass-PvP is honestly expected as players want to see the sea of enemies).



2. Alright, folks can't manage to group up together and while forming a raid people are getting split up like crazy into different layers, why? The layer system has a few values, one of those is the "Target" number of people we want in a given zone, another is the "Max" number we want in a zone. Remember when you log in for the first time in a week, you're assigned to a specific Layer, that's your layer for the week. When you group up you can move to another layer but outside of that group you should return to your originally assigned layer. Once a zone reaches its "Target" value, we start ignoring that layer reservation, basically if you were assigned Layer 1 for the week and you travel to STV and the game finds Layer 1 is over its Target value, we will instead put you on a less populated STV (and layer, since zones=layers in Classic). This can be the first way that you struggle to group up with friends.



But, you're smart, so you form a party or a raid with the person you're trying to play with. At that point we will ignore the Target value for the zone because you really want to play together. You both end up on the same Layer, hurray!



Everyone does the same thing. Uh-oh, now we have a *lot* of people in the same STV, and we're going to start lagging if we can't keep the population under control, but everyone keeps pulling in more people via groups. A new value takes effect here, "Max". When we exceed the Max value for a zone, we stop caring if you have a layer reservation, and we stop caring if you're in a party with someone. The copy of this zone is full, we can't keep sending people there or it's going to fall over.



This is one of the big issues you're facing at the moment in STV. Since everyone is forming 5-man groups and usually one person arrives ahead of the others, it's running into the Max value for the zone and not allowing the rest of your group to come into the same copy of the zone. Think about it: if the zone can handle 500 people and allows them all in, and then they each invite 4 more people, ah crap, we're going to crash. So we have to have a hard limit somewhere.
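The Target and Max rules described above can be sketched as a small placement function in Python. This is a toy reading of the text, not the real logic: `ZoneCopy`, `place_player`, the "send overflow to the emptiest copy" rule, the exact `>=` boundaries, and the numbers 400 and 500 are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ZoneCopy:
    layer: int
    population: int


def place_player(copies, target, max_pop, reserved_layer, party_layer=None):
    """Return the layer whose copy of the zone the arriving player is sent to."""
    by_layer = {c.layer: c for c in copies}
    emptiest = min(copies, key=lambda c: c.population).layer

    if party_layer is not None:
        # Grouped: Target is ignored, you follow the party unless that copy is at Max.
        if by_layer[party_layer].population >= max_pop:
            return emptiest
        return party_layer

    # Solo: the weekly reservation holds until that copy reaches Target.
    if by_layer[reserved_layer].population >= target:
        return emptiest
    return reserved_layer


stv = [ZoneCopy(1, 520), ZoneCopy(2, 450), ZoneCopy(3, 100)]
TARGET, MAX = 400, 500  # hypothetical values

print(place_player(stv, TARGET, MAX, reserved_layer=3))                 # 3: reservation honoured
print(place_player(stv, TARGET, MAX, reserved_layer=2))                 # 3: over Target, moved
print(place_player(stv, TARGET, MAX, reserved_layer=3, party_layer=2))  # 2: party beats Target
print(place_player(stv, TARGET, MAX, reserved_layer=3, party_layer=1))  # 3: copy 1 is at Max, split

# Why the hard limit exists: 500 players who each pull in 4 friends.
print(500 * (1 + 4))  # 2500
```

The last `place_player` call is the STV experience in one line: the first member of the group got in, the copy then hit Max, and the rest of the party was sent elsewhere.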



3. Getting layered unexpectedly after you've managed to all get together. This is the first one that I would consider a bug and not just a bad user experience (some bad UX when the alternative is the server dying is OK). Based on what we currently know, if 5 people are all together in the zone in a group, they shouldn't get split apart after that point, but folks are reporting just that, so that's one we need to dive into and figure out what's happening. Now, the good news is Classic is already working on a change that will keep parties together (so even if you move layers, you *all* move layers together) which will mitigate this issue but there's still a behaviour here that isn't expected that we need to get smarter on.



So there you go, that's enough for a Sunday deep dive. The TL;DR? Groups are hard for the layering system to handle, and once a zone is full the UX at that point is gross. As the Classic team already mentioned, there are changes coming to help improve that, and on my end I've been noodling the Target & Max values to try and find the sweet spot between Lag and Groups.



As always, if you find this sort of thing fascinating, consider a career in Design or Engineering! Games are fun and hilariously complex beasts to work on but very rewarding.

