The new Nuke
On February 18, 2016, Valve released the revamped de_nuke. Alongside the map came several visual improvements to CSGO as well as a set of high-resolution textures and models.
The map is heavily detailed with both brushes and props, and Valve managed to pull off a pretty decent optimization job on it. Despite the hi-res content, which includes materials with 2 normal maps, specular maps, detail maps and phong, Valve made cheap versions of the textures as well as low-poly models of the props to be used for distant places in the unplayable parts of the map. This, coupled with Valve using hint brushes, areaportals, func_detail and nodraw, made the map run decently on a modern gaming rig. However, the open skybox and the huge amount of detail around the map were working against it on the fps side.
Many people were complaining across forums that their fps took a big hit when playing the new Nuke. Driven by scientific curiosity, I wanted to check whether the complaints were valid: did Valve do a sloppy job when optimizing Nuke, or is it down to the players’ computers being low to mid-range and not coping with the new map/CSGO changes?
The revised optimization system
Since I know “nothing” about optimization, I decided to take a look at the map.
I deleted all the hint brushes placed by Valve and rebuilt my own hint system from scratch, consisting of vertical, horizontal, and corner hints (Valve did not include corner hints). Valve did not add any outdoor areaportals; their system consisted mostly of indoor ones that separated the inside of the plant from the outside. I added a skybox brush atop the big central dome to make outdoor areaportal systems viable across the map, then added a new outdoor areaportal system. Finally, I added occluders in several places in the outside yard (Valve did not include any occluders). It was a basic optimization job (not fully tweaked) that took around 6 hours (3 hours on Sunday afternoon and 3 on Monday evening), just to test the waters.
Due to the sheer size and complexity of the map and the recent changes in the CSGO Hammer compiling tools (vrad), compiling on full final settings was out of the question. I compiled on fast vis and fast rad with “dirty” lighting; the map won’t be as refined as a full-compile one, but it should be enough to give me an idea of the efficiency of the new optimization system versus Valve’s stock one.
Vbsp = 32 sec, vvis (fast) = 2 sec, vrad (fast, dirty light) = 3 hours and 13 min.
Vrad on final would have probably taken half a day and vvis on full would have pushed for a full day (I tried full vis for the sake of it and quit after 4 hours where vis was still immobile at portalflow 4).
Here are the results of the “impromptu” study: on the left is the screenshot showcasing the stock Valve de_Nuke (shipped with the game and certainly compiled on full) while on the right is my own version, de_nuke_will2k, compiled on fast vis and fast rad. Please excuse the black shadow patches in my version as a result of dirty fast lighting from vrad (click for full resolution screenshots).
CT spawn 1
CT spawn 2
T spawn 1
T spawn 2
Below is a recap of the above figures; I also added the test system specs with CSGO graphic settings.
The localized fps increase ranged between 5 and 20 fps depending on the problematic map location. To get a rounded average of the fps of almost all the locations in the map, I used the time demo technique (refer to my paper Optimization Testing in Source Engine). For stock Nuke, the average fps was 162 while for my version, the number was 168.
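To put the two averaged figures in perspective, here is a minimal sketch that converts them to a relative gain and to frame times. The only inputs are the 162 and 168 fps averages quoted above; the function and variable names are mine and purely illustrative.

```python
# Averaged timedemo figures quoted in the article.
STOCK_FPS = 162.0    # stock Valve de_nuke
REVISED_FPS = 168.0  # de_nuke_will2k (fast vis, fast rad)


def frame_time_ms(fps):
    """Convert frames per second to milliseconds spent per frame."""
    return 1000.0 / fps


def relative_gain_percent(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


gain_fps = REVISED_FPS - STOCK_FPS
gain_pct = relative_gain_percent(STOCK_FPS, REVISED_FPS)
saved_ms = frame_time_ms(STOCK_FPS) - frame_time_ms(REVISED_FPS)

print(f"fps gain: {gain_fps:.0f} ({gain_pct:.1f}%)")
print(f"frame time saved: {saved_ms:.2f} ms per frame")
```

In other words, the 6 fps average gain is roughly a 3.7% improvement, or about 0.22 ms shaved off each frame.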
Valve did a good job optimizing Nuke, albeit not a perfect one. Optimization can still be improved, as I demonstrated above, and fps can still be increased…but not in increments as big as one might wish for.
Remember that my version is done in fast vis (basic and rough visibility calculation) and is only a basic optimization job. If I spent more time tweaking the system, adding more areaportals, hints, and occluders in more strategic places, and changing the props fade distance to more aggressive values, and then compiled on full, I would predict another 5-15 average fps increase on top of the 6 above (only a prediction that needs testing to confirm).
Nuke is the beginning and I suspect all future Valve maps (new or refurbished) will be following the same trend set by the new Nuke: hi-res content, ridiculous amount of details to render at once, open skybox with big sight lines, …
If you have a low to mid-range PC and you are struggling with Nuke’s fps, my revised optimization system will help a bit but unfortunately, it will not magically increase the fps by 100. I believe it’s about time to consider switching to a better PC; things are going to get worse from now on with newer content (assuming CSGO will remain on the current Source engine).
Edit 1 (February 29): I'm appending one of my detailed replies from below to clear up some misconceptions and confusion many people were having after reading the study. This basically deals with full vis vs fast vis.
Full vis vs fast vis
What you suggest would be unnecessary and a total waste of time. There is too much confusion in the comments posted in this thread, and these are common misconceptions that will probably need an article of their own.
I’m going to let you in on a little secret (and the viewers too); let’s call it “optimization 601 – master’s degree level class”.
If your map is tightly and well optimized (and I mean REALLY, REALLY well), then there is not much of a difference between full vis and fast vis. The gap in fps will be practically non-existent; you will get the same fps. If the fps does differ, the difference will be extremely marginal.
That’s not to say that everyone should use fast vis. If you are at a beginner or intermediate level in Source optimization, then never use fast vis, because chances are your map is not fully optimized and the fps will take a dip with fast vis (the PVS and the content rendered will be exaggerated).
However, if you are an expert in optimization and very intimate with visleaves and PVS, then fast vis is as good as full vis when it comes to testing your optimization system. Please check my 2 detailed and in-depth articles on visleaves (Demystifying Source Engine Visleaves) and PVS (Source Engine PVS - A Closer Look).
The PVS in the fast vis version will be slightly looser than the one in the full vis edition, and I emphasize slightly; we are talking here about some 4-5 additional visleaves at most. When you have taken extra care to optimize your map and kept your visleaves in check, these extra visleaves in the PVS won’t affect your fps in any considerable way. Most of the time, the fps will be exactly the same between fast and full, or slightly lower in some rare cases (a single-digit difference at most).
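The "slightly looser" PVS can be pictured as a set comparison. The sketch below is only an illustration: the visleaf IDs are made up (they do not come from any real map), and it assumes the fast vis PVS is a superset of the full vis PVS for the same leaf.

```python
# Hypothetical visleaf IDs, for illustration only.
full_vis_pvs = {101, 102, 103, 110, 111, 120}

# Assumption: fast vis is more conservative, so it keeps every leaf
# that full vis keeps and adds a handful it could not rule out.
fast_vis_pvs = full_vis_pvs | {130, 131, 132, 133, 134}


def extra_leaves(fast, full):
    """Leaves that only the looser (fast) PVS marks as potentially visible."""
    return fast - full


def is_conservative(fast, full):
    """True if the fast PVS never drops a leaf that the full PVS contains."""
    return full <= fast


extra = extra_leaves(fast_vis_pvs, full_vis_pvs)
print("extra leaves in fast vis:", sorted(extra))
print("fast PVS covers full PVS:", is_conservative(fast_vis_pvs, full_vis_pvs))
```

What matters for fps is not the count of extra leaves but how much there is to render inside them, which is where areaportals, hints, nodraw and prop fade distances come in.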
I personally do not need vvis while building my optimization systems. Vbsp creating the visleaves properly is more than enough; I can calculate the visibility myself and predict the PVS on the fly by using mat_leafvis, mat_wireframe and r_lockpvs, in addition to the portal file in Hammer. A full vis or fast vis will be the same for me once my optimization system is airtight.
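For anyone who wants to try the same in-game inspection, here is a small sketch that builds a config file toggling the three console variables named above. The helper name, the file name in the comment, and the chosen values (1 for on, 0 for off) are my assumptions for illustration; these commands are cheat-protected, hence the sv_cheats line.

```python
def build_vis_debug_cfg(enable=True):
    """Return the text of a hypothetical vis_debug.cfg for inspecting the PVS."""
    value = 1 if enable else 0
    lines = [
        "// hypothetical vis_debug.cfg (name and values are illustrative)",
        "sv_cheats 1",               # the cvars below are cheat-protected
        f"mat_leafvis {value}",      # outline the visleaf the camera is in
        f"mat_wireframe {value}",    # show what is actually being rendered
        f"r_lockpvs {value}",        # freeze the PVS so you can walk around it
    ]
    return "\n".join(lines) + "\n"


print(build_vis_debug_cfg())
```

Saving the returned text to a .cfg file and running it with exec from the console on a local server would switch the overlays on; calling the helper with enable=False produces the matching "off" config.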
Now if this newfound knowledge shook the optimization grounds beneath your feet, then allow me to ease your mind with some solid screenshots and figures to showcase and verify what I just explained.
I will use my CSGO map cs_calm to demonstrate this effect, and we will need 3 versions of it: the workshop version (March 2015) that is basically a final compile (full vis, full rad), a test version compiled with fast vis and fast rad, and a second test version compiled with fast vis but with full, final rad. These versions are identical, the only difference being the compile parameters.
In this first screenshot, on the left is the full compile version while on the right is the fast compile. The fps are the same in both versions.
Let’s check the wireframe shot to get an idea of the PVS and what’s being rendered.
You can see that the content rendered is the same. The only small difference in the PVS is the area behind the ladder (center of screenshot), where an additional 5 small visleaves are rendered in the fast version. Since my map is fully optimized and these additional visleaves have proper “hallway-end” areaportals, corner hints, nodraw, as well as an aggressive props fade distance, the content to be rendered inside them is reduced to just a couple of textured walls that have no impact on the fps.
Here is another screenshot between the full version and the second test map with fast vis but this time with full, final rad.
Again, the fps is the same (no surprise here if you understood the concept). The wireframe shot follows.
Again, the PVS and content rendered are basically the same, with the additional 5 visleaves for the fast vis; the fps is still the same.
As I said before, when your map is tightly optimized (as in “will2k’s seal of approval” optimization), it doesn’t matter whether you use full or fast vis; you will get pretty much the same fps results and can compare freely. If your map is not 100% optimized, then DO NOT use fast vis at all.
My study on Nuke is 100% valid even if I did it with fast vis; now you can see why.
Hope the above cleared things up for you and for people viewing this thread.
I might be tempted to write a new article about the above.
Note: For all the folks who are confused and have so many misconceptions about vvis, optimization, etc., please read my technical papers and articles about Source optimization in chronological order as they appear in my signature below.
(For the guys confused about hints and fast vis in the posts: vvis has nothing to do with implementing your optimization setup, that's vbsp's job. vvis simply calculates the visibility between your leaves and creates the PVS; vbsp is the one responsible for translating your hints, areaportals, nodraw, func_detail, etc. into visleaf cuts during the compile.) Again, read my papers; all of this is explained there in great detail.
EDIT 2 (April 22): Adding a link to my latest article titled Source FPS Cost of Cheap and Expensive Assets.
It should shed some more light on the effect of expensive assets on Source optimization and highlight the effect on frame rate drop that was evident in the new de_nuke (high amount of expensive props/textures in the playable area).