As many of you have probably heard, a new rendering backend is being worked on for Godot. One of the most common comments from potential users evaluating Godot is that, for 2D, Godot is awesome but for 3D it's pretty far from the mainstream alternatives.
For Godot 3.0 (our upcoming release) we are working hard to change this.
Our goal is to have a modern, clustered renderer that supports everything mainstream engines support, including PBR, global illumination and flexible shader editing. As always, honoring the Godot tradition, this renderer will be super easy to use and run on as many platforms as possible.
If you want to know more about what's going on, please keep an eye on our devblog.
To give more insight, here is a roadmap of the work needed for the new renderer to meet these objectives, divided into what is done and what still needs to be done:
Below are the details of each of the tasks that made up Milestone 1:
Godot 2.1 compiles for OpenGL ES 2.0 by default. This task consisted of adjusting the headers and includes to use the OpenGL ES 3.0 version.
Currently, development is done under Linux using the Mesa driver, which has full OpenGL ES 3.0 support, including #version 300 es shaders.
Questions that often arise are:
The answer to this is that the main difference between OpenGL ES 2.0 and OpenGL ES 3.0 does not lie in "what extra things you can do". Instead, it is more a matter that everything is done differently. As pretty much everything is done differently and more efficiently in OpenGL ES 3.0, making functionality optional does not make any sense. As such, ports for different platforms must be kept separate as they share little code.
Here are some examples of this:
OpenGL ES 3.0 supports many things that ES 2.0 does not, mainly integer types, integer samplers, etc. The syntax is also different, as ES 3.0 uses the more flexible concepts of in/outs (while ES 2.0 uses varying and more constants). The shaders being therefore completely different, reusing them is out of the question:
As you can see above with a piece of code from the canvas rendering shader, both OpenGL ES versions differ significantly in syntax and features.
In ES 2.0, all uniforms must be set individually, while in ES 3.0 you can use UBOs (Uniform Buffer Objects, which back the uniform blocks declared in shaders), and these are really handy. UBOs can be set once and shared between all shaders and shader variants. Without them, ES 2.0 shaders end up bigger, and their parameters must be set through several function calls.
A nice example of how UBOs make everything simpler and more efficient is in the piece of C++ code that sets up the light parameters into a shader in the OpenGL ES 2.0 backend:
And now the same version in the OpenGL ES 3.0 backend, using UBOs:
Fantastic, isn't it? In ES 2.0, the light type must be transferred into the shader by setting individual parameters (uniforms) and, if the shader changed, the setup function has to be called again.
In ES 3.0, the parameters are in a shared structure. It's only set once until the light changes, and is shared between all shader versions.
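To make the "set once, shared by every shader" idea concrete, here is a minimal sketch of the CPU side of a light UBO. It is not Godot's actual code: the LightData fields, the LightUbo helper and the upload counter are hypothetical names used only for illustration. The struct is laid out to match a std140 uniform block, and the real OpenGL ES 3.0 calls appear only in comments so the sketch compiles without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical light block (field names are illustrative, not Godot's).
// It mirrors this GLSL ES 3.0 declaration:
//   layout(std140) uniform LightData {
//       vec4 position; vec4 direction; vec4 color;
//       float attenuation; float spot_angle;
//   };
struct LightData {
    float position[4];   // xyz + unused w
    float direction[4];  // xyz + unused w
    float color[4];      // rgb + energy in w
    float attenuation;
    float spot_angle;
    float padding[2];    // std140 rounds the block size up to 16 bytes
};
static_assert(sizeof(LightData) == 64, "must match the std140 block size");
static_assert(offsetof(LightData, color) == 32, "vec4 members are 16 bytes apart");

// Uploads the block only when the light actually changed.
struct LightUbo {
    LightData cached{};
    bool valid = false;
    int uploads = 0;  // stands in for the GL upload in this sketch

    void update(const LightData &light) {
        if (valid && std::memcmp(&cached, &light, sizeof(LightData)) == 0) {
            return;  // nothing changed: every shader keeps using the buffer
        }
        cached = light;
        valid = true;
        ++uploads;
        // In a real backend, roughly:
        //   glBindBuffer(GL_UNIFORM_BUFFER, ubo);
        //   glBufferData(GL_UNIFORM_BUFFER, sizeof(LightData), &cached, GL_DYNAMIC_DRAW);
        //   glBindBufferBase(GL_UNIFORM_BUFFER, binding_point, ubo);
    }
};
```

Switching shaders does not touch the buffer at all, which is the difference from the ES 2.0 path, where every uniform has to be set again.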
Likewise, vertex arrays are hugely more efficient. In ES 2.0, each array pointer (normals, vertices, tangents, UVs, etc.) had to be set up manually for each type of geometry. In 3.0, dozens of calls are replaced by a single call to glBindVertexArray(), via VAOs (vertex array objects).
Skeletons are more or less the same in 3.0 (drawn via texture), except that they always work in hardware (ES 2.0 does not mandate vertex texture fetch).
For Blend Shapes, Transform Feedback can be used, which allows them to work using hardware acceleration.
In the ES 2.0 backend, they were both transformed using the CPU, resulting in a huge performance degradation.
OpenGL ES 3.0 provides hardware instancing (glDrawArraysInstanced), which means that MultiMesh can be drawn using a single draw call in 3.0, vs multiple calls in 2.0.
As a result, some stuff such as foliage (grass), gridmaps, etc. can get a huge performance boost.
Thanks to Transform Feedback, it is possible to process particles using the GPU. This means that tens of thousands of particles can be drawn effortlessly, with features such as collision against the static environment by capturing depth/normal maps.
More details will be documented as the above features are implemented in the next milestone.
The Image class had to be refactored for more modern data types. There is a nice devblog post explaining what was done.
This is covered in the above-mentioned devblog post too.
The rendering API itself (VisualServer class) will not change much. There is also another devblog post explaining the rationale behind it.
The aim behind this is to make it easier to understand. Too many programmers complained of the code being too packed and cryptic.
The class design of the new visual server and rasterizer is like this:
Each element is explained below:
A devblog entry about this was already written, and it should be informative enough.
Godot used to run on plenty of platforms in the past, such as:
Different platforms used different formats for storing vertex array data, and even endianness (e.g. PS3 was big-endian) would affect the format. To overcome this limitation, Godot stored this data as individual, uncompressed arrays and then converted it for each platform in the rendering backend at load time.
Nowadays, vertex data is more or less standard, and all relevant platforms are little-endian. As such, we can safely store a big chunk of binary data and a small description of where everything is.
This makes loading/saving meshes much more efficient.
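As a rough illustration of "a big chunk of binary data and a small description of where everything is", here is a minimal sketch. The MeshBlob and AttributeDesc types, the interleaved position + UV layout and the helper functions are all assumptions made for this example; they are not Godot's actual mesh format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of one vertex attribute inside the blob.
struct AttributeDesc {
    uint32_t offset;      // byte offset from the start of a vertex
    uint32_t components;  // number of floats (3 for position, 2 for UV)
};

// Hypothetical mesh: one binary chunk plus a small layout description.
struct MeshBlob {
    uint32_t stride = 0;        // bytes per vertex
    uint32_t vertex_count = 0;
    AttributeDesc position{0, 3};
    AttributeDesc uv{0, 2};
    std::vector<uint8_t> data;  // can be written to / read from disk as-is
};

// Interleaves separate position and UV arrays into a single blob.
MeshBlob pack_mesh(const std::vector<float> &positions, const std::vector<float> &uvs) {
    assert(positions.size() % 3 == 0 && uvs.size() % 2 == 0);
    assert(positions.size() / 3 == uvs.size() / 2);
    MeshBlob m;
    m.position = {0, 3};
    m.uv = {static_cast<uint32_t>(3 * sizeof(float)), 2};
    m.stride = static_cast<uint32_t>(5 * sizeof(float));
    m.vertex_count = static_cast<uint32_t>(positions.size() / 3);
    m.data.resize(static_cast<size_t>(m.stride) * m.vertex_count);
    for (uint32_t i = 0; i < m.vertex_count; ++i) {
        uint8_t *vertex = m.data.data() + static_cast<size_t>(i) * m.stride;
        std::memcpy(vertex + m.position.offset, &positions[i * 3], 3 * sizeof(float));
        std::memcpy(vertex + m.uv.offset, &uvs[i * 2], 2 * sizeof(float));
    }
    return m;
}

// Reads one float of one attribute back out of the blob.
float read_component(const MeshBlob &m, const AttributeDesc &attr, uint32_t vertex, uint32_t component) {
    assert(vertex < m.vertex_count && component < attr.components);
    float value;
    std::memcpy(&value,
                m.data.data() + static_cast<size_t>(vertex) * m.stride + attr.offset + component * sizeof(float),
                sizeof(float));
    return value;
}
```

Because the layout description travels with the data, the blob can be handed to the GPU without any per-platform conversion step, under the little-endian assumption stated above.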
The Godot 2.x scene API performed a single function call into the rasterizer for adding each element to a render list. In Godot 3.0, a whole list is passed in a single virtual function call.
This should improve performance considerably but, as a result, more data had to be exposed by the Rasterizer API. This was solved by adding a
Texture and color information edited by users exists only in the sRGB color space. This happens because monitor colors are adjusted by a gamma function, raising them to a power of roughly 2.2 or 2.4.
To make lighting more realistic, all computations must happen in a linear color space, then converted to Gamma at the end via tonemapping.
Godot 2.x already supported linear space rendering, but this was optional. In 3.0, as we are aiming for a more realistic, high-quality backend, the only supported rendering mode is linear.
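To show why the color space matters, here is a small sketch using the standard sRGB transfer functions (IEC 61966-2-1). The function names are chosen for this example and are not taken from Godot's source, and the sketch leaves out tonemapping entirely: it only demonstrates the conversion to linear space, a computation there, and the conversion back.

```cpp
#include <cassert>
#include <cmath>

// sRGB-encoded channel in [0, 1] -> linear light.
float srgb_to_linear(float c) {
    assert(c >= 0.0f && c <= 1.0f);
    return (c <= 0.04045f) ? c / 12.92f
                           : std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// Linear light in [0, 1] -> sRGB-encoded channel.
float linear_to_srgb(float c) {
    assert(c >= 0.0f && c <= 1.0f);
    return (c <= 0.0031308f) ? c * 12.92f
                             : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}

// Averages two sRGB-encoded values the way a linear-space renderer would:
// decode both, average in linear space, encode the result.
float mix_in_linear(float a_srgb, float b_srgb) {
    float mixed = 0.5f * (srgb_to_linear(a_srgb) + srgb_to_linear(b_srgb));
    return linear_to_srgb(mixed);
}
```

Averaging black and white directly on the encoded values gives 0.5, while the linear-space average encodes to roughly 0.735. Lighting math done on sRGB values is off by the same kind of error.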
Thorough investigation was carried out on more modern rendering techniques for Godot.
As a result, we decided to use the Disney PBR specification.
Godot will use a similar parameter set for materials and shaders.
The most common way to implement PBR in real-time is to use a pre-filtered cubemap for material roughness. This makes the reflected light more or less smoothed on demand:
Cubemap filtering was implemented and is working well, but it is still unclear whether this or dual paraboloid maps is the better choice. The reason is that cubemaps don't blend well between cube sides on several platforms.
A new FixedSceneMaterial resource was created, which allows editing simple materials without having to edit shaders manually. It also has the advantage of reusing shaders for similar material configurations:
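The roughness lookup into a pre-filtered cubemap can be sketched as follows. This assumes each successive mip level was filtered for a progressively higher roughness and that the mapping from roughness to mip level is linear. Both are common choices but are assumptions here, and roughness_to_mip is a hypothetical helper, not a Godot function.

```cpp
#include <algorithm>
#include <cassert>

// Maps a material roughness in [0, 1] to a (fractional) mip level of a
// pre-filtered cubemap with `mip_count` levels. Level 0 is the sharpest
// reflection; the last level is the blurriest.
float roughness_to_mip(float roughness, int mip_count) {
    assert(mip_count >= 1);
    float r = std::min(std::max(roughness, 0.0f), 1.0f);
    return r * static_cast<float>(mip_count - 1);
}
```

In a GLSL ES 3.0 shader, a value like this would typically be passed as the level-of-detail argument of textureLod() when sampling the cubemap.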
The minimum required parameters for PBR are implemented and working:
All gizmos were converted to use the new FixedSceneMaterial, as mentioned before.
Additive lighting has been added for the PBR backend (in Milestone 3, clustered lighting will be added).
Godot 2.0 used individual textures for each shadow map. In the wake of more modern techniques such as clustered rendering, all shadow maps must be contained within a single texture.
Research was done first into dynamic allocation strategies for light shadows into a shadow atlas, but nothing useful was found. Every dynamic scheme implies moving around shadowmaps if no more space is available, which incurs a considerable cost.
In the end, a more static approach will be used. The shadow atlas will be divided into 4 "Quadrants" and the user will be able to specify how they want each of them subdivided. A default subdivision should cover most use cases:
But the possibility is open for developers to tweak this subdivision for games that might look better with a different scheme.
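The quadrant scheme can be sketched like this. It assumes a square atlas whose four quadrants are each split into n x n equal cells, with n chosen per quadrant. The ShadowAtlas struct and the subdivision values used with it are hypothetical and are not Godot's defaults.

```cpp
#include <cassert>

// Hypothetical atlas description: a square texture split into 4 quadrants,
// each quadrant split into subdivisions[i] x subdivisions[i] cells.
struct ShadowAtlas {
    int size;             // atlas is size x size pixels
    int subdivisions[4];  // cells per side, one entry per quadrant
};

// Side length in pixels of one shadow map cell in the given quadrant.
int quadrant_cell_size(const ShadowAtlas &atlas, int quadrant) {
    assert(quadrant >= 0 && quadrant < 4);
    assert(atlas.subdivisions[quadrant] >= 1);
    return (atlas.size / 2) / atlas.subdivisions[quadrant];
}

// Number of shadow maps that fit in the given quadrant.
int quadrant_capacity(const ShadowAtlas &atlas, int quadrant) {
    assert(quadrant >= 0 && quadrant < 4);
    return atlas.subdivisions[quadrant] * atlas.subdivisions[quadrant];
}
```

With a 4096 pixel atlas and subdivisions of 1, 2, 4 and 8, the quadrants would hold one 2048 pixel shadow map, four 1024 pixel ones, sixteen 512 pixel ones and sixty-four 256 pixel ones respectively.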
The logic to tell which cell size must be used for which light is straightforward. Every time the camera moves, each visible light computes a "coverage" value, which represents its size on screen. For example:
average_screen_size = (screen_width + screen_height) / 2
coverage = diameter_in_screen_pixels / average_screen_size
The coverage is then a value ranging from 0 to 1. To determine which cell size must be used, the following logic applies:
desired_cell_size = nearest_power_of_2(largest_cell_size * coverage)
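The two formulas above can be put together in a short sketch. Two details are assumptions that the text does not specify: coverage is clamped to the 0 to 1 range (a light larger than the screen would otherwise exceed 1), and nearest_power_of_2 picks whichever neighbouring power of two is closer, rounding up on a tie.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Fraction of the screen covered by a light, clamped to [0, 1] (assumption).
float light_coverage(float diameter_in_screen_pixels, int screen_width, int screen_height) {
    assert(screen_width > 0 && screen_height > 0);
    float average_screen_size = static_cast<float>(screen_width + screen_height) / 2.0f;
    float coverage = diameter_in_screen_pixels / average_screen_size;
    return std::min(std::max(coverage, 0.0f), 1.0f);
}

// Closest power of two to `value`, never smaller than 1 (rounding rule is
// an assumption of this sketch).
uint32_t nearest_power_of_2(float value) {
    if (value <= 1.0f) {
        return 1;
    }
    uint32_t lower = 1;
    while (lower < (1u << 30) && static_cast<float>(lower * 2) <= value) {
        lower *= 2;
    }
    uint32_t upper = lower * 2;
    float to_lower = value - static_cast<float>(lower);
    float to_upper = static_cast<float>(upper) - value;
    return (to_lower < to_upper) ? lower : upper;
}

// Cell size a light should get, capped at the largest cell in the atlas.
uint32_t desired_cell_size(uint32_t largest_cell_size, float coverage) {
    uint32_t size = nearest_power_of_2(static_cast<float>(largest_cell_size) * coverage);
    return std::min(size, largest_cell_size);
}
```

For instance, a light with a 480 pixel on-screen diameter at 1280x720 has a coverage of 0.48. With a largest cell of 2048 pixels, that gives 983.04, which is closest to 1024.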
This has been our first progress report on the new renderer for Godot 3.0. We hope everything was clear!
If you are interested in seeing what each feature looks like in the code, you can check the gles3 branch on GitHub.