GX2 Release Notes

This page lists all fixes and enhancements made to the 3D Graphics APIs, related libraries, and tools. The numbers appended to each note correspond to our internal bug (and feature) tracking software. It also contains a listing of major known issues.

Please Notice

  • In SDK 2.11.06, the size of GX2SetContextState was reverted to its pre-SDK 2.11.04 size. This change was made due to unexpected compatibility issues. As a result, all middleware and applications built against SDK 2.11.04 and 2.11.05 must be rebuilt against a newer SDK. (#10775)
  • GX2 Hardware Issues are described in Cafe GPU Chip Issues.

SDK 2.13.00 [2015-xx]

  • GX2 Improvements
    • Added new APIs to schedule GPU Tasks by inserting them into the GPU command stream (#11234).
  • gshCompile Enhancements
    • Added some (incomplete) support for the GLES 3.0 shader language to shaderUtils. ES mode is activated by the gshCompile "-t essl" flag. In this mode, warnings will be issued about use of deprecated functions, and some GLSL features not supported by ESSL will produce warnings or errors. This mode is not a complete implementation of ESSL yet, so not all shaders that compile with this will compile with a real GLES implementation (and vice versa). However, it will be useful in catching some common portability problems.
    • Added a new predefined symbol CAFE which may be used to distinguish shaderUtils from other GLSL compilers.
    • Fixed an issue where shaderUtils will not output a symbol for a sampler used by the texelFetch instruction. (#7822)
    • Updated gshCompile documentation to mention that -Oall does not include -Oglsl, per a comment from SDSG. (#10836)
    • Fixed a shaderUtils bug which could cause an incorrect assert when two different shaders used the same variable name in different uniform blocks. (#4983)
    • Fixed a shaderUtils bug which could cause an assert if undefined variables were used in expressions with constants. (#8290)
    • Fixed a memory overwriting bug in shaderUtils which could cause large shaders to get an INVALID_OPERATION error even when they are valid. (#11306)
    • Fixed some other problems with large shaders causing crashes or INVALID_OPERATION errors. (#11422,#11463)
    • Changed shaderUtils not to issue a warning when converting positive integer constants to unsigned integers. (#11315)
    • Updated the documentation of uNumALUConst statistic in shaderUtils.h. (#11146)
    • Fixed the shaderUtils warning message for sampler ids > 16 sharing properties with earlier samplers; it now takes into account explicit user binding of samplers. (#11102)
    • Fixed a shaderUtils bug which could cause unused shader outputs to still be enabled. (#10915)
    • Improved the compatibility of the preprocessor used in the -Oglsl option with the "regular" preprocessor. (#11479)
    • Removed some redundant moves output by the compiler when generating code for constant array references. (#9145)
    • Fixed a shaderUtils problem with the symbol size output for elements of varying arrays. (#8059)
    • Fixed shaderUtils detection of recursive functions. These are illegal, but instead of printing an error about them the compiler would run out of memory trying to compile the recursive function; now it does report an error. (#6712)
    • Changed the shaderUtils compiler to not unroll large loops (with trip counts more than 256); this prevents some out of memory conditions and crashes in the compiler. (#6712)
    • Fixed a shaderUtils bug which could in rare cases cause bad code to be generated for conditional assignments that involved swizzling. (#7961)
    • Changed the -Osc option to enable Nintendo specific optimizations in the AMD Shader Compiler; at the moment the only such optimization is to remove redundant moves for burst exports. (Previously the -Osc option set a flag, IF_CONVERSION_HEURISTIC_OGL, which didn't actually produce different results than the default.) (#11636)
  • texUtils Bug Fixes
    • Fixed a bug in TexConv2 where saving a BC4/5 SNORM as a DDS would produce corrupt output. (#3264)
  • GX2UT Enhancements
    • Changed GX2UT shaders so that they use fixed uniform register locations. (#10995)
  • DEMO Enhancements
    • Removed call to LCHardwareIsAvailable in textureLinear2Tiled demo and replaced it with a comment explaining when locked-cache is available. (#11220)
    • Added missing call to DEMOGfxSetContextState to DEMO capture system. (#10453)
    • Added new anti-aliasing demo. See Miscellaneous Demos for details. (#10453)
    • Fixed a problem in the garden/shallowWater demo where the Y button would not cause ripples. This was due to incorrectly calling GX2UT without resetting all the appropriate uniforms. (#10995)
  • Documentation Enhancements

Major Known Issues

This is a listing of major known issues with GX2 and GPU7.

  • GX2SetShaderMode or GX2SetShaderModeEx must be called when entering or exiting compute mode. Failure to do so can result in a GPU hang. Please see GX2 Compute Mode and Resuming Rendering from Compute Mode for more details. (#10775)
  • When outputting to a 50 Hz PAL TV, there is an issue with the automatic 60-to-50 Hz conversion that can result in tearing of the display if the rendering is running far enough ahead of the display. A work-around that can prevent this is inserting into the rendering loop a call to DEMOGfxWaitForSwap (0, 5). (#10656)
  • GX2CopySurface/GX2CopySurfaceEx doesn't correctly copy small mip-map surfaces that do not have the swizzle set to 0. This occurs with 8x8, 4x4 and 2x2 mip-map levels. Please use GX2UTCopySurface and GX2UTCopySurfaceEx instead. (#8925)
  • GX2SetSemaphore(..., GX2_SEMAPHORE_SIGNAL) does not correctly increment a semaphore in GX2. The current work-around for this is to only use GX2 semaphores as binary semaphores and to use GX2SubmitUserTimeStamp (&sem64, 1UL, GX2_PIPE_EVENT_TOP, GX2_FALSE). (#6420)

For a listing of gshCompile/shaderUtils related issues see GSHCompile Known Issues.

Past Release Notes

SDK 2.12.04 [2014-09]

  • GX2 Improvements
    • Changed GX2 to print warnings to the log for certain fatal errors when on production mode systems. (#11293)
  • gshCompile Enhancements
    • Restored gx2Win.h header file which was incorrectly removed from the SDK. (#11444)
    • Fixed the offset of structure members within arrays. Previously shaderUtils was incorrectly calculating the position of some structure members in arrays of structures, causing the same offsets to be used for different members. (#11239)

SDK 2.12.03 [2014-08]

  • gshCompile Enhancements
    • Fixed an assertion in shaderUtils when compiling compute shaders. This assertion occurs when changing a "buffer" declaration to a "uniform" declaration. It now returns an appropriate error message. (#11224)
    • Fixed some additional problems with addressing structures in arrays when layout(std140) is in effect (#11103)
  • GX2UT Enhancements
    • Removed references to DEMO library from GX2UT. (#11169)

SDK 2.12.02 [2014-07]

  • gshCompile Enhancements
    • Fixed some additional problems with addressing structures in arrays when layout(std140) is in effect (#11103).
    • Fixed an issue with the experimental -Oglsl option where the modulus (%) operand was switching from integer to floating point modulus. This occurs in combination with -Ofastmath. (#11054)
    • Added missing includes to the SDK that were needed to build against shaderUtils and texUtils. (#11159)
  • DEMO Enhancements
    • Fixed DEMOTest to use OSGetTime instead of OSGetTick. (#11105)

SDK 2.12.01 [2014-06]

  • GX2 Enhancements
  • gshCompile Enhancements
    • Added new GSH2 APIs to shaderUtils for generating display lists offline for GX2Set*Shader APIs. See Offline GX2 Display List Generation for more details. (#437)
    • Removed a GX2_ASSERT that would trigger if a shader used 16 uniform blocks. The last one was reserved for the compiler, but is not currently used by it, so the assert was not strictly necessary. However, note that future versions of the compiler may require this extra space, so we recommend that developers try to keep the number of uniform blocks to 15 or fewer. (#10768)
    • Fixed a shaderUtils problem where some unused sampler symbols would still be included in the output. (#11024)
    • Fixed a problem in shaderUtils where GLSL interface blocks with "in" or "out" qualifiers would fail to compile when -Oglsl is used. (#10868)
    • The -Oall switch no longer includes -Oglsl. (#11054)
    • The -Oglsl option now works correctly with programs that use "textureSize2D" (instead of "textureSize"). (#11084)
  • Documentation
  • DEMO Enhancements
    • Added new demo geometry/baseInstance to show how to use the new GX2DrawEx2 and GX2DrawIndexedEx2 APIs (#10692)
    • Updated demos to allow customizing the asset directory from the command line. (#10788, #10810)
    • Fixed several compilation errors that occur when -Oglsl is enabled for the demos. (#10880)

SDK 2.12.00 [2014-06]

  • GX2UT Enhancements
  • GX2 Enhancements:
  • GFD Enhancements:
    • Relaxed the GFD header block size restriction to allow for future expansion of GFD header block sizes. Fixed GFDGetTexture, GFDGetGX2RTexture and GFDGetTexturePointer so that they can handle GFD files with multiple textures that have mixed and matched mipPtr/imagePtr combinations. Improved read performance when scanning for resources in a GFD file containing multiple resources (#10360).
  • gshCompile Enhancements:
    • A new, experimental optimization option (-Oglsl) is now available which enables a GLSL->GLSL optimization preprocessing pass. This performs a large number of simplifications to input vertex and fragment shaders, which can sometimes yield large performance improvements. -Oglsl does not currently have any effect for geometry or compute shaders. Note that, as with all optimizations, -Oglsl is not guaranteed to improve the code, especially on shaders already tuned for the Wii U. Sometimes it can slightly reduce shader performance, so testing of the results is suggested. By default this feature is disabled, and is not part of -Oall (it must be explicitly turned on) (#6118,#7423,#9145,#9171,#8449).
    • Added a #pragma optimize( XXX ) pragma to enable (some of) the optimizations which are normally specified on the command line or via the GSHCOMPILE_OPTIMIZE environment variable. This pragma does not work with the -Oglsl option, because that pass runs before #pragma parsing, but other options that are normally specified via -O (like -Ofastmath) may instead be specified via the #pragma (e.g. #pragma optimize(fastmath)) (#7507).
    • Fixed a problem which caused incorrect code to be generated for references to inline constant arrays (#10390).
    • Fixed an off-by-one error in shaderUtils which could cause a compile time assert when building shaders with long function signatures (#9335)
    • Fixed a shader compiler problem with arrays of structures causing incorrect UBO size to be reported (#10576).
    • Added check to shaderUtils GSH2CompileProgram3 API to ensure that the calling application has sufficient stack space for shaderUtils to work properly. (#10723)
    • Fixed an issue with shaderUtils loop unrolling where loops with control variables of type "unsigned int" were not being unrolled. (#6500)
    • Fixed a bug in shaderUtils which could generate incorrect code for vector type conversions which had a complicated swizzle. (#9478)
    • Fixed a shaderUtils bug which could cause incorrect code to be generated for array accesses to structures, if the structure has the layout(std140) attribute. (#9484)
    • Fixed a shaderUtils bug which could cause GLSL matrix constructors to generate incorrect code. (#10735)
    • Fixed shaderUtils problems with -Ounusedvar: sometimes it removed variables that it should not have, and it was not activated unless -Oloopvar was also activated. (#10798)
    • Fixed a shaderUtils bug where the -Ounusedvar optimization was not looking inside switch/case statements, causing it to incorrectly discard variables that were only used inside switch. (#10869)
    • Fixed a shaderUtils bug in the -Ounusedvar optimization option which could cause function calls with side effects to be incorrectly removed. (#10882)
  • texUtils Bug Fixes
    • Fixed a bug that resulted in a corrupt texture when converting R16G16 textures and generating mips. (#10813)
  • Documentation:
    • Added documentation for GX2UT functions GX2 Utility (GX2UT) APIs (#6602)
    • Added new GPU7 3D/2D Array documentation to texture organization page and fixed a few errors in swizzle logic for 3D textures. See 3D Textures and 2D Texture Array Organization for details. (#10296)
    • Fixed several missing variable names and incorrect tiling equations in 3D Textures and 2D Texture Array Organization. (#10871)
    • Removed graphics landing page and moved links into "Graphics" tab in main documentation. (#10296)
    • Updated GX2 formatting to match main documentation. (#10296)
    • Updated Deprecated List. (#10296)
    • Updated documentation for gshCompile and shaderUtils. (#10708)
    • Cleaned up several dead links in documentation. (#10791)
  • Demo Changes:
    • Modified shallowWater demo to use the new APIs described in GX2 Replacement APIs. This increases CPU performance. (#6602)
    • Modified the cbm/latticeDL demo to show how DL overruns can be detected. (#8303)
    • Modified the gen_cubemapGS.gs geometry shader source to use arrays of matrices instead of individual uniform variables that are later copied to an array; this change can improve performance considerably in some cases (#2012).
    • Fixed a bug in system/src/demo/gx2/misc/degamma/gammaTest.cpp where the return value of GX2GetLastFrame was not being checked (#10779)
    • Added new DEMO library flag DEMO_ASSETS_DIR. This allows the default asset directory path to be overridden in demos. The new API DEMOGfxLoadAssetFile does this automatically. Updated gsGrass, shallowWater, tesselateGS, and mipmapGen DEMOs to use this feature. (#10788)

SDK 2.11.07 [2014-05]

  • gshCompile Enhancements
    • Fixed a potential crash in gshCompile/shaderUtils that would occur if a shader is compiled that has an error and instruction statistics are requested. (#10855)

SDK 2.11.06 [2014-04]

SDK 2.11.05 [2014-04]

  • Documentation
  • texUtils Bug Fixes
    • Fixed TexUtils::GetMipSize function to use correct method of determining mip-size (#9309).
    • Fixed a bug in texUtils that would cause a failure if 'bugFix2197' was enabled and TC2ConvertTiling was called prior to calling TC2GenerateTexture. Additionally, the fix will now ignore any surfaces that have explicit tile modes specified and only override the surface tile mode if it is set to GX2_TILE_MODE_DEFAULT. (#10369)
  • Demos
    • Fixed a bug in DRC/FormatChanger demo that was failing to clear the cache for the background primitive (#9670).

SDK 2.11.04 [2014-04]

  • New Features
  • GX2 Enhancements
    • Changed default behavior of GX2R Resource Tracking in debug mode to be disabled. This improved DEBUG mode performance when GX2R resource debugging is not needed. (#7999)
    • Fixed a bug in GX2SetVertexUniformReg and GX2SetPixelUniformReg that would print an extraneous warning in DEBUG mode. (#9904)
    • Solved an issue with GX2SetClearDepthStencil that would cause corruption in depth buffers using HiZ and dimensions that are not a multiple of 8. (#9855)
    • Removed all informational prints in GX2 print messages that occur in the logs. To enable these prints increase the verbose level when invoking caferun via the -v flag. (#9809)
    • Added several run-time NDEBUG checks that verify applications are not using the GX2 in such a way that could cause instability. (#8000)
    • Renamed the GX2_SURFACE_FORMAT_TC_R10_G10_B10_A2_SNORM enum to GX2_SURFACE_FORMAT_T_R10_G10_B10_A2_SNORM because, when this format is used as a color buffer, the alpha value is not handled as expected. (#8271)
    • Fixed GX2CheckSurfaceUseVsFormat so it returns proper texture/colorbuffer use for GX2_SURFACE_FORMAT_T_R10_G10_B10_A2_SNORM format. (#8271)
    • Fixed a bug in GX2SetContextState where an internal register setting was not being restored. This may improve pixel shader performance in some cases. (#10006)
  • gshCompile Enhancements
    • Added support for Compute Shaders via the -c option. (#8525)
    • The AMD shader compiler has been updated to the most recent version. This may provide a performance improvement for some shaders. (#8308)
    • Fixed a shader compiler issue which could cause vertex shaders to have an unnecessary NOP at the end. This will provide a (very very small) performance improvement to vertex shaders. (#7894)
    • Fixed the accuracy of the value of PI used in the shader compiler. A much more accurate value is used now. (Of course any value of an irrational number like PI is an approximation, but the original approximation was good to only 4 digits; the new one is good to the full precision of floating point.) (#8573)
    • Fixed an issue where gshCompile would assert if a #define contained an unsigned integer constant (e.g. #define 2u). (#9958)
    • Changed default behavior of gshCompile regarding uniform/SSBO arrays. Symbols for the first two elements will be listed by default. Added a new flag, -no_limit_array_syms, which removes this limit and restores the previous behavior. See gshCompiler Flag Detailed Notes for more details. (#1238)
    • Fixed bug in gshCompile where GX2UniformVar array was not being filled for GS shaders. (#9963)
    • Added new GSH2CompileSetup3 structure to include four new fields that provide performance statistics for the compiled shaders back to the user. See GSHCompile Shader Statistics for more information. (#9345)
    • Fixed an off-by-one error in code for handling arrays of samplers which could cause samplers declared after an array to have an incorrect index. (#8969)
  • GFD/ShaderUtils Enhancements
  • SLConverter
  • Spark
  • Documentation
    • Updated documentation to note that of the 18 available hardware samplers, only 16 are available for use in GX2. (#9486)
    • Updated GX2 Texture API documentation with details on GPU7 hardware tiling formats and memory organization. See GX2 Texture Memory Organization for more information. (#9078)
    • Updated GX2 Texture API documentation with GPU7 cube map memory organization information. See GX2 Cubemap Memory Organization for more information. (#1862)
    • Fixed documentation in GX2 Texture Swizzle APIs page to reflect that the hardware has 32K pages of 4KB each. Removed "<=" from advanced swizzling section. (#10080)
    • Updated GX2 and Process Switching to point to the section of the main documentation that describes how to properly switch from the foreground to the background. Recommendations for GX2 process switching have been updated for better performance. (#9110)
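
To put the PI accuracy fix listed under the gshCompile enhancements above in perspective, the following minimal sketch compares the error of a 4-digit approximation with that of a single-precision value. The old constant is not stated in these notes, so 3.142f is purely a hypothetical stand-in, and pi_abs_error is an illustrative helper, not an SDK function.

```c
/* Reference value of pi, accurate to double precision. */
static const double PI_REFERENCE = 3.14159265358979323846;

/* Absolute error of a single-precision approximation of pi.
 * Illustrative helper only; not part of shaderUtils or GX2. */
double pi_abs_error(float approx)
{
    double diff = (double)approx - PI_REFERENCE;
    return diff < 0.0 ? -diff : diff;
}
```

Under these assumptions, pi_abs_error(3.142f) is on the order of 4e-4, while pi_abs_error(3.14159265358979323846f) is below 1e-6, which is the limit of what a 32-bit float can represent.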

SDK 2.11.03 [2014-01]

  • GX2 Enhancements:
    • Added performance optimization to GX2Invalidate for the case when invalidating source caches (GX2_INVALIDATE_UNIFORM_BLOCK, GX2_INVALIDATE_SHADER, GX2_INVALIDATE_ATTRIB_BUFFER, GX2_INVALIDATE_TEXTURE). This optimization only occurs when ptr = 0 and size = 0xFFFFFFFF. (#7631)
    • Added missing DEBUG build asserts for unsupported depth-buffer tiling modes GX2_TILE_MODE_LINEAR_ALIGNED and GX2_TILE_MODE_*_TILED_THICK. (#9804)
    • Added new GX2UT APIs to access hierarchical stencil buffer control registers. This allows for two levels of hierarchical stencil pretesting. See GX2 Hierarchical Stencil buffer (HiS) for more information. (#1436)
    • Resolved incorrect calculation for color buffer slice dimensions that would result in a crash in Spark when creating a linear-aligned color buffer with height < 8. (#8796)
  • Documentation
    • Updated API documentation to include detailed information about the behavior of each API including thread-safety, user heap allocations and whether the API submits GPU commands. See specific APIs for more details or the API Limitations Table. (#8292)

SDK 2.11.02 [2013-12]

  • GX2 Enhancements:
    • Changed behavior of GX2 when display lists are overrun in NDEBUG mode. If this occurs, OSPanic will be called. (#558)
    • Updated GX2Shutdown in DEBUG builds so that it will assert if GX2Shutdown is called before ending a display list. (#8771)
  • gshCompile Enhancements:
    • The shader compiler now notices the case where more than 16 samplers are used, and warns that shader properties use only the bottom 4 bits of the sampler ID (so although samplers 0 and 16 may have different data, they must have the same properties such as filter type). (#9695)
    • A #pragma warning( disable: XXX ) pragma has been implemented to turn off warnings from the shader compiler. (#9695)
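
The sampler warning described above follows from the statement that shader properties use only the bottom 4 bits of the sampler ID. A minimal sketch of that masking is shown here; the function names are hypothetical and are not part of GX2 or shaderUtils.

```c
#include <stdint.h>

/* Property slot selected by a sampler ID, assuming (per the note above)
 * that only the bottom 4 bits of the ID are used for properties. */
uint32_t sampler_property_slot(uint32_t sampler_id)
{
    return sampler_id & 0xFu;
}

/* Nonzero when two distinct sampler IDs map to the same property slot,
 * which is the situation the compiler warning reports. */
int samplers_share_properties(uint32_t a, uint32_t b)
{
    return a != b && sampler_property_slot(a) == sampler_property_slot(b);
}
```

With this model, samplers 0 and 16 (and likewise 1 and 17) land in the same slot, so they may hold different data but must use the same properties such as filter type.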

SDK 2.11.01 [2013-11]

  • GX2 Bug Fixes:
  • gshCompile Enhancements
    • Added -Ounusedvars optimizer option to gshCompile to remove unused variable assignments (Command Line Arguments). (#3751)
    • Fixed an issue with sampler2DArrayShadow in vertex shaders where it would incorrectly use the x component as the layer to the array. (#9556)
  • texUtils Bug Fixes
    • Fixed a bug in TexUtils that caused a failure when converting from R16_G16_UNORM to BC5. (#9412)
  • SLConverter
  • Demos
    • Added NERDs Mipmap Generation Demo to texture demos (Texture Demos) (#9319)
  • Documentation
    • Fixed documentation issues with MTX library (MTX Overview). (#9148)
    • Added documentation on GPU/CPU Cache Synchronization using GX2Invalidate (Cache Synchronization). (#9472)
    • Updated GX2R FAQ about extra memory required for resource tracker in DEBUG mode (FAQ). (#7532)
    • Clarified vertex attribute/stream limits and increased allowed number of texture samplers to hardware maximum of 18 (GPU7 Resource Limits). (#9486)

SDK 2.11.00 [2013-10]

  • gshCompile Enhancements
    • Fixed a bug which could sometimes cause the compiler to hang while compiling shaders which have arrays in uniform blocks. (#9213)
    • Fixed a vertex shader compiler bug which could cause samplers which are explicitly bound to a specific location (with layout) to be omitted from the symbol table. (#8782)
    • Fixed a bug which could cause the shader compiler to crash with an assert when complex arrays of structs are used in a shader. (#9308)
  • Texture Converter Enhancements (TexConv2Page)
    • Fix for alignment/size calculation for DXT compressed textures. The fix is currently disabled by default, but can be enabled. In TexConv2, specify the flag "-fix2197". If using texUtils directly, pass the TC2Config structure to TC2Initialize with 'bugFix2197' enabled. For runtime, developers can #define GX2_ENABLE_FIX2197 prior to #include "cafe/gx2.h" to get the fix globally. To enable the fix on a per-instance basis, prior to calling GX2CalcSurfaceSizeAndAlignment on a particular surface, change its tileMode from GX2_TILE_MODE_DEFAULT to GX2_TILE_MODE_DEFAULT_FIX2197. (#2197)
    • Errors in TexConv2 resulting from file access issues should now print more specific errors in the console. (#8455)
    • Fixed issue with ddsReader that corrupted cube-maps with mipmaps during save. (#2983)
    • Fixed a bug with texUtils that caused corruption when converting a floating-point texture with values exceeding [0,1] to a non-floating-point format. Any values outside of the range will now be clamped. (#3265)
    • Fixed an issue with texUtils that failed to account for surface tile-mode being adjusted during conversion, mainly affecting 3D surfaces. (#8434)
    • Fixed a bug in TexConv2 that caused conversions to fail if the surface format explicitly specified is the same as the source texture. (#2493)
    • Fixed a bug in texUtils that caused RGBA channels to be ordered incorrectly when converting RGBA1010102 textures. (#3078)
    • Fixed a bug with texUtils that was truncating floating point channel values when converting to integer, rather than rounding to the nearest. (#3200)
    • Fixed issue in texUtils that caused size calculations of very large surfaces to return as 0, due to overflowing 32-bit range. (#3426, #3427)
  • SPARK Enhancements
    • Fixed bug in Spark that would cause a second capture to crash if capture memory were exceeded. (#9211)
  • Documentation
    • Added documentation for HiZ and PreZ (Depth Buffer Optimizations and Pitfalls) (#7427)
    • Added note about how each texture format maps to texture cache line to GX2 Performance Tuning page. (#2330)
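
Three of the texUtils fixes above (clamping out-of-range values, rounding instead of truncating, and 32-bit overflow in size calculations) describe general numeric pitfalls. The sketch below illustrates them in isolation. It is not the texUtils implementation: the function names, the 8-bit UNORM target, and the width * height * depth * bytes-per-pixel size formula are all assumptions chosen for illustration.

```c
#include <stdint.h>

/* Convert a float channel to 8-bit UNORM: clamp to [0,1], then round to
 * the nearest integer. Values outside the range are clamped. */
uint8_t float_to_unorm8(float v)
{
    if (!(v > 0.0f)) return 0;        /* negative values and NaN clamp to 0 */
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f + 0.5f);
}

/* Same conversion, but truncating; shown only to contrast with rounding. */
uint8_t float_to_unorm8_truncating(float v)
{
    if (!(v > 0.0f)) return 0;
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f);
}

/* Size computed entirely in 32 bits: wraps around for very large surfaces. */
uint32_t surface_bytes32(uint32_t w, uint32_t h, uint32_t d, uint32_t bpp)
{
    return w * h * d * bpp;
}

/* Size computed in 64 bits: widen before multiplying to avoid the wrap. */
uint64_t surface_bytes64(uint32_t w, uint32_t h, uint32_t d, uint32_t bpp)
{
    return (uint64_t)w * h * d * bpp;
}
```

For example, a channel value of 0.5 becomes 128 when rounded but 127 when truncated, and a hypothetical 16384 x 16384 x 4 surface at 4 bytes per pixel is exactly 2^32 bytes, which a 32-bit calculation reports as 0.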

SDK 2.10.01 [2013-07]

  • Known Issue:
    • GX2Set*SamplerBorderColor does not perform a necessary flush before setting the relevant color registers. As a result, the registers may be updated at the wrong time, resulting in incorrect colors being rendered. A work-around is to call an API such as GX2SetGeometryShaderInputRingBuffer prior to setting the border color. This API contains the needed flush; it is not necessary to use GS mode when calling this API for this purpose.
  • GX2 Bug Fixes:
    • Fixed bug where GX2ReadUserTimeStamp zeroes out the top 32 bits of the timestamp. This bug was introduced in SDK 2.09.06. (#8319)
    • Fixed bug where GX2 would falsely detect a GPU hang (see GX2SetMiscParam) when returning from the Home Button Menu causing some titles to hang (#8496)
    • GX2_PERF_F32_[VS|GS|PS]_ALU_TEX_RATIO (GX2PerfMetric) counters have been corrected to be pure ratios, removing the incorrect multiplication by 100.0. This fix is only applied to applications built with SDK 2.10.01 or later. (#6890)
  • Texture Converter Enhancements (GX2 Tool: Texture Converter 2)
    • Fixed issues with ddsReader that caused failures when attempting to convert from GTX back to DDS with certain formats. (#2578)
    • Fixed a bug with texUtils that caused corruption when converting from a signed BC4/5 format. (#3264)
    • Fixed an issue in texUtils that results in a crash when attempting to combine multiple 3D-textures as mip-levels into a single texture. TC2CombineAsMips now correctly calculates the appropriate depth for a given mip level. (#8252)
  • Documentation/Demos
    • Changed allocator used for GX2PerfInit in the perf/benchmark demo to use a dynamic allocator within that fixed buffer size. (#3903)
    • Added note about stream out strides to GX2StreamOutBuffer (#7471)

SDK 2.10.00 [2013-06]

  • Shader Compiler Enhancements:
    • Spark 1.1 Shader Features: Added support to generate auxiliary shader symbol files used by Spark to provide enhanced shader support features which includes shader source code display and shader disassembly which incorporates symbolic information for readability. (#6880)
      • A non-intrusive 16 byte ID is appended to all shader binaries generated by gshCompile.exe by default.
      • Added gshCompile option -nospark, which does NOT create Spark shader symbol files and does NOT append the 16-byte ID to the end of shader binaries.
      • Added gshCompile option -sparkdir to override the default output directory (C:/Users/username/AppData/Local/Temp/Nintendo/Cafe/ShaderSymbols).
      • Added new entry GSH2CompileProgram2, which enables enhanced shader support in Spark. Game developers using ShaderUtils.dll directly are encouraged to migrate to this new API.
      • See Spark 1.1 documentation for more details
    • Added first pass of optimizations to gshCompile via the -O flag and the GSHCOMPILE_OPTIMIZE environment variable. Warning: Early adopters found that enabling optimizations across all shaders had varying results. Some draw calls improved, while others did not. At this time, we recommend only enabling these optimizations on specific shaders after a performance improvement has been verified. The Spark team is working on a feature to make this experimentation easier. The optimizations allowed are a comma separated list of one or more of the following. For example "-Oall,no-fastmath" would enable all optimizations except for the fast math one. The same result could be obtained by setting GSHCOMPILE_OPTIMIZE to "all,no-fastmath". Optimizations will remain off by default until they are proved reliable. See the gshCompile documentation for full details. (#7424, #6872, #1803)
      • loopvar: enable detection of loop variable initialization outside of for loops
      • loopexpr: enable unrolling of loops with complicated expressions inside them
      • const: enable some constant propagation optimizations
      • fastmath: enable optimizations which may violate strict IEEE floating point semantics
      • sc: enable some additional optimizations in the AMD shader compiler
      • all: enable all optimizations
      • no-*: disable any of the optimizations above
    • The strings in GSH2CompileOutput (which contain debug info, warning, and errors) are now cleared on each call to GSH2Compile. Previously the new output was appended. Some developers chose to create and shutdown a new handle for each shader compiled to work around this inconvenience, which has a noticeable performance impact. With this fix, a single handle can be re-used for multiple calls to GSH2Compile. (#2042)
    • The shader compiler now allows uniform bindings (Explicit Hardware Location Binding) to be omitted if the GL_EXT_Cafe extension is enabled. Uniforms that are not explicitly bound will be assigned to free slots automatically by the compiler. (#4122)
  • Texture Converter Enhancements
    • Fixed an issue in texUtils that caused corruption when generating mipmaps for BC5 textures during conversion. (#3870)
    • Fixed an issue in ddsReader that prevented BC4/5 textures from being loaded due to having a BC4/5 signature instead of ATI1/2N. ddsReader now accepts both signatures. (#3870)
    • Fixed an issue in texUtils where certain conversions going through ATICompress_ConvertSurfaceFormat were swapping the source image data's 'R' and 'B' channels, without swapping them back afterwards. (#5969)
    • Fixed an issue in texUtils that was causing mipmap calculation to fail in Cube/Volume textures with 1 mip-level. This was due to mip-level access not accounting for textures with multiple slices. (#6993)
  • GX2 Enhancements:
  • Other Enhancements:
    • Added SLConverter (an HLSL->GLSL conversion tool) to the SDK. This was previously released as a separate package. See GX2 Tool Overview. (#7517)
    • Fixed up demo makefile dependencies to properly build assets only when needed (#6778, #7928)
    • Updated DMAE_DEFAULT_TIMEOUT_IN_MILLISEC to a more reasonable value (#8063)
  • Documentation
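
The comma-separated optimization list accepted by -O and GSHCOMPILE_OPTIMIZE in the Shader Compiler Enhancements above (for example "all,no-fastmath") can be modeled with a small parser. This is only a sketch of the described semantics, not gshCompile's actual code: the flag constants and function names are hypothetical, entries are assumed to apply left to right, and unknown names are simply ignored.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical bit flags, one per optimization named in the list above. */
enum {
    OPT_LOOPVAR  = 1 << 0,
    OPT_LOOPEXPR = 1 << 1,
    OPT_CONST    = 1 << 2,
    OPT_FASTMATH = 1 << 3,
    OPT_SC       = 1 << 4,
    OPT_ALL      = OPT_LOOPVAR | OPT_LOOPEXPR | OPT_CONST | OPT_FASTMATH | OPT_SC
};

/* Map one token of length len (not NUL-terminated) to its flag; 0 if unknown. */
static uint32_t opt_flag(const char *tok, size_t len)
{
    static const struct { const char *name; uint32_t flag; } table[] = {
        { "loopvar",  OPT_LOOPVAR  },
        { "loopexpr", OPT_LOOPEXPR },
        { "const",    OPT_CONST    },
        { "fastmath", OPT_FASTMATH },
        { "sc",       OPT_SC       },
        { "all",      OPT_ALL      },
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (strlen(table[i].name) == len && memcmp(table[i].name, tok, len) == 0)
            return table[i].flag;
    }
    return 0;
}

/* Parse a list such as "all,no-fastmath". A "no-" prefix clears the named
 * optimization; any other entry sets it. Entries apply left to right. */
uint32_t parse_opt_list(const char *list)
{
    uint32_t enabled = 0;
    const char *p = list;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");          /* length of the current entry */
        if (len > 3 && memcmp(p, "no-", 3) == 0)
            enabled &= ~opt_flag(p + 3, len - 3);
        else
            enabled |= opt_flag(p, len);
        p += len;
        if (*p == ',')
            ++p;                               /* skip the separator */
    }
    return enabled;
}
```

Under these assumptions, parse_opt_list("all,no-fastmath") yields every flag except OPT_FASTMATH, matching the example given in the note.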

SDK 2.09.10 [2013-04]

SDK 2.09.07 [2013-03]

  • Breaking Changes:
    • We reverted the breaking change from SDK 2.09.06, as some developers still require 32-bit tools. Please skip SDK 2.09.06 to avoid breaking changes in tool paths. (#6956)
  • Bug Fixes:
    • The breaking change from SDK 2.09.03 to modify shaderUtils and texUtils for C-compatibility was corrected. There was a mismatch between the headers and the binaries due to incorrect build procedures. (#1723)
  • Enhancements:
    • GX2_MAX_SAMPLERS was increased from 16 to 18 to reflect the actual hardware limit. (#6051)
    • Added assert when GX2DrawDone is called from a display list. (#7393)

SDK 2.09.06 [2013-03]

  • Breaking Change:
    • Most 32-bit tools have been removed. Graphics tools now take the form toolname*64.exe (e.g. TexConv2.exe is now TexConv264.exe); gfx binaries are now located in system/bin/tool, gfx libraries are now located in system/lib/tool. (#6956)
  • Enhancements:
    • DMAE alignment restrictions for copy & fill operations are reduced from 8 bytes to 4 bytes to match hardware requirements. (#7205)
  • Shader Compiler Enhancements
    • Enable shaderUtils work-around for bug #4122 by allowing any constant expression in layout(binding=xxx). Previously only constant integers were allowed. (#4122)
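
The relaxed DMAE alignment restriction noted under Enhancements above (4 bytes instead of 8) can be checked on the application side before issuing a copy or fill. The helper below is a hypothetical illustration and not a DMAE API; it assumes the alignment is a power of two.

```c
#include <stdint.h>

/* Nonzero when addr is a multiple of alignment.
 * alignment must be a power of two (for example 4 or 8). */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    return (addr & (alignment - 1u)) == 0;
}
```

For instance, an address ending in 0x4 passes is_aligned(addr, 4) but fails is_aligned(addr, 8), so a buffer at such an address satisfies the 4-byte rule but would not have satisfied the earlier 8-byte rule.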

SDK 2.09.05 [2013-02]

  • Bug Fixes:
  • Shader Compiler Enhancements
    • Fixed an internal bug which was causing use of gl_FragCoord to sometimes cause an assert. gl_FragCoord should be usable now. (#4541)

SDK 2.09.04 [2013-02]

  • Enhancements:
    • Updated demos to properly deal with arbitrary frame buffer sizes. (#2057)
  • Shader Compiler Enhancements:
    • The shader compiler now removes unused sampler symbols, even if they are declared uniform. (#5429)

SDK 2.09.03 [2013-02]

  • Breaking Change:
    • Functions and structures in shaderUtils and texUtils were modified (bools and namespaces were removed) in order to be C-compatible and to compile under GHS. Please update your tools to the newest dlls at your earliest convenience. Until then, GX2 is compatible with the tools from SDK 2.09.03. (#1723)
  • Enhancements:
  • Documentation:
    • Many minor updates and corrections were made throughout the documentation. (#3010)
    • Added clarification for restrictions on use of GX2SetTVBuffer with DRC camera. (#6720)

SDK 2.09.02 [2013-01]

  • Enhancements:
  • Bug Fixes:
    • Added additional coverage to the WPG corruption bug work-around. This may increase the CPU cost of GX2 commands. We expect the overall impact to be small (< 1%). Please contact support if significant performance drops are detected. (#6442)
    • Updated GX2InitDepthStencilControlReg() to correctly initialize all associated registers. Some elements of the reg structure were uninitialized. (#5940)
    • Updated GX2UTCopySurfaceRect to adjust the viewport and scissor according to miplevels. (#6494)
  • Documentation:
  • Shader Compiler Bug Fixes:
    • The shader compiler now correctly marks uniform blocks as used when they are only referred to in return statements. (#6672)
    • Fixed evaluation of the clamp, min, and max operators in shader programs when the arguments were constants with different types; previously these were promoted to double, which often was incorrect, now the default type is float unless one of the arguments is a double. (#6833)
    • Fixed the parsing code for the output path for dump files produced by the -d option so that paths containing "." or ".." work correctly. (#2128)

SDK 2.09.00 [2012-12]

  • Enhancements:
    • New APIs to deal with GPU hang detection and GPU reset have been added. It is also possible to configure automatic GPU hang detection and reset. For details, please refer to the APIs GX2SetMiscParam, GX2GetMiscParam, and GX2ResetGPU. (#2436, #4691)
  • Shader Compiler Bug Fixes:
    • The shaderUtils library could crash under 64 bit versions of Windows if addresses above 4GB were needed. (#6050)
    • Several bugs related to array accesses in shaders have been fixed. For example, accessing an array of constants works correctly now, and array accesses using non-constant integers which would sometimes produce bad code should work properly in this release. (#2447, 2982, 4611, 4501)
    • The shader compiler could sometimes output incorrect code for the "else" clause of an if statement with an integer not-equals comparison. This should be fixed now. (#5346)
    • Some incorrect asserts in the shader compiler (for example when using gl_PointCoord) have been fixed. (#5858, 6209)
    • Shader compiler expressions that use log10() should no longer produce an assert. (#6343)
    • Error strings returned by shaderUtils are now properly terminated with a 0 character (previously they could contain some garbage characters at the end). (#6330)

SDK 2.08.10 [2012-11]

  • Bug Fixes:
    • The system crashed if the power button was pressed while running some geometry shader demos. This happened because they called GX2RDestroyBuffer on a MEM1 surface during shutdown (i.e., when in the background). The function touches MEM1, which is illegal when an app is in the background. To avoid this issue, please use GX2RDestroyBufferEx with the GX2R_OPTION_NO_TOUCH_DESTROY option. The demos were corrected. Please audit all software to ensure that GX2RDestroyBuffer is not called on MEM1 surfaces during shutdown. (#6143)

SDK 2.08.04 [2012-10]

  • Bug Fixes:
    • Write-Gather corruption issue: We have implemented a lower-cost work-around for this issue. GX2 APIs should be closer in performance to what they were for SDK 2.08.02 and previous. While we are not 100% assured that the bug is prevented, the odds are greatly reduced. If you encounter a hang that results in a TCL command buffer dump, please send it to us so we can analyze it for signs of this problem. Work on this issue is continuing. (#4685, #5844)

SDK 2.08.03 [2012-10]

  • Bug Fixes:
    • Write-Gather corruption issue: Under certain rare circumstances, GX2 commands written to a command buffer or display list could become corrupted by the CPU. When this happens, it typically causes a GPU hang, though graphical corruption is also possible. We have implemented a work-around to prevent any corruption by modifying the GX2 APIs. Unfortunately, this does result in some increase in CPU costs for most GX2 APIs. We are still working on this issue and will likely release a version with improved performance in the near future. (#4685, #5844)
  • Warnings:
    • Setting GX2SetSwapInterval() to 0 should only be used for debugging. A swap interval of 0 is supported for immediate flipping. Use of this setting will cause tearing, and is also not compatible with PAL output. Do not set GX2SetSwapInterval() to 0 in game submissions. (#6156)

SDK 2.08.02 [2012-10]

  • Shader Compiler Enhancements:
    • The shader compiler now correctly propagates the flat interpolation keyword from an interface to its members.
    • gshCompile now supports #include. This support is still somewhat preliminary, and may have issues. It is implemented as a pass before the rest of the preprocessor, so errors are reported with #error directives. Includes may be nested up to 24 deep. The support is only in gshCompile; other programs using shaderUtils must implement #include themselves.

SDK 2.08.00 [2012-09]

  • Known Issue:
    • Even if you are doing Z-only rendering, it is still necessary to call GX2SetColorBuffer for render target 0 in order to make sure that the AA registers are set up properly. For Z-only rendering, the colorbuffer used here does not need to have a proper image buffer associated with it. We will find a better solution for this issue in a future SDK. (#3895)
  • Bug Fixes:
    • The Known Issue in SDK 2.07.03 is fixed. It is no longer necessary to do the mentioned GX2Invalidate when changing shader modes, since this call is now included in the relevant APIs. (#4336)
    • The use of GX2InitSamplerFilterAdjust has been made workable by properly setting a related variable in the texture setup. (#4956)
    • The high-level perf API assumed an incorrect number of counters for certain hardware units (CP, TD). This has been fixed. (#5456)
    • When using an analog PAL (576i) TV, there was a bug in the 60/50 pulldown algorithm when using a swap interval > 1. This has been fixed. (#5478)
    • Fixed issue in matrix library that caused the result to become nan when specifying a zero vector to PSVecMag. (#3263)
  • Texture Converter Tools (GX2 Tool: Texture Converter 2):
    • Fixed issue with decoding of TGA files that could corrupt the final pixel. This only happened if additional data followed the image data. (#1696)

SDK 2.07.03 P2 [2012-09]

  • Enhancements:
    • Added GX2R option flag GX2R_OPTION_NO_TOUCH_DESTROY to use with GX2RDestroyBufferEx when destroying MEM1 resources while in the background. This allows the resource to be destroyed without touching the memory that was allocated. (#5330)

SDK 2.07.03 P1 [2012-09]

  • Bug Fixes:
    • Fixed an assert in shaderUtils which could incorrectly be triggered by linking three shaders which use the same uniform arrays. (#5051)
  • Other:
    • Reset some internal GX2 variables upon foreground release to help detect improper GX2 usage during background running. Such use should now result in an immediate exception (invalid access of 0x00000000). (#4882)
    • The API name GX2SetRasterizerClipControlHalfZ is deprecated. Please use GX2SetRasterizerClipControlEx instead. (#3842)

SDK 2.07.03 [2012-08]

  • Known Issue:
    • When changing the shader mode to either GX2_SHADER_MODE_UNIFORM_BLOCK or GX2_SHADER_MODE_GEOMETRY_SHADER, either by using GX2SetShaderMode or with GX2SetContextState, draws may not work right unless you invalidate the shaders using GX2Invalidate(GX2_INVALIDATE_SHADER, 0, 0xffffffff) (after changing the mode). This will be fixed in a future SDK. (#4336)
  • Bug Fixes:
    • Fixed the assert in GX2CopySurface that checks for scaling to check the size of the mips being copied rather than the base surface size. (#4160)
    • Fixed the sample locations used in GX2 surface conversion operations (UBM internal lib) to match GX2's. (#4799)
  • Other:
    • The target returned by GX2GetLastFrame will always be in format GX2_SURFACE_FORMAT_TCS_R8_G8_B8_A8_SRGB. (#5171)

SDK 2.07.02 [2012-08]

  • Bug Fixes:
    • Fixed an issue where a GX2 "wait" semaphore could bypass a GX2 "signal" semaphore that was sent just ahead of it, possibly causing a hang. (#2512)
    • Fixed an issue where if you called the sequence GX2Init/GX2Shutdown/GX2Init, using a swap interval > 1 did not work. (#3415)
    • Fixed a bug with texture tile apertures; they now work with non-zero swizzle values and non-zero slice values. (#3866, #4322)
    • Fixed a bug where GPU hang dumps were corrupting memory (#4420)
    • The valid gamma adjustment range (GX2SetTVGamma) has been limited to [0.7-1.3], and the documentation was corrected to match. Please refer to the gamma documentation for more details. (#4610)
    • Fixed an issue where GX2 tile apertures <= 256 bytes did not work.
  • GX2 Perf API Audit: (#2986)
    • Various bug fixes for the high-level perf API.
    • Added functions to decode perf counter and stat names.
    • Modified low-level perf API to allow separation of counter configuration and results buffer.
    • The low-level perf API now properly combines results for multi-unit counters and also provides access to all pipeline stats.
    • Added low-level perf APIs to retrieve counter settings and size of result buffer.
  • Enhancements:

SDK 2.07.01 [2012-08]

  • Known Issue:
    • The function GX2CalcSurfaceSizeAndAlignment will choose an incorrect tile mode for BCx compressed textures below the size of 128x64 pixels. This also affects the texture converter. For BCx textures below the size of 128x64 pixels, please specify a tile mode of GX2_TILE_MODE_1D_TILED_THIN1 instead of GX2_TILE_MODE_DEFAULT. This will be fixed in a future SDK.
  • GX2 core choice:
    • We discovered and fixed a bug that would occur if GX2 were initialized and run on a core other than the default core 1. Running GX2 on the non-default core is not well tested. If you do run GX2 on the non-default core, please let us know right away if you see any issues while switching to HBM or during game startup/shutdown. (#4527)
  • GX2 Memory Management:
    • All GX2-related memory alloc/free calls are now redirected through common vectors that can be set by calling GX2SetDefaultAllocator. This mainly affects GX2R and GX2 debugger/capture usage, as the base-level GX2 does no allocation. (#2395)
  • Shader Compiler:
    • If gshCompile crashes due to an assert, it now prints a specific message on the terminal instead of a generic compile failure. (#2131)
    • Fixed a bug which could cause a shader compiler crash. The parser was failing to check that "varying in" geometry shader inputs were arrays with the correct number of entries, so illegal input was getting through to later stages of the compiler, where it caused an assertion and a crash with an unhelpful error message. (#1836)
    • Fixed a bug which could cause interpolation types of varying uniform block members to be output in the wrong order (alphabetically instead of register order). (#2404)
    • Fixed a shader compiler bug which caused the dFdy intrinsic to always return 0. (#3295)
    • Fixed a gshCompile bug which could cause the clamp() intrinsic to produce incorrect results on integer arguments. (#3448)
  • Other:

SDK 2.06 [2012-06]

  • Enhancements:
  • Bug Fixes:
    • Fixed a problem that prevented GX2ResolveAAColorBuffer from working with the formats GX2_SURFACE_FORMAT_TCD_R16_UNORM and GX2_SURFACE_FORMAT_TCD_R32_FLOAT. (#2461)
    • Shader Compiler:
      • Fixed some crashes that could be caused by use or declaration of variables in the vertex shader that were not referred to in the pixel or geometry shaders. (#1930, 2445)
      • Now enforces the GLSL CAFE extension requirement that the location binding of a uniform block must not be 0. Previously the compiler did not catch this error, which could cause incorrect operation in some circumstances. (#2100)
      • Fixed a bug which could cause incorrect code to be generated when matrix elements are indexed by a variable. (#2561)
      • Fixed a gshCompile crash related to uniform array initializers introduced inadvertently in SDK 2.04. (#3709)
  • Other enhancements (#3158, #3180, #3010)

SDK 2.05 [2012-05]

  • Breaking change:
    • In an effort to significantly reduce SDK size, pre-built texture and shaders assets have been removed from the SDK. To use SDK demos that require these assets, please rebuild the SDK. Demos do not have a build dependency on their assets, so users must call make in $CAFE_ROOT/system/src/demo/assets before running demos. (#2296)
  • Features/Improvements:
  • Bug fixes:
    • Enabled RPLs to call GX2Init, not just RPXs. (#2349)
    • Fixed various stability issues in ShaderUtils.dll, and updated Known Issues. (#1444, #1777, #1920, #2438, #2469, #2572, #2630, #2665)
    • Fixed issue with GX2SetContextState not restoring all state. (#2613)
    • Set HiZPtr to NULL in GX2InitDepthBuffer. (#2412)
    • Fixed Perf APIs:
      • Various stability issues in GX2 Performance Metrics. (#2491)
      • Perf API now correctly frees memory at each new frame. (#2533)
    • Certain GX2 shader-related functions were not being exported; this is fixed. (#2603)
  • Documentation:
  • Other enhancements (#1703, #2086, #2110, #2350, #2377, #2437, #2908, #3025)

SDK 2.04 [2012-04]

  • Breaking changes:
    • Shader structures were modified. Please regenerate all shaders for this SDK. The GFD version number was also upgraded to reflect this change. (#2039)
    • Surface Copy Changes Part 4 (final): Using GX2CopySurface to copy between two different formats or surface dimensions has been removed. Please use GX2UTSetAndCopySurface. See GX2 Surface Copying, Resizing, Re-Formatting, etc. (#1731)
    • Removed GX2Log tool and functionality due to changes in RPL infrastructure. (#2108, #2385)
  • Features/Improvements:
    • Added first milestone of GPU Debugger: See GX2 Tool: PM4 Parser, GX2UT Capture APIs, and GX2 Capture APIs. (#2205)
    • Added new Low-level performance APIs. See GX2 Perf Counter APIs (#2095)
    • TexUtils (#1601):
      • Added format conversion from R32_G32_B32_A32_FLOAT and R16_G16_B16_A16 to R16_FLOAT, R32_FLOAT, R16_G16_FLOAT and R32_G32_FLOAT.
      • Added format conversion from R11_G11_B10_FLOAT to R8_G8_B8_A8_UNORM.
    • Added Clear Surface APIs (source provided). (#2258)
  • Bug fixes:
    • Added GPU hardware bug work-around (AMD #9722) that caused a GPU hang with user-provided shaders. (#2039)
    • GX2CPUTimeToGPUTime & GX2GPUTimeToCPUTime have been fixed for CAT-DEV MP. (#2462)
    • Fixed stack stomp in SDK-source matrix assembly functions: ASM_MTXConcat, ASM_MTXConcatArray, _ASM_MTXRotAxisRadInternal, ASM_MTXQuat (#2407)
    • ShaderUtils:
      • Fixed uniform block member size mismatch issue that caused a compiler crash. (#2034)
      • Fixed mismatch in compiler-assigned sampler id's in function declarations and function definitions that caused a compiler crash. (#2081)
      • Added symbol for uniform arrays (the whole array itself) in addition to outputting the symbols for the individual elements. (#2109)
      • Fixed a bug where, for certain uniforms used with if/else and loops, the uniform returned a value > 1024 (loop register) and showed unexpected behavior. (#2171)
    • Fixed GX2_NUM_SPI_VS_OUT_ID_REGISTERS overrun in GX2SetVertexShader with a specific shader (runtime size bug). (#2216)
    • Fixed GX2GetColorControlReg to return correct value for pColorBufferEnable. (#2270)
    • GX2R:
      • Fixed a bug where runtime diagnostic tracking was costing time in GX2RUnlockBuffer even when disabled with GX2RSetDebugOptions() with GX2R_DEBUG_NONE, in DEBUG only.
      • Optimized resource tracking and guard band checks so that runtime overhead is greatly reduced even when all debug checks are enabled.
  • Documentation:
  • Misc:
    • The size output by GX2CalcTVSize for TV render modes smaller than GX2_TV_RENDER_720 is now forced to include quad scan-out buffers. This is to account for the space needed when 60/50 pull-down mode is required. See the GX2 Display APIs page for more details. (#343)
  • Other enhancements (#1555, #1580, #2072, #2074, #2114, #2125, #2130, #2141, #2165, #2177, #2195, #2301, #2359, #2405, #2419)

SDK 2.03 [2012-03]

  • Features:
    • Important Required Change: Finalized scan buffer copy. In order to support every possible TV scan format, it is now required to set up the TV scan buffer prior to setting up the "final" TV render target. The final TV render target is the one which will be copied to the scan buffer. It must be set up using GX2InitColorBufferFTV. Note that scan buffer requirements may vary depending upon TV scan format. It is required to call GX2CalcTVSize to allocate the proper amount of space. Please refer to GX2 Display APIs for more details. For an example implementation, see demoGfx.c, in sections labeled "Setup Scan Buffers" and "Setup render buffer". (#1848)
    • Added support for DRC 30hz update. Single DRC 30hz mode is selected with a new enum value for GX2DRCMode when calling GX2SetDRCBuffer. The display flipping logic accommodates this automatically (see docs on GX2 Multiple Display Output). (#1847, #2061)
    • Added GX2SetVerifyLevel and GX2VerifyCallback. These functions can be used with the debug GX2 library to control the types of warnings and errors generated at runtime. Also it installs a callback that is invoked whenever a warning or error is encountered by the GX2 API (only in debug builds). (#2071)
    • Added support for Adaptive Tessellation. See demo/gx2/tessellate/tessellateAdaptive (#1614)
    • Added ability to render multiple contexts in the demo library. See DEMOGfxAddInstance, DEMOGfxDeleteInstance, DEMOGfxSetInstance, DEMOGfxGetInstance. Similar functions provided for font and DRC modules. (#2062)
    • Added ability to generate fetch shaders on the PC using shaderUtils.dll. (#2010)
    • Added the function GX2IsResolveSupported to check against the list of formats that are supported by the fixed function resolve path. GX2ResolveAAColorBuffer() now calls the new function and asserts if the MSAA buffer format is not supported for fixed-function resolves. (#1527)
  • Surface Copy Changes Part 3: See GX2 Surface Copying, Resizing, Re-Formatting, etc. (#1731):
  • Bug fixes:
    • Fixed ASM_MTXConcat to properly save the f14 halves (PS1) of non-volatile registers. This resolves an issue where the f14 halves were occasionally corrupted on return from the function. (#2059)
    • Fixed support for sampler2DArrayShadow in shaderUtils. (#1442)
    • Fixed support for samplerCubeArray. (#2147)
    • GX2ConvertDepthBufferToTextureSurface works with GX2_SURFACE_FORMAT_T_X24_G8_UINT destination surface. (#2032)
    • Fixed GX2ClearColor to set the correct gamma bits for UBM. As a result, the clear color on SRGB buffers appears brighter than before. (#2014)
    • Fixed DEMOTestCapture, which was hard-coded to SRGB8. It now follows the current color buffer format set in DEMOGfxInit. (#1938)
    • Enabled gshCompile to support UTF-8 with BOM. (#1752)
    • Fixed a bug in TexConv2 when re-converting a binary from gtx to dds with 64bit. (#2127)
    • Fixed bug in gshCompile when it is used with -align option. It didn't add correct padding for GS copy shader. (#2167)
    • Enabled flat shading mode. Developers can use flat shading via the "flat" keyword in GLSL. (#2119)
    • Added runtime GPU BC1 generator sample to src/demo/gx2/texture/runtimeBC1gen. (#1816)
  • Documentation:
  • GX2R
    • In debug builds, GX2RUnlockBuffer can be slow due to calculations for debug checks even when GX2R_DEBUG_NONE is specified. This is fixed in the next release.
    • In debug builds, GX2R requires around 64 bytes per buffer for resource tracking information (plus additional memory for strings if you use GX2RSetBufferName). This memory is obtained from the system allocator via MEMAllocFromDefaultHeapEx(). The game should avoid preallocating the entire system heap in debug builds without leaving any slack. Future releases will support using the GX2R allocator callback for this debug info. In release builds, GX2R requires no additional memory, and GX2RSetBufferName is a no-op.
    • GX2R's detection of CPU-GPU resource contention is not performed on resources set inside a display list currently. This will be added in a future release.
    • A FAQ section has been added to the GX2R GPU Resource APIs.
  • Misc:
    • Moved all gx2 demos from "test" to "demo" directory. (#2013)
  • Minor enhancements (#2088, #2078, #2069, #2041, #2038, #2005, #1942, #1918)

SDK 2.02 [2012-02]

  • Breaking Changes:
    • Policy has been defined for scan buffer resolutions. For example, if the TV output mode is 480i or 576i, then the scan buffer resolution cannot be higher than 1280x720. If the application uses an AA color buffer as its final render target (to be copied to the scan buffer), then its resolution must match the scan buffer resolution. The full list of rules is given on the GX2 Display APIs page.
    • The formula used for SNORM conversion for both attributes & surfaces is now f = max(c/((2^(b-1))-1), -1.0), where f is the output, c is the input, and b is the number of bits in c. Previously, attributes and textures used this formula instead: f = (2c-1)/((2^b)-1). (#1737)
  • Surface Copy Changes Part 2: See GX2 Surface Copying, Resizing, Re-Formatting, etc. (#1731):
    • Added support in GX2CopySurfaceEx to support copying of many sub-regions (rectangles) in a single call. See test/gx2/copy/gx2CopySurfaceExTest & test/gx2/copy/gx2CopySurfaceExTestBC1. (#1587)
    • If the pixel size is 32-bits, copy using GX2_SURFACE_FORMAT_TCS_R8_G8_B8_A8_UNORM format. This gives a noticeable performance boost for some formats.
    • If the requested pixel format is not valid as both a texture and color buffer, use a comparable format that is valid as both a texture and color based on the pixel size. (BC formats were already handled with an earlier change.)
    • NV12 copying is not currently supported (and never was).
    • Added an assert in GX2ConvertDepthBufferToTextureSurface to check that the source and destination buffers have the same number of MSAA samples.
  • Features:
    • GX2R APIs Part 2: GX2R provides formalized support for GPU resources via the GX2RBuffer struct, with the aim of providing increased reliability, new runtime diagnostics, and facilitating future GPU debug and performance tools. Now supports shader programs, stream out, and surfaces. Improved diagnostics for runtime errors such as modifying or deleting resources in use on the GPU. Your feedback is welcomed to help refine the final release. See GX2R GPU Resource APIs and Resource Management Layer API list (#1537)
    • Several demos have been ported to GX2R as examples. For a list please see GX2 Demos List. (#1912)
    • Added support for Adaptive Tessellation. See GX2DrawAdaptive, GX2_TESSELLATION_MODE_ADAPTIVE, and test/gx2/tessellate/tessellateTypes/tessellateTypes.cpp. (#1614)
    • Added support in texUtils for mipmap Generation for GX2_SURFACE_DIM_2D_ARRAY. (#1917)
    • Added GPU command dump when GPU hang is detected. (#1954)
    • Added TexUtils support for converting from R32G32B32A32 to R11G11B10 and R10G10B10A2. (#1601)
  • Bug fixes:
    • Fixed critical command buffer related bugs that were introduced in SDK 2.0. These fixes were released in a GX2/TCL patch to SDK 2.0. (#1974, #1989)
    • Fixed a bug in a time calculation in the command buffer submission process. It would assert "Warning: GX2 timed out allocating space in the command buffer." (#1796)
    • Fixed shaderUtils to output the correct size for Uniform Blocks. (#1962)
    • Fixed gfd to use proper alignment macro for shader pointer with -oh option. (#1997)
  • Minor enhancements (#1435, #1721, #1723, #1830, #1879, #1891, #1937, #1986)
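
Note on the SNORM conversion change in this SDK (#1737): as a rough illustration, the new formula f = max(c/((2^(b-1))-1), -1.0) can be sketched in C as shown below. The function name example_snorm_to_float is hypothetical and is not an SDK API. The sketch only mirrors the arithmetic stated in the note and makes no claim about how the hardware or GX2 implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a signed normalized integer c, stored in `bits` bits, to a
 * float using f = max(c / (2^(bits-1) - 1), -1.0). */
float example_snorm_to_float(int32_t c, unsigned bits)
{
    assert(bits >= 2 && bits <= 32);
    /* Largest positive value representable in `bits` bits. */
    double max_positive = (double)((UINT32_C(1) << (bits - 1)) - 1);
    double f = (double)c / max_positive;
    /* The most negative input would fall below -1.0, so clamp it. */
    return (float)(f < -1.0 ? -1.0 : f);
}
```

For an 8-bit input, 127 maps to 1.0, and both -127 and -128 map to -1.0.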

SDK 2.0 [2011-12]

  • Breaking Changes:
    • Removed the following deprecated APIs (See SDK 1.9 Release Notes): (#1739)
      • GX2TempConvertColorSurfaceToDepthBuffer. See GX2 AA and HiZ Conversion APIs.
      • GX2BeginOcclusionQuery, GX2EndOcclusionQuery, GX2BeginConditionalRender, GX2EndConditionalRender, GX2GetOcclusionQueryReady, GX2GetOcclusionQueryZPassCount. Please use the new Query APIs instead.
      • GX2SetRasterizeControl and the associated Reg functions. Please use GX2SetPolygonControlReg.
      • GX2TempCopyDisplayList. Please use GX2CopyDisplayList.
      • GX2TempSetDBRenderControl.
    • GFD was modified, so you must re-compile all shaders and textures.
  • Surface Copy Changes Part 1: See GX2 Surface Copying, Resizing, Re-Formatting, etc. (#1731):
    • Added support in GX2CopySurface to accept BC/DXT destination buffers (#1615)
    • Added sample demonstrating how to copy surface rects + AA + format conversion (#1714)
    • GX2CopySurface support for stretching and format conversion has been deprecated and will be completely removed in the next SDK. In this SDK a warning message will be printed. See GX2 Surface Copying, Resizing, Re-Formatting, etc. for alternatives/solutions.
  • Features:
    • GX2R APIs: Provides formalized support for GPU resources via the GX2RBuffer struct, with the aim of providing increased reliability, new runtime diagnostics, and facilitating future GPU debug and performance tools. These APIs are a preview and may change before final release. Your feedback is welcomed to help refine the final release. See GX2R GPU Resource APIs and Resource Management Layer API list (#1537)
    • Utility APIs: A new API sitting above GX2+GX2R. Currently, it is mostly simple ease-of-use wrappers; however, future releases will provide more sophisticated helper functions which can be used stand-alone, independent of the DEMO framework. It is open source.
    • Added GX2GetSurfaceSwizzle (#1683)
    • Added GX2GetCurrentDisplayList (#1787)
    • Added a warning if too many GPU interrupts are firing per frame. See GX2SetInterruptCountLimit (#1829)
    • Added a simple stream-out demo: test/gx2/streamout/collisionSO/ (#1665)
    • Added documentation about Texture Fetch 4 (Texture Fetch4) and a demo (test/gx2/shading/PCFShadow). Fetch4 allows the fetching of four unfiltered neighboring texels (2x2 texel block) in a single texture instruction. This offers a significant performance advantage when a single-channel texture is sampled many times (e.g., PCF shadowing). (#1549)
  • Bug fixes:
    • The DEMO framework was waiting for the GPU to be idle before each frame. A minor change in DEMOGfxBeforeRender allows the CPU to run ahead of the GPU. You should make sure that your own game engine is pipelining the two as expected. (#1793)
    • Fixed a deadlock issue when using GPU wait semaphore in DMAE (#1820)
    • GX2 can now be re-initialized on a different core with GX2Shutdown and GX2Init (#1724)
    • Fixed all known bugs related to using GX2CopySurface() to copy 2D array and 3D surfaces with mip-maps, tiling, and swizzles. Fixed various bugs related to copying slices/mips from 3D surfaces. Disabled linear filtering which prevents GX2CopySurface from blending two slices together. Fixed corruption when copying "THICK" tiled surfaces (i.e. GX2_TILE_MODE_2D_TILED_THICK) which is mainly used with 3D surfaces. (#1283)
    • GX2_DEBUG_MODE_FLUSH_PER_DRAW no longer causes an assert when display lists are being created. (#1760)
    • Fixed warning #228 when some GX2 headers are compiled with examplemake by adding extern "C" in gx2Uda.h.
  • Performance:
    • Added testware and documentation (GX2 Perf Counter APIs) for new Performance API. (#1607)
      • demo/gx2/garden/shallowWater
      • test/gx2/perf/benchmark
    • Added GX2CopyEndianSwap and documentation on how to swap Uniform Blocks most efficiently: Uniform Block Management (#1713)
    • Enhanced Performance API (Perf):
      • Added switch to disable frame coherence check option
      • Added const correctness
      • Added get functions for the set functions
      • Fixed several bugs
  • Tex Utils:
    • Added support for GX2_SURFACE_FORMAT_TCS_R8_G8_B8_A8_SRGB, GX2_SURFACE_FORMAT_T_BC1_SRGB, GX2_SURFACE_FORMAT_T_BC2_SRGB and GX2_SURFACE_FORMAT_T_BC3_SRGB format. (#1675)
    • Bug fix: Support for FourCC DDS images. (#1751)
    • Bug fix: Support for converting DDS when setting DDSD_MIPMAPCOUNT flag and dwMipMapCount=0.
    • Bug fix: TexUtils lost mipmap images with 3D texture slices (#855)
    • Bug fix: TexUtils failed to convert mipmap images in cubemap and 3D textures when using BC formats. In that case it ignored the mip offsets in each face. (#1132)
    • Bug fix: TC2CombineAsMips in TexUtils now handles mips down to 1x1 for non-square textures. It previously did not account for the minimum mip size. (#1901)
  • Shader Utils:
    • Added support for explicit hardware location binding for Uniform Blocks: Explicit Hardware Location Binding (#1065)
    • Bug fix: A Uniform Block member's offset was incorrect when the member was an array. (#1573)
  • GFD: (#1757, 1758)
  • Deprecated APIs:
    • All deprecated APIs will be removed in the next SDK! For a complete listing, see Deprecated.
    • Deprecated previous performance counter APIs. Please use the new Perf APIs. This deprecation was delayed one SDK because of bug fixes.
  • Documentation:
  • Minor fixes (#584, #1470, #1506, #1733, #1831, #1893)
  • Known Issue
    • Part of the SDK itself, root.rpx, uses GX2CopySurface() to do stretching, which triggers a warning message since this behavior is now deprecated. The warning can safely be ignored and will be fixed in the next SDK.
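
Note on the non-square mip chain fix listed under Tex Utils above (#1901): the usual convention is that each dimension halves per level but never drops below 1, so a non-square texture keeps producing levels until both dimensions reach 1. The C sketch below illustrates that convention only. The names example_mip_dim and example_mip_count are hypothetical, and this is not the TC2CombineAsMips implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one dimension at a given mip level, never smaller than 1. */
uint32_t example_mip_dim(uint32_t base, uint32_t level)
{
    assert(base > 0 && level < 32);
    uint32_t d = base >> level;
    return d > 0 ? d : 1;
}

/* Number of levels in a full mip chain that ends at 1x1. */
uint32_t example_mip_count(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    uint32_t count = 1; /* level 0 is the base image */
    while (width > 1 || height > 1) {
        width = example_mip_dim(width, 1);
        height = example_mip_dim(height, 1);
        count++;
    }
    return count;
}
```

For a 256x4 texture this gives nine levels: 256x4, 128x2, 64x1, 32x1, and so on down to 1x1.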

SDK 1.9.1 [2011-11]

  • Bug fixes:
    • Fixed warning in gx2Misc.h when built with Visual Studio. (#1766)

SDK 1.9 [2011-10]

  • Breaking Changes:
  • Deprecated APIs:
    • All deprecated APIs will be removed in the next SDK! For a complete listing, see Deprecated.
    • Deprecated GX2TempConvertColorSurfaceToDepthBuffer. See GX2 AA and HiZ Conversion APIs. (#726)
    • Deprecated previous performance counter APIs. Please use the new Perf APIs. However, note that this API is subject to change. (#1042)
    • Deprecated previous Query APIs: GX2BeginOcclusionQuery, GX2EndOcclusionQuery, GX2BeginConditionalRender, GX2EndConditionalRender, GX2GetOcclusionQueryReady, GX2GetOcclusionQueryZPassCount. Please use the new Query APIs instead.
    • Deprecated GX2SetRasterizeControl and the associated Reg functions. Please use GX2SetPolygonControlReg. (#1624)
    • Deprecated GX2TempSetDBRenderControl.
    • Deprecated GX2TempCopyDisplayList.
  • Performance Enhancements:
    • Added preliminary version of runtime performance counter API: (#1042)
      • Documentation and demos will be provided in the next SDK.
      • This API is subject to change.
      • See the Perf API reference manual.
    • Fixed bug that was causing GPU7 to run half rate when rendering to float formats with 16 bits or less per component ( GX2_SURFACE_FORMAT_TC_R16_G16_FLOAT, GX2_SURFACE_FORMAT_TC_R11_G11_B10_FLOAT, etc.) (#1421)
    • Optimized multithread usage of shaderUtils.dll (#1505)
  • Bug fixes:
    • Fixed incorrect GPU7 performance counters. (#1535)
    • Fixed bug where GX2CopySurface didn't work with slices greater than zero (#1283)
      • We think we may have fixed the issue, but we are still investigating the corner cases.
    • Fixed a shaderUtils bug that caused strange flickering with a certain shader. (#1637)
    • Fixed GX2Invalidate to properly invalidate data that is not 256-byte aligned. (#1679)
    • Fixed bug where GX2SetDRCConnectCallback unconditionally invoked the callback, even when it was NULL. (#1612)
    • Fixed bug where the system could fail to set HDMI output depending on the TV status. (#890)
      • This bug can be avoided by specifying the AV output port through devmenu. Please refer to the system settings man page for more detail.
    • Fixed memory corruption in the GSH2CompileProgram function (#1697)
  • Enhancements:
  • Minor fixes (#350, #652, #1235, #1501, #1502, #1522, #1544, #1558, #1576, #1602, #1605, #1611)
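
The GX2Invalidate fix in this section concerns data that does not start or end on a 256-byte boundary. As an illustration only, and assuming a 256-byte granularity (inferred from the note, not taken from the GX2 source), an arbitrary range can be widened so that it covers whole 256-byte blocks. The names BLOCK_SIZE, AlignedRange, and widen_to_256 are hypothetical and are not part of GX2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed granularity, taken from the release note; not a GX2 constant. */
#define BLOCK_SIZE 256u

typedef struct {
    uintptr_t start; /* first byte of the widened range */
    size_t    size;  /* size in bytes, a multiple of BLOCK_SIZE */
} AlignedRange;

/* Widen [addr, addr + size) so it starts and ends on a BLOCK_SIZE boundary. */
static AlignedRange widen_to_256(uintptr_t addr, size_t size)
{
    const uintptr_t mask = (uintptr_t)(BLOCK_SIZE - 1);
    AlignedRange r;
    uintptr_t end;

    assert(size > 0);
    end = addr + size;            /* one past the last byte */
    r.start = addr & ~mask;       /* round the start down */
    end = (end + mask) & ~mask;   /* round the end up */
    r.size = (size_t)(end - r.start);
    return r;
}
```

For example, a 0x20-byte range starting at 0x10F0 straddles a block boundary, so the widened range starts at 0x1000 and spans two blocks (0x200 bytes).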

SDK 1.8 (includes SDK 1.7.1) [2011-09]

SDK 1.7 [2011-07-15]

  • Breaking Change: If you are using the GPU7 endian bug fix version of the SDK, then please see GX2EndianSwapWorkAroundTransitionSect (#572)
  • Breaking Change: Changed GX2Init input parameters from (argc, argv) to a list of [argument, value] pairs. For an example, see the source to DEMOGfxInit (#911)
  • DRC-related Changes:
  • Performance Enhancements:
    • Increased vertex reuse buffer from 6 to 14 vertices, which should help increase vertex performance for primitives that repeat the same vertex within this limit. This was a software bug that wasn't taking advantage of the hardware's maximum value of 14. (#1125)
    • Fixed the guard-band setup in GX2SetViewport to reduce clipping. A clipped primitive costs 24x-170x more than a non-clipped one (depending upon the number of clipping planes crossed). (#1129)
    • Modified DEMOGFDReadVertexShader, DEMOGFDReadPixelShader, and DEMOGFDReadGeometryShader to allocate the GSH header portion from the default system heap instead of GX2TempAllocMEM2. Because GX2TempAlloc allocates uncached memory, the GX2 shader setup calls previously took longer to process. Please check your applications for a similar bug. Since shader setup calls are very common, this could significantly help improve performance. (#1112)
  • Added documentation for newly discovered GPU7 hardware bugs:
    • Quad Pipe Synchronization Issue (#970)
    • Unprotected FIFO Issue (#871)
  • Shader Compiler Tools (GX2 Tool: Shader Compiler):
    • Breaking Change: Removed legacy gshConvert.exe and aticl.exe from SDK, and replaced with gshCompile.exe, which is based on a new Shader Utility Library that can be integrated into other PC applications. See GX2 Tool: Shader Compiler (#907)
    • Added GLSL Cafe extension for Explicit Hardware Location Binding. See GX2 GLSL Cafe Extensions. (#604)
    • Removed duplicated program data sections from GFD_BLOCK_TYPE_GX2_*SH_HEADER blocks in GSH files. (#907)
    • Added support for independent vertex and pixel shader compilation. Previously, every pixel and vertex shader pair needed to be compiled together, which required (up to) V x P compilations. This feature allows for V + P compilations given that certain requirements are met. See Independent Shaders for more details. (#952, #975)
    • Added a feature to append a shader to an existing GSH file. See Command Line Arguments. (#678, #921)
    • Fixed register mismatch when using gl_FragCoord (#749, #943)
    • Fixed inconsistent shader mode settings when compiling geometry shaders (#1100)
    • Fixed shader loops with greater than 255 iterations (#1116)
    • Fixed temp loop values. Previously, loop temporary values needed to be decremented (--i vs. ++i). This fix includes slightly better performance for loops in general. (#512)
    • Fixed error with size calculation of Uniform Blocks containing an array. (#1188)
    • Removed code that initializes the arrays in the GLSL compiler. This eliminates unnecessary register writes, which increases performance. The GLSL compiler will no longer support reading a temp value before it is written. (#1137)
    • Added a "-oh" option that generates a header file instead of a GSH file, which can be used to debug shader output in a programmer-readable format, or to compile shaders into an application for special purposes (#919, #888)
    • Turned GFD into a DLL for convenience. (#1151)
    • Moved file locations. See File Locations. (#1050)
  • Texture Converter Tools (GX2 Tool: Texture Converter 2):
  • GX2 Robustness:
    • Fixed a bug that could hang GX2 if calling GX2WaitForFlip within a few microseconds of the vsync. Now this function will no longer hang, but if called within the same window it will sleep until the next vsync. For a recommendation of how to synchronize with vsync, please see GX2 Display Synchronization. (#1001)
    • Added functionality to print out GPU7 status registers if a GPU hang is detected. See GX2PrintGPUStatus, GX2SetGPUTimeout, and GX2GetGPUTimeout (#1052)
  • Updated Performance Counters: (#1042, #1055)
  • Update matrix library:
  • Added support for compressed format surfaces as a source for GX2CopySurface (#1158)
  • Added a new structure-based interface to the render state APIs. See Register (#697)
  • Removed GPU7 commands from GX2InitFetchShader, which makes it safe to call outside of a GX2 context, from any core or at any time. (#1040)
  • Fixed graphics-related compiler errors when including cafe.h in Windows builds (#895)
  • Added asserts and some documentation to catch MSAA differences between depth and color buffers in GX2ClearBuffers. (#1014)
  • Removed unnecessary GX2Flush in GX2SetClearDepthStencil so the function can be called in a display list (#1127)
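
To make the compilation counts in the independent shader note (#952, #975) concrete, the sketch below compares the two cases for V vertex shaders and P pixel shaders. It is plain arithmetic and does not call the shader compiler; the function names are hypothetical.

```c
#include <assert.h>

/* Upper bound when every vertex/pixel shader pair is compiled together. */
static unsigned long paired_compiles(unsigned long v, unsigned long p)
{
    return v * p;
}

/* Count when each shader is compiled once on its own. */
static unsigned long independent_compiles(unsigned long v, unsigned long p)
{
    return v + p;
}

/* Usage example: 20 vertex shaders and 30 pixel shaders. */
static void example_compile_counts(void)
{
    assert(paired_compiles(20, 30) == 600);
    assert(independent_compiles(20, 30) == 50);
}
```

The saving grows with the size of the shader set, but it only applies when the requirements described in Independent Shaders are met.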

SDK 1.6 [2011-05-04]

  • Updated Shader Compiler, aticl.exe, to output .gsh files directly. In previous SDKs, gshConvert.exe would call aticl.exe to compile a shader into an ELF file that gshConvert.exe would then parse into a .gsh shader file. In SDK 1.6, aticl.exe generates .gsh files directly. (#428)
    • Warning: gshConvert.exe will be deprecated in the next SDK!
  • Added Preliminary Geometry Shader Support (See GX2 Shader APIs). Demos can be found at system/src/test/gx2/gs.
  • Fixed a critical bug where GX2 was allocating too large of a buffer for the GPU7 interrupt handler which could cause the GPU7 to start reading garbage data. This bug was causing instability in the graphics system in some cases. (#869)
  • Fixed a critical bug where GX2CalcFetchShaderSize did not compute the size correctly. This could result in the fetch shader buffer being too small and either the CPU overwriting other data or the GPU reading incorrect data.
  • Fixed a critical bug where the write pointer to a ring buffer that tracks interrupts between the CPU and GPU became out of sync between the two. This would cause the vsync interrupt to get lost, and applications would hang in GX2WaitForVsync. (#920)
  • Converted texUtils from a static library to a DLL to help support Visual Studio 2010 and 2008 (see GX2 Tool: Texture Converter 2.) (#842)
  • Updated documentation about Endianess & Alignment for Each Buffer Type. (#835, #754)
  • Graphics Memory Management Changes:
    • Added Memory Management APIs to help transition to final hardware which fixes a hardware issue. Please do not use GX2TempAllocMEM1 and GX2TempAllocMEM2 directly. Use DEMOGfxAllocMEM1 and DEMOGfxAllocMEM2 instead. If you aren't using the demo library, we strongly recommend copying the functionality of these two functions to aid in the transition to final hardware (in a future SDK). (#572)
    • Added APIs to notify GX2 of GPU7-bound data: GX2NotifyMemAlloc, GX2NotifyMemFree. If you override the default system allocator, we strongly recommend using these functions to inform GX2 of graphics related data. In a future SDK, we plan to support the ability to capture and replay graphics commands to help debug graphics-related issues and analyze performance. (#572)
    • Updated documentation for GX2Invalidate. The DEMOGfxAlloc* (and GX2TempAlloc*) APIs currently allocate non-cached memory, so incorrect use of GX2Invalidate will not cause issues at present. However, in a future SDK that only supports final hardware, calling GX2Invalidate will be essential. We strongly recommend observing proper usage of GX2Invalidate to help transition to final hardware (in a future SDK).
  • Modified DEMOCaptureCopy to support Host File IO. Passing a filename into DEMOCaptureCopy now works and will write the captured image to the file specified. If the filename doesn't start with '/', then it is saved under /vol/content (which normally maps to $CAFE_CONTENT_DIR), else an absolute pathname is expected (such as "/vol/save/capture.tga", putting "capture.tga" under $CAFE_SAVE_DIR). (#817, #878)
  • Added support for GX2_SURFACE_FORMAT_T_BC5_SNORM and GX2_SURFACE_FORMAT_TC_R8_G8_UNORM to TexConv2.exe (#738, #739)
  • Added support for GX2_SURFACE_FORMAT_T_NV12_UNORM to GX2. (#638)
  • Fixed support for GX2_INDEX_FORMAT_U16 in GX2DrawIndexedImmediate which caused the GPU to hang (#810)
  • Fixed various bugs in aticl.exe related to switching to the new standard template library: stl70 (#868)
  • Fixed support for GX2_SURFACE_FORMAT_TCS_R8_G8_B8_A8_SRGB (#827)
  • Removed some surface formats because of poor performance or utility: (#821)
  • Updated DEMOGFDFreeVertexShader, DEMOGFDFreePixelShader and DEMOGFDFreeTexture to accept NULL (#882)
  • Added testware that generates a 3D texture at runtime in system/src/test/gx2/shading/noise3D (#838)
  • Added convenience functions: GX2GetSurfaceFormatBits, GX2GetAttribFormatBits, DEMOGfxGetSurfaceFormatName, and DEMOGfxGetAttribFormatName.
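
The filename rule described for DEMOCaptureCopy in this section (#817, #878) can be sketched as follows. This is not the DEMO library implementation: resolve_capture_path and CONTENT_ROOT are hypothetical names, and the sketch only reproduces the rule as stated in the note (names without a leading '/' go under /vol/content, anything else is treated as an absolute path).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed default root, taken from the note; not read from the SDK. */
#define CONTENT_ROOT "/vol/content"

/* Write the destination path for filename into out.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int resolve_capture_path(const char *filename, char *out, size_t out_size)
{
    int n;

    assert(filename != NULL && out != NULL && out_size > 0);
    if (filename[0] == '/') {
        n = snprintf(out, out_size, "%s", filename);   /* absolute path, used as-is */
    } else {
        n = snprintf(out, out_size, "%s/%s", CONTENT_ROOT, filename);
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Usage example covering both branches of the rule. */
static void example_capture_paths(void)
{
    char path[64];
    int rc;

    rc = resolve_capture_path("capture.tga", path, sizeof path);
    assert(rc == 0);
    assert(strcmp(path, "/vol/content/capture.tga") == 0);

    rc = resolve_capture_path("/vol/save/capture.tga", path, sizeof path);
    assert(rc == 0);
    assert(strcmp(path, "/vol/save/capture.tga") == 0);
}
```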

SDK 1.5 [2011-04-22]

  • Added 64-bit version of the txUtil libraries (GX2 Tool: Texture Converter 2) (#494)
  • Added the surface format GX2_SURFACE_FORMAT_T_R24_UNORM_G8_UINT. This format is only valid for GX2_SURFACE_USE_TEXTURE. If you render to a GX2_SURFACE_FORMAT_D_D24_S8_UNORM depth buffer, then you must use GX2ConvertDepthBufferToTextureSurface to change the color/depth tiling format before you access it as a texture. You may only access the depth (R24_UNORM) or stencil (G8_UINT) components one at a time, since each requires a different sampler type. (#763)
  • Fixed bug that restricted the GX2 main thread to running on core 1. Now GX2Init can be called from any core (see GX2 Management APIs) (#730)
  • Fixed bug where memory addresses < 256MB were not compatible with GX2
  • Updated shader compiler (aticl.exe)
    • Replaced debug with release executable
    • Fixed some compile-related bugs (#752, #789)

SDK 1.4 [2011-03-31]

  • Breaking Change: Updated surface format definitions: GX2SurfaceFormat
    • The description of the surface format layout in memory has changed to correspond better to the actual hardware layout. Previously, a format such as GX2_SURFACE_FORMAT_TC_R5_G5_B5_A1_UNORM was said to fill in a 16-bit word starting from the MSB end. Now, it is said to fill in the word starting from the LSB end. The main difference is with 5_5_5_1/1_5_5_5 and 10_10_10_2/2_10_10_10 formats, which are reversed from what they were before. Other packed formats, such as GX2_SURFACE_FORMAT_TC_R4_G4_B4_A4_UNORM, were only described incorrectly before. Formats which have components that are 8, 16, or 32 bits in size are unaffected.
  • Added APIs to handle using MSAA color buffers or MSAA/HiZ depth buffers as textures: GX2ResolveAAColorBuffer, GX2ExpandAAColorBuffer, GX2ExpandDepthBuffer, GX2ConvertDepthBufferToTextureSurface, GX2TempConvertColorSurfaceToDepthBuffer
  • TexConv2.exe Updates (GX2 Tool: Texture Converter 2)
    • Added "-swizzle" option (#506)
    • Added "-bc1alpha" option to specify alpha conversion thresholds for BC1 (#665)
    • Added "-printinfo" option to print useful information such as texture alignment and size
    • Added support to write DDS files from GX2 files
    • Fixed all reported mip-map related bugs (#628)
  • Added new attribute stream formats that allow conversion of integer-type data to floating point (without normalization) during fetch. This applies to all formats with 8 or 16-bit components only. (#705)
  • Fixed support for swizzling textures to improve texture look-up performance for textures being accessed at the same time (see GX2 Texture Tiling) (#506)
  • Created new structure-based interface to the matrix library (MAT)
  • Fixed perspective/orthographic matrix to match OpenGL's expected near and far values (-1 <= z <= 1) (#686)
  • Fixed shader compiler bug when a Uniform Block is used in multiple functions, but the first function is optimized out (#680)
  • Fixed two performance counters that were returning incorrect results: GX2_PERF_U64_VS_VERTICES_IN, GX2_PERF_U64_PS_PIXELS_IN (#681)
  • Fixed latticeDL to call GX2Invalidate appropriately. This is a common gotcha! (#682)
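
The change in how packed surface formats are described in this section (MSB-first before, LSB-first now) can be illustrated with a 5_5_5_1 word. The sketch assumes that "starting from the LSB end" means the first-named component occupies the lowest bits; the function names are hypothetical, and byte order in memory is a separate matter that this sketch does not address.

```c
#include <assert.h>
#include <stdint.h>

/* Current description: R in bits 0-4, G in bits 5-9, B in bits 10-14, A in bit 15. */
static uint16_t pack_r5g5b5a1_lsb_first(unsigned r, unsigned g, unsigned b, unsigned a)
{
    assert(r < 32 && g < 32 && b < 32 && a < 2);
    return (uint16_t)(r | (g << 5) | (b << 10) | (a << 15));
}

/* Earlier description: R in bits 11-15, G in bits 6-10, B in bits 1-5, A in bit 0. */
static uint16_t pack_r5g5b5a1_msb_first(unsigned r, unsigned g, unsigned b, unsigned a)
{
    assert(r < 32 && g < 32 && b < 32 && a < 2);
    return (uint16_t)((r << 11) | (g << 6) | (b << 1) | a);
}
```

With full red and nothing else, the LSB-first packing yields 0x001F while the MSB-first packing yields 0xF800, which is why the 5_5_5_1/1_5_5_5 and 10_10_10_2/2_10_10_10 names appear reversed relative to earlier SDKs.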

SDK 1.3 [2011-03-09]

  • Breaking Change: Changes have been made to the texture utilities library that require a user to call TexUtils::TC2Initialize() before using other API calls.
    • Other API calls made before this initialization will fail.
    • There is also a corresponding TexUtils::TC2Destroy() call that should be made after the last API call to shutdown the library.
    • There is an example of this call being made in the main function in TexConvert.cpp.
    • As a result of this initialization, the gpu parameter has been removed from existing API calls since it is specified at initialization.
  • Breaking Change: The GX2SurfaceFormat enum now encodes the component channel mapping into the name of the enum.
    • For example: GX2_SURFACE_FORMAT_TCS_8_8_8_8_UNORM is now GX2_SURFACE_FORMAT_TCS_R8_G8_B8_A8_UNORM.
  • Breaking Change: The .w value of gl_FragCoord now holds the pixel "w" value, not "1/w"
  • The -DEMO_CB_FORMAT command line option for the DEMO library no longer changes the scan out buffer format.
    • The new -DEMO_SCAN_FORMAT flag can be used to change the scan out buffer format.
  • Shader compile error with "gl_FragCoord" has been resolved. (#551)
    • The .x and .y values used with gl_FragCoord have an origin in the top left corner.
  • TexConv2.exe now supports an -align option, which properly aligns all data within the file so that the file can be loaded into memory at up to 64KB alignment and image pointers can be used directly. (#608)
  • The display formats 10_10_10_2 and 2_10_10_10 now work correctly with DEMOGfxInit (#617)
  • Context states now correctly shadow shader state from GX2SetPixelShader and GX2SetVertexShader (#641)
  • Added GFD APIs for PC tools.
    • Please refer to GFD Overview documentation for more information.
  • Several issues were fixed for the texture converter (GX2 Tool: Texture Converter 2)
    • Mipmap generation for passthrough formats now works correctly
    • Fixed crashing issue for mipmapped cubemaps (partial #455)
    • Fixed padding issue for linear textures
    • Fixed issue with non-power of 2 textures
  • The texture converter (and library) now process alpha when a format with alpha is converted to BC1. (#665)
    • If the alpha value is less than 127, the corresponding texel becomes transparent.
  • BC5 textures now have their components in the opposite order from SDK 1.2 or earlier when using the texture converter
    • The present order is correct for BC5.
    • The previous order was correct for ATI2N (a variant of BC5).
    • The present order also corresponds to "ATI2N with alternate XY swizzle."
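
The BC1 alpha rule noted in this section (#665) can be sketched as a simple threshold test. The sketch assumes 8-bit alpha values (0 to 255); the function and macro names are hypothetical and do not come from the texture converter library.

```c
#include <assert.h>
#include <stdint.h>

/* Threshold stated in the note above. */
#define BC1_ALPHA_THRESHOLD 127u

/* Returns 1 if a texel with this alpha becomes transparent, 0 otherwise. */
static int bc1_texel_is_transparent(uint8_t alpha, uint8_t threshold)
{
    return alpha < threshold;
}

/* Usage example at the boundary of the stated threshold. */
static void example_bc1_alpha(void)
{
    assert(bc1_texel_is_transparent(126, BC1_ALPHA_THRESHOLD) == 1);
    assert(bc1_texel_is_transparent(127, BC1_ALPHA_THRESHOLD) == 0);
}
```

The threshold is passed as a parameter so the same test can be reused with a different cutoff.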

SDK 1.2 [2011-02-23]

  • Enabled Host IO to move graphics assets from the PC to Cafe
  • GPU clock has been changed for stability
    • Future hardware will restore the clock
  • Improved Command Buffer Management
    • Moved from fixed size Command Buffers to variable size Command Buffers
    • Enabled Write Gatherer (WG) to write commands into Command Buffers
      • Important: This change requires that memory for Display Lists be allocated from the default memory allocator (not GX2TempAlloc*)
      • Important: WG usage is restricted to GX2 for this release; a future release will demonstrate how to use WG for non-GX2 purposes.
    • Improved CPU performance by reducing the number of calls to OSGetCoreId()
  • Matured Texture Converter (GX2 Tool: Texture Converter 2)
    • Added support for more input/output formats
    • Added TGA support
    • Fixed mipmap for 3D texture (#455)
    • Fixed "-minmip 16" option for 32x32 dds texture (#455)
  • Simplified and separated GX2 file format API from DEMO library to GFD library (GFD Overview)
  • Updated the GLSL Compiler
    • Fixed support for offset of Uniform in UniformBlock with layout(std140) (#495)
    • Fixed support for up to 32 attributes (was 16) (#521)
  • Updated Video APIs (GX2 Display APIs)
    • Added fully pipelined and adjustable video synchronization between the CPU, GPU, and display controller
    • Added a swap interval parameter (removed busy-waiting for frames)
  • Added CPU Access to Tiled Surfaces via HDP (GX2AllocateTilingAperture, GX2FreeTilingAperture, see GX2 Texture APIs)
  • Added a parameter to GX2SetColorControl API to disable drawing to the color buffer for Z only rendering to run at full fillrate.
  • Fixed Attribute Format support (GX2AttribFormat), and added example geometry/format (#496)
    • Known issue: 10_10_10_2_UINT is not working well
  • Fixed DEMO library to allow 1080 resolution again
  • Added proper support for GX2SetTVGamma (#394)
  • Fixed problem where Uniform Block consumed more memory than required (#574)

SDK 1.1 [2011-01-31]

  • Enabled dual export from the shader pipe, which allows the shader export block to pack two quads into each export to the render backend. This feature is only enabled if depth is not output from the shader. It's also not available for multiwrite (color 0 goes to all MRTs), SNORM/UNORM formats greater than 11 bits, FLOAT formats greater than 16 bits, or any SINT/UINT format. When dual export is active, GX2 delivers 2X the fill performance of previous SDKs. We recommend using 4QP to get maximum fill rate performance.
  • Improved 720p AA performance by creating a large enough heap in MEM1 to fit color and depth buffers
  • Changed the default number of Quad Pipes enabled by GX2 from 3 to 2 for stability (GPU7 Hardware Overview)
  • Improved overall performance of GX2 by optimizing Command Buffer Management implementation
  • Updated TexConv2.exe
    • General API clean-up
    • Added TGA support
    • Added texture array support
  • Removed unnecessary calls to MTXTranspose in tests/demos, and reversed matrix multiplication order in shaders instead. (Optimization)
  • Changed GX2SetDebugDraw to GX2SetDebugMode and added two features:
    • Infinitely fast hardware
    • Waits on all flushes not just the draw flushes
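
The MTXTranspose optimization in this section relies on the identity that multiplying a transposed matrix by a column vector gives the same result as multiplying the row vector by the original matrix. The sketch below checks that identity on the CPU; Mat4, Vec4, and the function names are hypothetical and are not the MTX library types.

```c
#include <assert.h>

typedef struct { float m[4][4]; } Mat4; /* m[row][col] */
typedef struct { float v[4]; } Vec4;

static Mat4 mat4_transpose(Mat4 a)
{
    Mat4 t;
    int r, c;
    for (r = 0; r < 4; ++r)
        for (c = 0; c < 4; ++c)
            t.m[r][c] = a.m[c][r];
    return t;
}

/* a * x, with x treated as a column vector. */
static Vec4 mat4_mul_vec4(Mat4 a, Vec4 x)
{
    Vec4 out;
    int r, c;
    for (r = 0; r < 4; ++r) {
        out.v[r] = 0.0f;
        for (c = 0; c < 4; ++c)
            out.v[r] += a.m[r][c] * x.v[c];
    }
    return out;
}

/* x * a, with x treated as a row vector (the reversed order). */
static Vec4 vec4_mul_mat4(Vec4 x, Mat4 a)
{
    Vec4 out;
    int r, c;
    for (c = 0; c < 4; ++c) {
        out.v[c] = 0.0f;
        for (r = 0; r < 4; ++r)
            out.v[c] += x.v[r] * a.m[r][c];
    }
    return out;
}

/* Usage example: both routes give the same vector, so the transpose can be skipped. */
static void example_reversed_order(void)
{
    Mat4 a;
    Vec4 x = { { 1.0f, 2.0f, 3.0f, 4.0f } };
    Vec4 with_transpose, reversed;
    int r, c, i;

    for (r = 0; r < 4; ++r)
        for (c = 0; c < 4; ++c)
            a.m[r][c] = (float)(r * 4 + c + 1); /* small integers, exact in float */

    with_transpose = mat4_mul_vec4(mat4_transpose(a), x);
    reversed = vec4_mul_mat4(x, a);
    for (i = 0; i < 4; ++i)
        assert(with_transpose.v[i] == reversed.v[i]);
}
```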

SDK 1.0 [2010-12-22]

  • Updated texture tools
    • Added a new texture converter with a simpler design that can be easily integrated into tool chains as libraries
    • Added 3D texture support
    • Added Cube map support
  • Display List support
    • Create and call Display Lists at run-time
    • Enables early Multi-Core rendering schemes
  • MSAA Support
  • Conditional Rendering depending on Occlusion Query
  • Dual Source Blending
  • Instancing Support
  • Improved Draw APIs
  • Improved Performance APIs
  • Faster surface clears and copies
  • Faster z-buffering
  • Updated Shader Compiler (no verified changes)
  • SPI Hang Workaround
  • General API Clean-up
    • Many functions have been changed to create a more consistent API
    • Const Correct APIs

Pre-Release Milestone 2 [2010-10-29]

  • Context Switching (Hardware state shadowing)
  • New Draw Calls
  • Performance API & Tests
  • Updated Texture API
    • Expanded texture sampler APIs
    • Expanded surface APIs to include views and mip-mapping
    • Added Color channel APIs (rearranging channels)
  • Switched to new video library
    • Requires use of HDMI. VGA connection shows wrong colors.
  • Synchronization APIs (including GX2DrawDone)
  • Updated Demo Library
    • Color Buffer Capture Utility
    • Demo API consistency
  • Alpha Test API & demo
  • Cafe/trunk synchronization & Automated test suite
  • General API Clean-up
  • Removed GX2's dependency on demo library
  • Updated Clear APIs
  • Updated converter tools
    • Requires update of all assets (textures & shaders)

Pre-Release Milestone 1 [2010-09-28]

  • Initial release with basic functionality

CONFIDENTIAL