SH4 FTRV Optimizations

From dreamcast.wiki
Revision as of 17:32, 17 July 2025 by GyroVorbis

Without a doubt, the single most computationally powerful instruction on the SuperH4 CPU in the Sega Dreamcast is FTRV, the Floating-point TRansform Vector instruction. In a single instruction it multiplies a 4D vector by the 4x4 matrix held in the back bank of FPU registers, XMTRX (registers XF0 through XF15), and writes the result back over the source vector. This article will teach you how to leverage this god instruction for floating-point performance and introduce you to several example scenarios that have yielded fantastic gainz within the community.
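To make the arithmetic concrete, here is a minimal portable C sketch of what one FTRV computes. It is a reference model for reading and host-side testing, not SH4 code: the function name `ftrv_reference` and the flat 16-float array standing in for XMTRX are assumptions made for illustration. The array is indexed in register order (XF0 first), so each group of four consecutive floats is one column of the matrix.

```c
#include <stddef.h>

/* Portable reference model of FTRV (illustrative, not SH4 code).
 * xmtrx[0..15] stands in for XF0..XF15, so xmtrx[0..3] is the first
 * column of the 4x4 matrix, xmtrx[4..7] the second, and so on.
 * fv[0..3] stands in for the vector register FVn and is overwritten,
 * just as the real instruction overwrites FVn. */
void ftrv_reference(const float xmtrx[16], float fv[4])
{
    float out[4];

    for (size_t row = 0; row < 4; ++row) {
        out[row] = xmtrx[row +  0] * fv[0]
                 + xmtrx[row +  4] * fv[1]
                 + xmtrx[row +  8] * fv[2]
                 + xmtrx[row + 12] * fv[3];
    }

    /* Write back only after all four sums are done, so every sum
     * sees the original vector. */
    for (size_t i = 0; i < 4; ++i)
        fv[i] = out[i];
}
```

With an identity matrix whose last column is a translation of (10, 20, 30), calling `ftrv_reference` on the vector (1, 2, 3, 1) leaves (11, 22, 33, 1) in place of the input.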

Relationship to FIPR

Instruction Summaries
Format            Function                               Encoding           Group   Issue Cycles   Latency Cycles
fipr FVm,FVn      inner_product(FVm, FVn) -> FR[n+3]     1111nnmm11101101   FE      1              4/5
ftrv XMTRX,FVn    transform_vector(XMTRX, FVn) -> FVn    1111nn0111111101   FE      1              5/8
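One way to read the two rows above: FIPR produces a single 4-component inner product, while FTRV produces four of them at once, one per matrix row. The portable C sketch below illustrates that relationship only. The names `fipr_reference` and `transform_with_four_dots` are hypothetical, and the matrix is passed as four row vectors purely for readability, which is not how XMTRX is laid out in registers.

```c
/* Reference model of FIPR: a 4-component inner product.
 * On hardware the result is written to FR[n+3], the last element
 * of the destination vector; here it is simply returned. */
float fipr_reference(const float a[4], const float b[4])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* The same result FTRV yields, expressed as four inner products:
 * each output component is one matrix row dotted with the input. */
void transform_with_four_dots(const float rows[4][4],
                              const float in[4],
                              float out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = fipr_reference(rows[i], in);
}
```

Going by the table, both instructions issue in one cycle, so a single FTRV stands in for four FIPRs plus the register shuffling needed to collect their results into one vector.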

When to use FTRV

Real-World Examples

The following are real-world examples of FTRV-based optimizations used by the community in games and applications for the Sega Dreamcast.

Vertex Position Transformation

The first and most obvious use of FTRV is transforming the positions of the incoming vertex stream from local space to view space while submitting vertices to the PowerVR during T&L. This is the single most crucial place to leverage FTRV, and it was the instruction's original intended purpose. If you do nothing else with the instruction, bear in mind that the only way to come even remotely close to pushing a considerable volume of polygons on the DC is to properly harness the SH4 by using FTRV to transform your vertices.
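The shape of such a transform loop can be sketched in portable C as follows. This is a host-side model under stated assumptions, not the routine any particular game or library uses: the `vec4` type and `transform_vertices` are hypothetical names, and the matrix is a flat array in XF0..XF15 order (four consecutive floats per column). On real hardware the matrix would be loaded into XMTRX once, before the loop, and the four sums in the loop body would collapse into one FTRV per vertex.

```c
#include <stddef.h>

/* Hypothetical vertex position type for this sketch. */
typedef struct { float x, y, z, w; } vec4;

/* Transform `count` positions by the 4x4 matrix in xmtrx.
 * xmtrx is indexed in XF0..XF15 order, so xmtrx[0..3] is column 0.
 * The matrix is shared by every vertex, mirroring how XMTRX is
 * loaded once and reused across the whole stream. */
void transform_vertices(const float xmtrx[16],
                        const vec4 *in, vec4 *out, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        /* Copy first so the call also works when in == out. */
        const vec4 v = in[i];

        /* These four sums are what a single FTRV would produce. */
        out[i].x = xmtrx[0] * v.x + xmtrx[4] * v.y + xmtrx[8]  * v.z + xmtrx[12] * v.w;
        out[i].y = xmtrx[1] * v.x + xmtrx[5] * v.y + xmtrx[9]  * v.z + xmtrx[13] * v.w;
        out[i].z = xmtrx[2] * v.x + xmtrx[6] * v.y + xmtrx[10] * v.z + xmtrx[14] * v.w;
        out[i].w = xmtrx[3] * v.x + xmtrx[7] * v.y + xmtrx[11] * v.z + xmtrx[15] * v.w;
    }
}
```

Setting `w` to 1 on input positions lets the fourth column of the matrix act as a translation, which is why a 4D vector is used for 3D positions.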

Diffuse Lighting

Collision and Physics

Bounding Sphere vs View Frustum Culling

ADPCM Decoding