SH4 FTRV Optimizations
Without a doubt, the single most computationally powerful instruction on the Sega Dreamcast's SH4 (SuperH-4) CPU is FTRV, the Floating-point TRansform Vector instruction. This one instruction multiplies a 4D vector by the 4x4 matrix held in the back bank of FPU registers, XMTRX. This article explains how to leverage it for floating-point performance gains and walks through several example scenarios that have yielded substantial speedups within the community.
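To make the operation concrete, here is a minimal portable C sketch of what FTRV computes. It is a reference model for reasoning and testing on a host machine, not the hardware instruction itself. The function name `ftrv_model` and the flat `xmtrx[16]` array are illustrative assumptions; the array stands in for XF0..XF15 stored column by column, so translation terms would sit in elements 12 to 14.

```c
#include <stddef.h>

/* Portable reference model of "FTRV XMTRX, FVn" (hypothetical helper,
 * not the hardware instruction).
 *
 * xmtrx[16] stands in for XF0..XF15, stored column by column:
 *
 *   | xf0  xf4  xf8   xf12 |
 *   | xf1  xf5  xf9   xf13 |
 *   | xf2  xf6  xf10  xf14 |
 *   | xf3  xf7  xf11  xf15 |
 *
 * fv[4] stands in for FVn and is overwritten with the result. */
void ftrv_model(const float xmtrx[16], float fv[4])
{
    float out[4];

    /* Each output element is one matrix row dotted with the input vector. */
    for (size_t row = 0; row < 4; ++row) {
        out[row] = xmtrx[row]      * fv[0]
                 + xmtrx[row + 4]  * fv[1]
                 + xmtrx[row + 8]  * fv[2]
                 + xmtrx[row + 12] * fv[3];
    }

    /* Write back only after all four sums are done, since the source
     * and destination are the same vector. */
    for (size_t i = 0; i < 4; ++i)
        fv[i] = out[i];
}
```

The model performs 16 multiplies and 12 adds per vector, which is the work the SH4 issues as a single FTRV. Results on real hardware may differ from this model in the low-order bits.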
Relationship to FIPR
Format | Function | Encoding | Group | Issue Cycles | Latency Cycles |
---|---|---|---|---|---|
fipr FVm,FVn | inner_product (FVm, FVn) -> FR[n+3] | 1111nnmm11101101 | FE | 1 | 4/5 |
ftrv XMTRX, FVn | transform_vector(XMTRX, FVn) -> FVn | 1111nn0111111101 | FE | 1 | 5/8 |
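One way to read the table is that FTRV produces the same four results as four separate inner products, one per matrix row, while occupying a single issue slot instead of four. The sketch below models that equivalence in portable C. The names `fipr_model` and `transform_with_four_fipr` and the row-major `rows[16]` layout are assumptions chosen for illustration, not anything defined by the SH4 or a Dreamcast library.

```c
/* Model of "FIPR FVm, FVn": the inner product of the two vectors is
 * written to the last element of FVn (FR[n+3] in the table above). */
void fipr_model(const float fvm[4], float fvn[4])
{
    fvn[3] = fvm[0] * fvn[0] + fvm[1] * fvn[1]
           + fvm[2] * fvn[2] + fvm[3] * fvn[3];
}

/* Transform v using four FIPR-style inner products.
 * rows[16] holds the matrix row by row (hypothetical layout): elements
 * 0..3 are row 0, 4..7 are row 1, and so on. */
void transform_with_four_fipr(const float rows[16], const float v[4],
                              float out[4])
{
    for (int r = 0; r < 4; ++r) {
        /* FIPR overwrites part of its destination vector, so each
         * inner product works on a fresh copy of v. */
        float tmp[4] = { v[0], v[1], v[2], v[3] };
        fipr_model(&rows[r * 4], tmp);
        out[r] = tmp[3];
    }
}
```

Note the copy of `v` made for every row: because FIPR writes its result into the destination vector, the four-FIPR approach also needs extra register moves that a single FTRV avoids.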
When to use FTRV
Real-World Examples
The following are real-world examples of FTRV-based optimizations used by the community in games and applications for the Sega Dreamcast.
Vertex Position Transformation
The first and most obvious use of FTRV is transforming vertex positions in the incoming vertex stream from local space to view space while submitting vertices to the PowerVR during T&L. This is the single most crucial area for leveraging FTRV, and it is the instruction's original intended purpose. If you do nothing else with the instruction, bear in mind that the only way to come remotely close to pushing a considerable volume of polygons on the DC is to harness the SH4 properly by using FTRV to transform your vertices.
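As a rough illustration of the shape of such a transform loop, here is a portable C sketch that pushes an array of local-space positions through one 4x4 matrix. The types `vec3_t` and `vec4_t`, the function `transform_positions`, and the column-by-column `m[16]` layout are all hypothetical names picked for this example. On the SH4, the four sums in the loop body are the part a single FTRV would replace, with the matrix already loaded into XMTRX before the loop starts.

```c
#include <stddef.h>

typedef struct { float x, y, z; }    vec3_t; /* hypothetical local-space position */
typedef struct { float x, y, z, w; } vec4_t; /* hypothetical transformed position */

/* Transform `count` positions by the 4x4 matrix m, stored column by
 * column (translation in m[12], m[13], m[14]). Each input is extended
 * to a 4D vector with w = 1 so the translation column takes effect. */
void transform_positions(const float m[16], const vec3_t *in,
                         vec4_t *out, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        const float x = in[i].x;
        const float y = in[i].y;
        const float z = in[i].z;
        const float w = 1.0f;

        /* On the SH4 these four sums correspond to one FTRV. */
        out[i].x = m[0] * x + m[4] * y + m[8]  * z + m[12] * w;
        out[i].y = m[1] * x + m[5] * y + m[9]  * z + m[13] * w;
        out[i].z = m[2] * x + m[6] * y + m[10] * z + m[14] * w;
        out[i].w = m[3] * x + m[7] * y + m[11] * z + m[15] * w;
    }
}
```

The matrix is loaded once and reused for every vertex, which is why this workload suits FTRV so well: the per-vertex cost is dominated by moving data in and out rather than by the arithmetic.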