Comments (5)
Thanks for the detailed bug report.
I'm not sure I see the error here. Wolfram Alpha is returning [538, 612, 686, 760], which seems to be the exact same result that the same code generates when using M000. The results in result_transposed differ from what Wolfram Alpha returns.
Is this not the expected behaviour?
from rust-psp.
So, to clarify, these two assertions fail:
assert_eq!((result.x, result.y, result.z, result.w), (538.0, 612.0, 686.0, 760.0));
assert_eq!((result_transposed.x, result_transposed.y, result_transposed.z, result_transposed.w), (190.0, 486.0, 782.0, 1078.0));
The output from the first assertion is:
panicked at `assertion failed: `(left == right)`
left: `(190.0, 486.0, 782.0, 1078.0)`,
right: `(538.0, 612.0, 686.0, 760.0)`, src/main.rs:125:5
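For reference, the two candidate results can be reproduced on the host without any VFPU code. This is only a sketch in plain Rust using the matrix and vector values from this thread; the helper names `mat_vec` and `mat_t_vec` are made up for illustration and are not part of rust-psp.

```rust
/// Multiply a 4x4 matrix (stored as rows) by a column vector:
/// out[i] = sum over j of m[i][j] * v[j].
fn mat_vec(m: &[[f32; 4]; 4], v: &[f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        for j in 0..4 {
            out[i] += m[i][j] * v[j];
        }
    }
    out
}

/// Multiply the transpose of the matrix by the same column vector:
/// out[i] = sum over j of m[j][i] * v[j].
fn mat_t_vec(m: &[[f32; 4]; 4], v: &[f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        for j in 0..4 {
            out[i] += m[j][i] * v[j];
        }
    }
    out
}

fn main() {
    let m = [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0],
    ];
    let v = [17.0, 18.0, 19.0, 20.0];

    // prints: M   * v = [190.0, 486.0, 782.0, 1078.0]
    println!("M   * v = {:?}", mat_vec(&m, &v));
    // prints: M^T * v = [538.0, 612.0, 686.0, 760.0]
    println!("M^T * v = {:?}", mat_t_vec(&m, &v));
}
```

So the two vectors in the assertions are the plain product and the product with the transposed matrix; the open question is only which register name selects which one.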
Ah I see. Interestingly, I fed the same assembly through the original GCC toolchain (which was derived from Sony), and there was no difference in the output.
ScePspFMatrix4 matrix = {
    { 1.0, 2.0, 3.0, 4.0 },
    { 5.0, 6.0, 7.0, 8.0 },
    { 9.0, 10.0, 11.0, 12.0 },
    { 13.0, 14.0, 15.0, 16.0 },
};
ScePspFVector4 vector = { 17.0, 18.0, 19.0, 20.0 };
ScePspFVector4 result;
__asm__ volatile (
    "lv.q C000, 0 + %1\n"
    "lv.q C010, 16 + %1\n"
    "lv.q C020, 32 + %1\n"
    "lv.q C030, 48 + %1\n"
    "lv.q C100, %2\n"
    "vtfm4.q C110, M000, C100\n"
    "sv.q C110, %0\n"
    : "=m"(result) : "m"(matrix), "m"(vector) : "memory"
);
As it turns out, it seems that the transposed registers E___ are designed for column-mode operations. So the assembler's output is actually correct; you just have to use E000 because you are doing a column operation. M000 in this case is actually the transposed matrix.
It may seem counter-intuitive because you are using vtfm4.q with M000 * C100, but this is just a notational trick; you could just as easily do M000 * R100 if you stored vector into R100. The assembler doesn't care: the instruction always assumes the vector operand is a column vector, and thus the matrix operand can be transposed either way.
I wasn't aware of this myself. Should probably add more tests...
BTW: your snippet has some undefined behavior (taking a &mut reference to uninitialized data):
let mut result: psp::sys::ScePspFVector4 = unsafe { core::mem::MaybeUninit::uninit().assume_init() };
You can instead do:
let mut result = core::mem::MaybeUninit::uninit(); // No unsafe necessary
Then you can write into the result with result = in(reg) (result.as_mut_ptr()), and finally call .assume_init() after the assembly executes.
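A minimal host-side sketch of that pattern, in case it helps. `Vec4` is a hypothetical stand-in for psp::sys::ScePspFVector4, and `write_result` is a hypothetical stand-in for the inline assembly; the only point is the order of operations: create uninitialized storage, write through the raw pointer, then call assume_init().

```rust
use core::mem::MaybeUninit;

// Hypothetical stand-in for psp::sys::ScePspFVector4.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec4 {
    x: f32,
    y: f32,
    z: f32,
    w: f32,
}

// Hypothetical stand-in for the asm block: it fully initializes `*out`
// through a raw pointer and never reads the old contents.
unsafe fn write_result(out: *mut Vec4) {
    // SAFETY: the caller guarantees `out` is valid and aligned for writes.
    unsafe { out.write(Vec4 { x: 1.0, y: 2.0, z: 3.0, w: 4.0 }) }
}

fn compute() -> Vec4 {
    // No unsafe needed to create the uninitialized storage.
    let mut result = MaybeUninit::<Vec4>::uninit();
    unsafe {
        // Only a raw pointer is handed out, never a `&mut Vec4`.
        write_result(result.as_mut_ptr());
        // SAFETY: `write_result` has initialized every field.
        result.assume_init()
    }
}

fn main() {
    println!("{:?}", compute());
}
```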
Woah that's so confusing! Thanks for taking the time to look into this though!
But doesn't this mean that the documentation at http://hitmen.c02.at/files/yapspd/psp_doc/chap4.html, which says that E000 is the transposed matrix, is wrong? I can't see how this wouldn't apply to the general case.
Thanks for the MaybeUninit thing, I haven't used it before.
Indeed the description is off. It would perhaps be better to say that the E___ registers represent column-major matrices, while M___ registers represent row-major matrices.
It would then make sense that choosing e.g. M000 in vtfm4.q is actually the transpose of the data, because matrix is accessed by E000 as a column-major matrix. (Accessing a column-major matrix in row-major order is a transposition.)
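The parenthetical can be checked on the host with plain Rust. This sketch does not model VFPU registers at all; it only shows that reading the same 16 floats with the row and column indices swapped yields the transposed matrix. The names `row_major_view`, `col_major_view` and `transpose` are made up for illustration.

```rust
type Mat4 = [[f32; 4]; 4];

/// Interpret 16 contiguous floats as a row-major 4x4 matrix.
fn row_major_view(data: &[f32; 16]) -> Mat4 {
    let mut m = [[0.0f32; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            m[r][c] = data[r * 4 + c];
        }
    }
    m
}

/// Interpret the same 16 floats as a column-major 4x4 matrix.
fn col_major_view(data: &[f32; 16]) -> Mat4 {
    let mut m = [[0.0f32; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            m[r][c] = data[c * 4 + r];
        }
    }
    m
}

/// Ordinary matrix transpose.
fn transpose(m: &Mat4) -> Mat4 {
    let mut t = [[0.0f32; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            t[r][c] = m[c][r];
        }
    }
    t
}

fn main() {
    // The values 1.0 through 16.0, as in the matrix from this thread.
    let data: [f32; 16] = core::array::from_fn(|i| (i + 1) as f32);

    // The two interpretations are transposes of each other.
    assert_eq!(col_major_view(&data), transpose(&row_major_view(&data)));

    println!("row-major, row 0:    {:?}", row_major_view(&data)[0]);
    println!("column-major, row 0: {:?}", col_major_view(&data)[0]);
}
```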
Operating under this definition, sv.q / lv.q therefore must work with row-major matrices, which is why you use M___ when loading/storing vectors. The shorthands C___ and R___ are just notation that you use when working with a row-major matrix. R___ is packed tightly, but C___ has a stride. The rule of thumb then should be: do everything in row-major order until you need to access it differently in an instruction like vtfm4.q.
This is why using M___ like normal for loading/storing works in your example, but we have to use E___ when actually manipulating the data, to make sure it is interpreted in column-major format. For what it's worth, the only instructions that this matters for in the assembler are vtfm_ and vhtfm_, because nearly every single other operation isn't sensitive to the matrix format (vmidt, vmzero, vmmov).
The one big exception to this rule would be matrix multiplication. If we want to multiply 2 matrices "normally" with column-major order, according to our definition above we would expect to do vmmul.q E000, E100, E200, but instead we do:
; Column-major multiplication
;
; M000 = M100 * M200
vmmul.q M000, M100, M200
The assembler actually flips the M/E bit to make this happen. I'm not sure why Sony did this. I assume it's because logically applying E___ here would be too confusing for unfamiliar devs writing this common operation, so they just special-cased it. Unfortunately this just made things more complex. Here, the original definition of "transposed matrix" from hitmen.c02.at seems to apply.
EDIT: To make this more strange and subtle, only the M/E bit of the middle register is flipped, but the multiplication order of the two matrices is also flipped while interpreting the instruction. This combination of changes still has the same effect as transposing all registers as described above, so we can pretend that M___ gives us column-major multiplication.
I guess it somewhat means that each instruction has its own definition of "normal" vs "transposed", but if you use it with this mental model I think it should work fine. There may be more exceptions to this rule, but after playing around and examining the opcode tables, this seems to work as expected.
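The reason a swapped operand order and transposes can cancel out is the identity (A * B)^T = B^T * A^T. The sketch below only checks that identity in plain Rust on the host; it makes no claim about how the VFPU actually encodes or executes vmmul.q, and the helper names and the second matrix are made up for illustration.

```rust
type Mat4 = [[f32; 4]; 4];

/// Ordinary matrix transpose.
fn transpose(m: &Mat4) -> Mat4 {
    let mut t = [[0.0f32; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            t[r][c] = m[c][r];
        }
    }
    t
}

/// Ordinary matrix product: out[r][c] = sum over k of a[r][k] * b[k][c].
fn mul(a: &Mat4, b: &Mat4) -> Mat4 {
    let mut out = [[0.0f32; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            for k in 0..4 {
                out[r][c] += a[r][k] * b[k][c];
            }
        }
    }
    out
}

fn main() {
    // Small integer values keep the f32 arithmetic exact.
    let a: Mat4 = [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0],
    ];
    let b: Mat4 = [
        [2.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 3.0],
        [1.0, 0.0, 0.0, 1.0],
        [0.0, 2.0, 1.0, 0.0],
    ];

    // Transposing the product equals multiplying the transposes in swapped order.
    assert_eq!(transpose(&mul(&a, &b)), mul(&transpose(&b), &transpose(&a)));
    println!("(A * B)^T == B^T * A^T holds for this pair");
}
```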