FLD instruction x64 bit
I have a small problem with the FLD instruction in x64.
I want to load a Double value onto the FPU stack, into the st0 register, but it seems to be impossible.
In 32-bit Delphi, I can use this code:
function DoSomething(X:Double):Double;
asm
FLD X
// Do Something ..
FST Result
end;
Unfortunately, in x64, the same code does not work.
Did you read about Win64 compatibility in the Delphi help? It says there is no 10-byte Extended type in Win64, which shows that Delphi Win64 does not use the x87 FPU; it uses SSE instead. So using FPU instructions is problematic. Also be careful when using BASM x64: there are bugs that destroy data or even invert program control flow.
– Arioch 'The
Apr 3 '13 at 12:55
in x86_64 the FPU shouldn't be used unless you need extended precision. SSE is faster and more consistent in its results
– phuclv
Sep 4 at 6:11
3 Answers
In x64 mode floating point parameters are passed in xmm-registers. So when Delphi tries to compile FLD X, it becomes FLD xmm0 but there is no such instruction. You first need to move it to memory.
The same goes with the result, it should be passed back in xmm0.
Try this (not tested):
function DoSomething(X:Double):Double;
var
Temp : double;
asm
MOVQ qword ptr Temp,X
FLD Temp
//do something
FSTP Temp // store and pop, so the x87 stack is left empty on return
MOVQ xmm0,qword ptr Temp
end;
> So when Delphi tries to compile FLD X, it becomes FLD XMM0 ... What about FLD Result? Why does the compiler accept loading Result? Is this a bug?
– SMP3
Apr 3 '13 at 13:41
@SMP3: It turns out that when you do "FST Result", BASM allocates temporary storage on the stack for the result and then adds an extra instruction at the end to load xmm0 with this value. I did not know that. See for yourself in the disassembly view in the debugger.
– Ville Krumlinde
Apr 3 '13 at 16:08
Is this a bug? No. On x64 use SSE and not x87. But you should stop doing asm and let the compiler do the work.
– David Heffernan
Apr 3 '13 at 17:13
Delphi follows the Microsoft x64 calling convention.
So if the arguments of a function/procedure are float/double, they are passed in the XMM0L, XMM1L, XMM2L, and XMM3L registers.
But you can use var before the parameter as a workaround, like:
function DoSomething(var X:Double):Double;
asm
FLD qword ptr [X]
// Do Something ..
FST Result
end;
Nice workaround. Limitation though that you cannot pass constant literals such as DoSomething(1.0) or variables declared as Single.
– Ville Krumlinde
Apr 3 '13 at 16:06
@Ville Krumlinde: Indeed, if you need to call the function with a constant parameter, then first declare the constant in the const section. :)
– GJ.
Apr 3 '13 at 16:27
You don't need to use legacy x87 stack registers in x86-64 code, because SSE2 is baseline, a required part of the x86-64 ISA. You can and should do your scalar FP math using addsd, mulsd, sqrtsd and so on, on XMM registers. (Or addss for float.)
The Windows x64 calling convention passes float/double FP args in XMM0..3, if they're one of the first four args to the function. (i.e. the 3rd total arg goes in xmm2 if it's FP, rather than the 3rd FP arg going in xmm2.) It returns FP values in XMM0.
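To illustrate the positional rule, here is a minimal sketch, assuming the Win64 target and Delphi's x64 inline assembler. The program and function names (ArgSlots, SumLastTwo) are hypothetical. The integer argument takes the first slot, so the two Double arguments arrive in XMM1 and XMM2 rather than XMM0 and XMM1:

```delphi
program ArgSlots;
{$APPTYPE CONSOLE}

// Hypothetical example; assumes the Win64 target.
// Slot 1 holds the integer A (in ECX), so B arrives in XMM1 and C in XMM2.
function SumLastTwo(A: Integer; B, C: Double): Double;
asm
  movapd xmm0, xmm1   // copy B into the return register
  addsd  xmm0, xmm2   // add C; the Double result is returned in XMM0
end;

begin
  Writeln(SumLastTwo(7, 1.5, 2.25):0:2);
end.
```

The value of A is never read here; it only occupies the first argument slot, which is what shifts B and C to XMM1 and XMM2.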
Only use x87 if you actually need 80-bit precision inside your function. (Instructions like fsin and fyl2x are not fast, and can usually be done just as well by normal math libraries using SSE/SSE2 instructions.)
function times2(X:Double):Double;
asm
addsd xmm0, xmm0 // upper 8 bytes of XMM0 are ignored
ret
end;
Storing to memory and reloading into an x87 register costs you about 10 cycles of latency for no benefit. SSE/SSE2 scalar instructions are just as fast as, or faster than, their x87 equivalents, and easier to program for and optimize because you never need fxch; it's a flat register design instead of stack-based. (https://agner.org/optimize/). Also, you have 16 XMM registers.
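As a rough sketch of what the flat register design looks like in practice, assuming the Win64 target and Delphi's x64 inline assembler, a two-operand computation can work directly on the argument registers with no memory round trip and no fxch. The names FlatRegs and Length2D are hypothetical:

```delphi
program FlatRegs;
{$APPTYPE CONSOLE}

// Hypothetical helper; assumes Win64, where X arrives in XMM0 and Y in XMM1.
function Length2D(X, Y: Double): Double;
asm
  mulsd  xmm0, xmm0   // X*X
  mulsd  xmm1, xmm1   // Y*Y (XMM1 is volatile, so it may be overwritten)
  addsd  xmm0, xmm1   // X*X + Y*Y
  sqrtsd xmm0, xmm0   // square root, left in XMM0 as the return value
end;

begin
  Writeln(Length2D(3.0, 4.0):0:2);
end.
```

Each instruction names its operands explicitly, so there is no stack order to keep track of as there would be with x87.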
Of course, you usually don't need inline asm at all. It could be useful for manually-vectorizing if the compiler doesn't do that for you.
Define "does not work". Does it crash? Does it not compile? Does it not return the expected result?
– Michael
Apr 3 '13 at 11:53