subleq | CPU design and toolchain for a simple computer architecture
kandi X-RAY | subleq Summary
This terminal demonstrates the JavaScript interpreter running the provided demo program.
Top functions reviewed by kandi - BETA
- Recursively cleans the input string .
- End of jquery .
- Return a new Date .
- get pixel value
- drawing order of node
- sort of Sizzle
- Returns true if it is in the browser .
- inner mode boolean
- ZElement load function
- The top - level function of the KDF2 node .
Community Discussions
Trending Discussions on subleq
QUESTION
I was engaged with an expert who allegedly has coding skills vastly superior to mine and who understands inline assembly far better than I ever could.
One of the claims is that as long as an operand appears as an input constraint, you don't need to list it as a clobber or specify that the register has been potentially modified by the inline assembly. The conversation came about when someone else was trying to get assistance on a memset implementation that was effectively coded this way:
ANSWER
Answered 2019-Nov-11 at 16:15
You are correct on all counts; this code is full of lies to the compiler that could bite you, e.g. with different surrounding code, or different compiler versions / options (especially link-time optimization to enable cross-file inlining).
swap_vbufs doesn't even look very efficient; I suspect gcc would do equal or better with a pure C version (see https://gcc.gnu.org/wiki/DontUseInlineAsm). stosd is 3 uops on Intel, worse than a regular mov-store + add rdi,4. And making add rdi,4 unconditional would avoid the need for that else block, which puts an extra jmp on the (hopefully) fast path where there's no MMIO store to video RAM because the buffers were equal.
(lodsd is only 2 uops on Haswell and newer, so that's OK if you don't care about IvyBridge or older.)
In kernel code I guess they're avoiding SSE2, even though it's baseline for x86-64; otherwise you'd probably want to use that. For a normal memory destination, you'd just memcpy with rep movsd or ERMSB rep movsb, but I guess the point here is to avoid MMIO stores when possible by checking against a cached copy of video RAM. Still, unconditional streaming stores with movnti might be efficient, unless video RAM is mapped UC (uncacheable) instead of WC.
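The "pure C version" idea can be sketched as follows. The original swap_vbufs code is not shown here, so the function name swap_vbufs_c, its parameters, and the returned store count are all assumptions made for illustration; this is not the asker's implementation. It only shows the technique described above: compare each pixel against a cached copy and skip the store to video RAM when nothing changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pure-C counterpart of the swap_vbufs idea (names assumed).
 * vram  : destination, e.g. memory-mapped video RAM (volatile so the
 *         compiler keeps every store that we do issue)
 * cache : ordinary-RAM copy of what vram currently holds
 * back  : freshly rendered back buffer
 * Returns how many 32-bit words were actually written to vram. */
size_t swap_vbufs_c(volatile uint32_t *vram, uint32_t *cache,
                    const uint32_t *back, size_t n)
{
    assert(vram != NULL && cache != NULL && back != NULL);
    size_t stores = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t px = back[i];
        if (px != cache[i]) {   /* changed: update cache and video RAM */
            cache[i] = px;
            vram[i] = px;
            stores++;
        }
        /* unchanged: no MMIO store; the index advances either way */
    }
    return stores;
}
```

Because the index increment is unconditional, there is no else branch on the path where the buffers already match, and the compiler is free to pick the load, compare, and store instructions itself.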
It's easy to construct examples where this really does break in practice, by e.g. using the relevant C variable again after the inline asm statement in the same function. (Or in a parent function which inlined the asm).
An input you want to destroy usually has to be handled with a matching dummy output or an RMW output with a C tmp var, not just "r" or "a".
"r" or any specific-register constraint like "D" means this is a read-only input, and the compiler can expect to find the value undisturbed afterwards. There is no "input I want to destroy" constraint; you have to synthesize that with a dummy output or variable.
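As a minimal sketch of the read-write approach, assuming GCC or clang on x86-64 with the default AT&T syntax: the helper name fill_dwords is hypothetical and is not the memset from the question. The function's own parameter copies serve as the temporaries, so the caller's variables are never touched, and a plain C loop is used on other targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store `val` into `count` consecutive 32-bit words
 * starting at `dst`. */
void fill_dwords(uint32_t *dst, uint32_t val, size_t count)
{
    assert(dst != NULL || count == 0);
#if defined(__GNUC__) && defined(__x86_64__) && !defined(__ILP32__)
    /* rep stosl advances RDI and counts RCX down to zero, so both are
     * declared as read-write ("+") operands bound to the local copies.
     * EAX is only read, so "a"(val) is a genuine read-only input.
     * The "memory" clobber tells the compiler that memory is written. */
    __asm__ volatile("rep stosl"
                     : "+D"(dst), "+c"(count)
                     : "a"(val)
                     : "memory");
#else
    /* Portable fallback with the same effect. */
    while (count--)
        *dst++ = val;
#endif
}
```

An equivalent spelling uses a dummy output tied to the input with a matching constraint, e.g. "=D"(dummy) as an output and "0"(dst) as the input; either way the compiler knows the register no longer holds the original value.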
This all applies to other compilers (clang and ICC) that support GNU C inline asm syntax.
From the GCC manual: Extended asm
Input Operands:
Do not modify the contents of input-only operands (except for inputs tied to outputs). The compiler assumes that on exit from the asm statement these operands contain the same values as they had before executing the statement. It is not possible to use clobbers to inform the compiler that the values in these inputs are changing.
(An rax clobber makes it an error to use "a" as an input; clobbers and operands can't overlap.)
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported