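Inline Assembly with asm
Last updated: 10 months ago
Preface
The inline assembly described here is used mainly for operator (compute kernel) development, so its coverage may be fairly narrow. If something is missing, the author has most likely not needed it yet.
Some definitions are explained by quoting the original text of GCC Extended-ASM[1], because a translation tends to distort the meaning and makes it harder to understand.
Basic format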
    asm asm-qualifiers ( AssemblerTemplate
                     : OutputOperands
                     [ : InputOperands
                     [ : Clobbers ] ])
The leading asm is a keyword. Prefer **__asm__** over asm, for the following reason:
The asm keyword is a GNU extension. When writing code that can be compiled with -ansi and the various -std options, use __asm__ instead of asm (see Alternate Keywords).
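A minimal sketch of why the spelling matters, assuming GCC or Clang: the file below is meant to build with `-std=c99`, a mode in which plain `asm` is not available as a keyword. The function names `compiler_fence` and `bump` are made up for illustration, and the empty template emits no instruction.

```c
/* Intended to build with e.g. `gcc -std=c99 -c example.c`.
   Spelling the keyword as plain `asm` would fail in that mode. */

/* Hypothetical helper: an empty asm statement with a "memory" clobber. */
static void compiler_fence(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Hypothetical caller: the store to *counter is finished before the fence. */
int bump(int *counter)
{
    *counter += 1;
    compiler_fence();
    return *counter;
}
```

The same applies to the qualifier: `__volatile__` is the alternate spelling of `volatile` and can be used in the same way.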
**asm qualifiers**: there are three.
- volatile: tells GCC to disable certain optimizations for this asm statement.
  The typical use of extended asm statements is to manipulate input values to produce output values. However, your asm statements may also produce side effects. If so, you may need to use the volatile qualifier to disable certain optimizations. See Volatile.
- inline:
- goto:
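As a rough illustration of the volatile qualifier, the sketch below passes a value through an empty asm statement. According to the GCC documentation, the optimizers may discard an asm statement whose outputs are not needed, or move it out of a loop if they believe it always produces the same result; `volatile` disables those optimizations. The helper names `launder` and `sum_first` are hypothetical.

```c
/* Hypothetical helper: passes `v` through an empty asm statement.
   The template is empty, so no instruction is emitted, but the
   compiler must treat `v` as modified by the statement. */
static inline long launder(long v)
{
    /* `volatile`: the statement is kept even if a caller ignores the
       result, and it is not hoisted out of the loop below. */
    __asm__ __volatile__("" : "+r"(v));
    return v;
}

/* Sums 0 + 1 + ... + (n - 1), laundering every term. */
long sum_first(long n)
{
    long total = 0;
    for (long i = 0; i < n; ++i)
        total += launder(i);
    return total;
}
```

Because each term goes through the asm statement, the compiler cannot see through the loop and has to execute it iteration by iteration.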
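AssemblerTemplate holds the assembly code; wrap each assembly instruction in double quotes ("").
OutputOperands: this list begins with a colon (:) and holds the output operands, separated by commas (,). An output operand is at least written (overwritten) by the asm. Each operand has the following format: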
    [ [asmSymbolicName] ] constraint (cvariablename)
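[asmSymbolicName] is the symbolic name used inside the assembly for the C variable cvariablename. Giving it the same name as the C variable usually makes the assembly easier to write. Inside AssemblerTemplate, refer to it as **%[asmSymbolicName]**.
If you prefer not to use symbolic names, operands are numbered by position, output operands first and then input operands: %0, %1, …
(cvariablename) is where the name of the C variable goes.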
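constraint is a string constant that specifies the constraints on the operand. For an output operand it must begin with = (the variable is overwritten, i.e. write-only) or + (the variable is both read and written).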
Output constraints must begin with either ‘=’ (a variable overwriting an existing value) or ‘+’ (when reading and writing). When using ‘=’, do not assume the location contains the existing value on entry to the asm, except when the operand is tied to an input; see Input Operands.
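After that prefix there must be one or more additional constraints describing where the value resides, such as r for a register and m for memory. When several possible locations are listed, the compiler picks the most efficient one for the current context.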
After the prefix, there must be one or more additional constraints (see Constraints for asm Operands) that describe where the value resides. Common constraints include ‘r’ for register and ‘m’ for memory. When you list more than one possible location (for example, "=rm"), the compiler chooses the most efficient one based on the current context. If you list as many alternates as the asm statement allows, you permit the optimizers to produce the best possible code. If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, local register variables may provide a solution (see Specifying Registers for Local Variables).
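The difference between = and + can be sketched with an in-place increment. The asm reads the old value of `x` and writes the new one, so the operand needs "+r"; with "=r" the compiler would not be required to place the old value in the register first. The function name `increment` is hypothetical, the x86-64 branch is included only so the sketch also runs on a typical desktop, and other targets fall back to plain C.

```c
/* Hypothetical example: increment a value in place. */
long increment(long x)
{
#if defined(__aarch64__)
    __asm__("add %0, %0, #1" : "+r"(x));
#elif defined(__riscv)
    __asm__("addi %0, %0, 1" : "+r"(x));
#elif defined(__x86_64__) && defined(__LP64__)
    /* addq also updates the flags register, hence the "cc" clobber. */
    __asm__("addq $1, %0" : "+r"(x) : : "cc");
#else
    x += 1;  /* portable fallback for other targets */
#endif
    return x;
}
```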
InputOperands holds the input operands, separated by commas (,). They are used only as inputs, i.e. they are read-only. The format is:
    [ [asmSymbolicName] ] constraint (cexpression)
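This is essentially the same as OutputOperands, except that the constraint string has no = or + prefix; it only indicates where the value resides, for example r or m. Several possible locations may again be listed so that the compiler picks the most efficient one for the context.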
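To make the two ways of naming operands concrete, here is a small sketch that adds two values, once with positional operands (%0 is the output, %1 and %2 are the inputs) and once with symbolic names. The macro and function names are invented for this example. AArch64 and RISC-V happen to share the three-operand `add` syntax; the x86-64 variant uses `leaq` and is there only so the code runs on a typical desktop.

```c
#if defined(__aarch64__) || defined(__riscv)
#  define ADD_BY_POSITION "add %0, %1, %2"
#  define ADD_BY_NAME     "add %[sum], %[a], %[b]"
#elif defined(__x86_64__) && defined(__LP64__)
#  define ADD_BY_POSITION "leaq (%1,%2), %0"
#  define ADD_BY_NAME     "leaq (%[a],%[b]), %[sum]"
#endif

/* Operands referenced by position: outputs first, then inputs. */
long add_by_position(long a, long b)
{
    long sum;
#ifdef ADD_BY_POSITION
    __asm__(ADD_BY_POSITION : "=r"(sum) : "r"(a), "r"(b));
#else
    sum = a + b;  /* portable fallback for other targets */
#endif
    return sum;
}

/* The same operation with symbolic names matching the C variables. */
long add_by_name(long a, long b)
{
    long sum;
#ifdef ADD_BY_NAME
    __asm__(ADD_BY_NAME : [sum] "=r"(sum) : [a] "r"(a), [b] "r"(b));
#else
    sum = a + b;  /* portable fallback for other targets */
#endif
    return sum;
}
```

The symbolic form stays correct if an operand is later added or reordered, whereas the positional form would need its numbers updated.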
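Clobbers: each item in this list is either a register name or one of the special clobbers. Each item is a string constant enclosed in double quotes (""), and items are separated by commas (,).
The compiler knows that the output operands are modified, but the inline code may also use additional registers and overwrite their values along the way. The compiler must be told about those registers.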
While the compiler is aware of changes to entries listed in the output operands, the inline asm code may modify more than just the outputs. For example, calculations may require additional registers, or the processor may overwrite a register as a side effect of a particular assembler instruction. In order to inform the compiler of these changes, list them in the clobber list. Clobber list items are either register names or the special clobbers (listed below). Each clobber list item is a string constant enclosed in double quotes and separated by commas.
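In addition, the constraints on the input and output operands may require their values to live in registers. When the compiler chooses registers for them, it does not use any register named in the clobber list, so clobbered registers are free for any use in the assembly code.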
When the compiler selects which registers to use to represent input and output operands, it does not use any of the clobbered registers. As a result, clobbered registers are available for any use in the assembler code.
The stack pointer register must not be listed in the clobber list!
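A sketch of a register clobber, under the assumption that a fixed scratch register is wanted: the asm computes 2 * a + b through a temporary register that is not an operand, so that register is named in the clobber list. The function name `twice_plus` and the choice of scratch registers (x9, t0, r11) are illustrative only.

```c
/* Hypothetical example: returns 2 * a + b, using one scratch register
   that is declared in the clobber list. */
long twice_plus(long a, long b)
{
    long out;
#if defined(__aarch64__)
    __asm__("lsl x9, %[a], #1\n\t"       /* x9 = a << 1 */
            "add %[out], x9, %[b]"       /* out = x9 + b */
            : [out] "=r"(out)
            : [a] "r"(a), [b] "r"(b)
            : "x9");
#elif defined(__riscv)
    __asm__("slli t0, %[a], 1\n\t"       /* t0 = a << 1 */
            "add %[out], t0, %[b]"       /* out = t0 + b */
            : [out] "=r"(out)
            : [a] "r"(a), [b] "r"(b)
            : "t0");
#elif defined(__x86_64__) && defined(__LP64__)
    __asm__("movq %[a], %%r11\n\t"       /* r11 = a */
            "addq %[a], %%r11\n\t"       /* r11 = 2 * a */
            "addq %[b], %%r11\n\t"       /* r11 = 2 * a + b */
            "movq %%r11, %[out]"         /* out is written last */
            : [out] "=r"(out)
            : [a] "r"(a), [b] "r"(b)
            : "r11", "cc");
#else
    out = 2 * a + b;  /* portable fallback for other targets */
#endif
    return out;
}
```

Since the scratch register is listed as clobbered, the compiler will not place `a`, `b`, or `out` in it. In each branch the output is written only by the last instruction, after all inputs have been read.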
Special clobbers:
- cc: indicates that the assembly code modifies the flags register. On some machines GCC represents the condition codes as a specific hardware register, and "cc" names that register.
  The "cc" clobber indicates that the assembler code modifies the flags register. On some machines, GCC represents the condition codes as a specific hardware register; "cc" serves to name this register. On other machines, condition code handling is different, and specifying "cc" has no effect. But it is valid no matter what the target.
- memory: tells the compiler that the assembly code reads or writes memory other than the items listed in the input and output operands (for example, memory pointed to by one of the input parameters). To make sure memory holds correct values, GCC may need to flush specific register values to memory before executing the asm. The compiler also does not assume that values read from memory before the asm remain unchanged after it, and reloads them as needed. Using the memory clobber effectively forms a read/write memory barrier for the compiler.
  The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the "memory" clobber effectively forms a read/write memory barrier for the compiler.
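The two special clobbers can be sketched together with an increment through a pointer. Only the pointer is an operand; the integer it points to is not, so the "memory" clobber is what tells the compiler that `*p` changes. The function name `increment_through` is hypothetical, and on AArch64 the flag-setting `adds` is chosen deliberately so that "cc" has something to describe.

```c
/* Hypothetical example: adds 1 to the int that `p` points to. */
void increment_through(int *p)
{
#if defined(__aarch64__)
    __asm__ __volatile__(
        "ldr w9, [%[p]]\n\t"
        "adds w9, w9, #1\n\t"    /* adds (unlike add) sets the flags */
        "str w9, [%[p]]"
        :
        : [p] "r"(p)
        : "x9", "cc", "memory");
#elif defined(__riscv)
    __asm__ __volatile__(
        "lw t0, 0(%[p])\n\t"
        "addi t0, t0, 1\n\t"
        "sw t0, 0(%[p])"
        :
        : [p] "r"(p)
        : "t0", "memory");       /* RISC-V has no flags register */
#elif defined(__x86_64__)
    __asm__ __volatile__(
        "addl $1, (%[p])"        /* read-modify-write in memory, sets flags */
        :
        : [p] "r"(p)
        : "cc", "memory");
#else
    *p += 1;  /* portable fallback for other targets */
#endif
}
```

Without "memory", the compiler could keep an earlier value of `*p` cached in a register and miss the update.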
Two examples
RISC-V example (a system call implemented with inline assembly):
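The original listing is not available here, so the following is only a sketch of how such a system call might look, assuming Linux on RISC-V: the system call number goes in a7, the arguments in a0 to a2, `ecall` enters the kernel, and the result comes back in a0. The registers are pinned with local register variables, as mentioned in the constraint discussion above. The wrapper name `sys_write` is hypothetical, and on other targets it simply forwards to `write()` so the example still runs.

```c
#include <stddef.h>
#include <unistd.h>            /* write(), used by the fallback branch */
#if defined(__riscv) && defined(__linux__)
#include <sys/syscall.h>       /* SYS_write */
#endif

/* Hypothetical wrapper around the write system call. */
long sys_write(int fd, const void *buf, size_t len)
{
#if defined(__riscv) && defined(__linux__)
    register long a0 __asm__("a0") = fd;          /* arg 0, also the result */
    register long a1 __asm__("a1") = (long)buf;   /* arg 1 */
    register long a2 __asm__("a2") = (long)len;   /* arg 2 */
    register long a7 __asm__("a7") = SYS_write;   /* system call number */
    __asm__ __volatile__(
        "ecall"
        : "+r"(a0)                       /* read as fd, written as result */
        : "r"(a1), "r"(a2), "r"(a7)
        : "memory");                     /* the kernel reads from buf */
    return a0;
#else
    return (long)write(fd, buf, len);    /* fallback for other targets */
#endif
}
```

Note that the raw `ecall` path reports failures as a negative error number in a0, while the `write()` fallback returns -1 and sets `errno`.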
aarch64 example (weighted sum):
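The original listing is not available here either. As a stand-in, this sketch accumulates a weighted sum with one scalar `fmadd` per element; it is not the author's kernel, and a real operator would more likely use NEON vector instructions. The function name `weighted_sum` is hypothetical. The "w" constraint requests a floating-point/SIMD register, and the `%s` modifier prints its 32-bit scalar name.

```c
#include <stddef.h>

/* Hypothetical helper: returns the sum of x[i] * w[i] for i in [0, n). */
float weighted_sum(const float *x, const float *w, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
#if defined(__aarch64__)
        /* fmadd Sd, Sn, Sm, Sa computes Sd = Sa + Sn * Sm.
           acc is both read and written, hence "+w". */
        __asm__("fmadd %s[acc], %s[xi], %s[wi], %s[acc]"
                : [acc] "+w"(acc)
                : [xi] "w"(x[i]), [wi] "w"(w[i]));
#else
        acc += x[i] * w[i];  /* portable fallback for other targets */
#endif
    }
    return acc;
}
```

For example, values {1, 2, 3, 4} with all weights equal to 0.5 give 5.0.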
References
[1] GCC Extended-ASM
Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting!