ARMv8 Virtualization from Getting Started to Giving Up (1): Minos Hypervisor Boot Flow

ARMv8-A security model when EL3 is using AArch64


Video: https://v.qq.com/x/page/r0948ifd8ec.html


ARMv8-A security model when EL3 is using AArch32


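  • Minos uses the AArch64 security model (EL3 runs in AArch64).
  • The secure monitor loads u-boot and starts it at EL2.
  • Running at EL2, u-boot loads minos.bin and minos.dtb from disk to the specified memory addresses.
  • Finally, u-boot jumps to Minos with the booti command and hands control over to it.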
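
Since the flow above assumes Minos is entered at EL2, a hypervisor can sanity-check this early by reading CurrentEL, whose bits [3:2] hold the exception level. The following is a minimal sketch of the decoding only, written as plain C so it can be tried on any host. The helper names are hypothetical and not taken from the Minos source; on real hardware the raw value would come from `mrs x0, CurrentEL`.

```c
#include <stdbool.h>
#include <stdint.h>

/* CurrentEL encodes the exception level in bits [3:2]; the other bits read as zero. */
static inline unsigned int current_el_from_raw(uint64_t raw)
{
    return (unsigned int)((raw >> 2) & 0x3);
}

/* Hypothetical early check: was the image entered at EL2? */
static inline bool entered_at_el2(uint64_t raw)
{
    return current_el_from_raw(raw) == 2;
}
```

A raw value of 0x8 decodes to EL2, 0x4 to EL1.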

References

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/architecting-more-secure-world-with-isolation-and-virtualization

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

ARM Cortex-A Series Programmer’s Guide for ARMv8-A

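Minos Build and Entry Address Configuration

1. Program entry address configuration

CONFIG_MINOS_START_ADDRESS - the start address of the memory used by the system

CONFIG_MINOS_ENTRY_ADDRESS - the entry address of the program

CONFIG_MINOS_ENTRY_ADDRESS = CONFIG_MINOS_START_ADDRESS + NR_CPUS * 8192 (the default stack size is 8K; this area holds the idle task stacks)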
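
The formula can be checked with a few lines of C. This is only an illustration of the arithmetic; the function name and the example start address are assumptions, not values from the Minos configuration.

```c
#include <stdint.h>

#define IDLE_STACK_SIZE 8192u   /* 8K default stack, as in the formula above */

/* entry = start + nr_cpus * 8192: the idle task stacks sit below the entry point. */
static inline uint64_t minos_entry_address(uint64_t start, unsigned int nr_cpus)
{
    return start + (uint64_t)nr_cpus * IDLE_STACK_SIZE;
}
```

With a hypothetical start address of 0x40000000 and 4 CPUs, the entry address comes out as 0x40008000.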

2. Linker script (LDS file)

arch/aarch64/lds/minos.lds.S

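3. Minos image header

To boot with u-boot's stock booti command, without modifying u-boot, the image needs an image header. The first 8 bytes of the image hold two instructions, and the header fields start at byte offset 8 from the start address.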

uboot 0x8000 - 0x9000

_start:
        /* entered with mmu/dcache/icache off; mask IRQs first */
        msr     daifset, #2
        b       do_start
        .quad   CONFIG_MINOS_ENTRY_ADDRESS           /* Image load offset from start of RAM */
        .quad   __code_end - _start                  /* effective image size */
        .quad   0                                    /* flags */
        .quad   0                                    /* reserved */
        .quad   0                                    /* reserved */
        .quad   0                                    /* reserved */
        .byte   0x41                                 /* Magic number, "ARM\x64" */
        .byte   0x52
        .byte   0x4d
        .byte   0x64
        .long   0x0                                  /* reserved */
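
The fields emitted by the assembly above line up with the 64-byte arm64 Image header used by Linux. As a rough sketch, the same layout can be written as a C struct, together with the kind of magic check a loader such as booti performs. The struct and function names here are illustrative and are not copied from u-boot or Minos.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative view of the 64-byte arm64 Image header. */
struct arm64_image_header {
    uint32_t code0;        /* first instruction  (above: msr daifset, #2) */
    uint32_t code1;        /* second instruction (above: b do_start) */
    uint64_t text_offset;  /* image load offset */
    uint64_t image_size;   /* effective image size */
    uint64_t flags;
    uint64_t res2;
    uint64_t res3;
    uint64_t res4;
    uint32_t magic;        /* bytes 'A', 'R', 'M', 0x64 */
    uint32_t res5;
};

_Static_assert(sizeof(struct arm64_image_header) == 64, "header must be 64 bytes");

/* Returns true when the buffer carries the "ARM\x64" magic at its expected offset. */
static inline bool image_header_magic_ok(const void *image)
{
    static const uint8_t magic[4] = { 0x41, 0x52, 0x4d, 0x64 };
    const uint8_t *p = (const uint8_t *)image;

    return memcmp(p + offsetof(struct arm64_image_header, magic),
                  magic, sizeof(magic)) == 0;
}
```

The magic lands at byte offset 56: two 4-byte instructions followed by six 8-byte fields.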

4. Build flow

minos: $(minos-deps) scripts/generate_allsymbols.py
        $(Q) echo "  LD      .tmp.minos.elf"
        $(Q) $(LD) $(minos_LDFLAGS) -o .tmp.minos.elf $(MBUILD_MINOS_INIT) $(MBUILD_MINOS_MAIN) $(MBUILD_MINOS_LIBS)
        $(Q) echo "  NM      .tmp.minos.symbols"
        $(Q) $(NM) -n .tmp.minos.elf > .tmp.minos.symbols
        $(Q) echo "  PYTHON  allsymbols.S"
        $(Q) python3 scripts/generate_allsymbols.py .tmp.minos.symbols allsymbols.S
        $(Q) echo "  CC      $(MBUILD_IMAGE_SYMBOLS)"
        $(Q) $(CC) $(CCFLAG) $(MBUILD_CFLAGS) -c allsymbols.S -o $(MBUILD_IMAGE_SYMBOLS)
        $(Q) echo "  LD      $(MBUILD_IMAGE_ELF)"
        $(Q) $(LD) $(minos_LDFLAGS) -o $(MBUILD_IMAGE_ELF) $(MBUILD_MINOS_INIT) $(MBUILD_MINOS_MAIN) $(MBUILD_MINOS_LIBS) $(MBUILD_IMAGE_SYMBOLS)
        $(Q) echo "  OBJCOPY $(MBUILD_IMAGE)"
        $(Q) $(OBJCOPY) -O binary $(MBUILD_IMAGE_ELF) $(MBUILD_IMAGE)
        $(Q) echo "  OBJDUMP minos.s"
        $(Q) $(OBJDUMP) $(MBUILD_IMAGE_ELF) -D > minos.s

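Register Overview

With virtualization enabled, Non-secure EL1 accesses to some system registers return the value of the corresponding virtual register.

MIDR_EL1 and VPIDR_EL2

Provide identification information for the PE (processing element), including the implementer code and the device ID number. A Non-secure EL1 read of MIDR_EL1 returns the value held in VPIDR_EL2.

MPIDR_EL1 and VMPIDR_EL2

In a multiprocessor system, provide an additional PE identification mechanism for scheduling purposes. This register is commonly used to obtain or determine the cpuid of the PE within the whole system; how affinity values map to cpuids depends on the number of PEs in each cluster of the actual hardware. A Non-secure EL1 read of MPIDR_EL1 returns the value held in VMPIDR_EL2.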
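
As an illustration of the affinity-to-cpuid mapping, here is a minimal sketch. It assumes that Aff0 is the core index within a cluster, that Aff1 is the cluster index, and that every cluster has the same number of cores; none of this is guaranteed by the architecture, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Affinity fields of MPIDR_EL1: Aff0 in bits [7:0], Aff1 in bits [15:8]. */
#define MPIDR_AFF0(mpidr) ((unsigned int)((mpidr) & 0xff))
#define MPIDR_AFF1(mpidr) ((unsigned int)(((mpidr) >> 8) & 0xff))

/* Assumption: linear cpuid = cluster index * cores per cluster + core index. */
static inline unsigned int mpidr_to_cpuid(uint64_t mpidr, unsigned int cores_per_cluster)
{
    return MPIDR_AFF1(mpidr) * cores_per_cluster + MPIDR_AFF0(mpidr);
}
```

For example, with 4 cores per cluster, an MPIDR value of 0x80000101 (cluster 1, core 1) maps to cpuid 5.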

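TPIDR_EL2, TPIDR_EL1, TPIDR_EL3

Provide a register in which software running at ELn can store thread identification information for OS management purposes. The contents are software-defined, so the register can hold whatever system-related information the software needs.

VTTBR_EL2

Holds the base address of the VM's stage 2 translation table, which is used for the stage 2 translation of memory accesses from Non-secure EL0 and EL1.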
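
Besides the table base address (bits [47:1]), VTTBR_EL2 also carries the VMID of the current guest in its upper bits. A minimal sketch of how such a value could be assembled, assuming an 8-bit VMID; the macro and function names are hypothetical, not from the Minos source.

```c
#include <stdint.h>

#define VTTBR_VMID_SHIFT  48
#define VTTBR_BADDR_MASK  0x0000fffffffffffeULL   /* BADDR field, bits [47:1] */

/* Combine the stage 2 table base address and the VMID into one register value. */
static inline uint64_t make_vttbr(uint64_t pgd_phys, uint8_t vmid)
{
    return ((uint64_t)vmid << VTTBR_VMID_SHIFT) | (pgd_phys & VTTBR_BADDR_MASK);
}
```

On a VM switch the hypervisor would write a value like this to VTTBR_EL2 so that translations and TLB entries belong to the incoming guest.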

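VBAR_EL2, VBAR_EL1, VBAR_EL3

Hold the base address of the exception vector table for exceptions taken to the corresponding exception level.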
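
The AArch64 vector table has 16 entries of 0x80 bytes each: four groups by exception source, each with four exception types. A small sketch of the address calculation follows; the enum and function names are made up for illustration.

```c
#include <stdint.h>

/* Where the exception came from: each group starts 0x200 after the previous one. */
enum vec_source { VEC_CUR_SP0, VEC_CUR_SPX, VEC_LOWER_A64, VEC_LOWER_A32 };

/* What kind of exception it is: each entry is 0x80 bytes. */
enum vec_type { VEC_SYNC, VEC_IRQ, VEC_FIQ, VEC_SERROR };

static inline uint64_t vector_entry_address(uint64_t vbar, enum vec_source src,
                                            enum vec_type type)
{
    return vbar + (uint64_t)src * 0x200 + (uint64_t)type * 0x80;
}
```

For a hypervisor the interesting entries are usually the lower-EL ones: a synchronous trap from an AArch64 guest enters at VBAR_EL2 + 0x400, and an IRQ taken while the guest runs enters at VBAR_EL2 + 0x480.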


SCTLR_ELn

Provides top-level control of the system, including its memory system, at ELn (SCTLR_EL2 for EL2).

See section D4.2.8, "The effects of disabling a stage of address translation", in the ARM Architecture Reference Manual.
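
The bits that matter most during early boot are M (MMU enable, bit 0), C (data cache enable, bit 2) and I (instruction cache enable, bit 12). The sketch below only manipulates a register value in plain C; the names are hypothetical, and on hardware the value would be read and written with `mrs`/`msr` on SCTLR_EL2.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_M (1ULL << 0)    /* stage 1 MMU enable */
#define SCTLR_C (1ULL << 2)    /* data cache enable */
#define SCTLR_I (1ULL << 12)   /* instruction cache enable */

/* True when the MMU and both caches are off, as expected at image entry. */
static inline bool mmu_and_caches_off(uint64_t sctlr)
{
    return (sctlr & (SCTLR_M | SCTLR_C | SCTLR_I)) == 0;
}

/* Value to write back once the translation tables are ready. */
static inline uint64_t sctlr_enable_mmu_and_caches(uint64_t sctlr)
{
    return sctlr | SCTLR_M | SCTLR_C | SCTLR_I;
}
```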

TCR_ELn

Controls the translation table walks required for the stage 1 translation of memory accesses from ELn, and holds cacheability and shareability information for those accesses.

SPSR_ELn

Holds the saved process state when an exception is taken to ELn.

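ARMv8 Memory Barriers

Shareability domains

| Domain | Abbreviation | Description |
| --- | --- | --- |
| Non-shareable | NSH | A domain consisting only of the local agent. Accesses that never need to be synchronized with other cores, processors, or devices. Not normally used in SMP systems. |
| Inner Shareable | ISH | A domain (potentially) shared by multiple agents, but usually not by all agents in the system. A system can have multiple inner shareable domains. An operation that affects one inner shareable domain does not affect the other inner shareable domains in the system. |
| Outer Shareable | OSH | A domain almost certainly shared by multiple agents, and quite likely consisting of several inner shareable domains. An operation that affects an outer shareable domain also implicitly affects all inner shareable domains inside it. |
| Full system | SY | An operation on the full system affects all agents in the system: all non-shareable regions, all inner shareable regions, and all outer shareable regions. |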

Shareable and non-shareable system

Data Memory Barrier (DMB)

The basic functionality of a DMB is as follows:

It prevents reordering of data access instructions across itself. All data accesses by this processor/core before the DMB will be visible to all other masters within the specified shareability domain before any of the data accesses after it. It also ensures that any explicit preceding data (or unified) cache maintenance operations have completed before any subsequent data accesses are executed.

The DMB instruction takes two optional parameters: an operation type (stores only - ‘ST’ - or loads and stores) and a domain. The default operation type is loads and stores and the default domain is System. So, in effect DMB is shorthand for DMB SY. All possible combinations of types and domains are legal operations on any processor, even if it does not implement the specific functionality described, and can be substituted internally for any stronger barrier.
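
In portable C, this kind of ordering is usually expressed with C11 fences rather than raw barrier instructions. The sketch below is a generic message-passing pattern, not code from Minos: the producer writes the data and then sets a flag, and the consumer reads the data only after it has seen the flag. On AArch64, compilers typically lower such fences to DMB instructions on the inner shareable domain, but that mapping is an implementation detail and is stated here as an assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t payload;     /* the data being handed over */
static atomic_bool ready;    /* the flag that publishes it */

/* Producer: the release fence keeps the payload write ahead of the flag write. */
static inline void publish(uint32_t value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: returns true and fills *out once the flag has been observed. */
static inline bool try_consume(uint32_t *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);   /* flag read before payload read */
    *out = payload;
    return true;
}
```

Without the two fences, another core could observe the flag as set while still reading a stale payload.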

Data Synchronization Barrier (DSB)

The Data Synchronization Barrier enforces the same ordering as the Data Memory Barrier, but it also blocks execution of any further instructions until synchronization is complete. It also waits until all cache and branch predictor maintenance operations have completed for the specified shareability domain. If the access type is load and store then it also waits for any TLB maintenance operations to complete.

Instruction Synchronization Barrier (ISB)

The Instruction Synchronization Barrier ensures that any subsequent instructions are fetched anew from cache so that privilege and access are checked against the current MMU configuration. It is used to ensure that any previously executed context-changing operations (including cp15 operations) have completed by the time the ISB completes.

Example: a driver issues a write barrier so that the device sees the updated buffer before the register write that triggers the DMA transfer.

    /*
     * Writing to TxStatus triggers a DMA transfer of the data
     * copied to tp->tx_buf[entry] above. Use a memory barrier
     * to make sure that the device sees the updated data.
     */
    wmb();
    RTL_W32_F(TxStatus0 + (entry * sizeof(u32)),
              tp->tx_flag | max(len, (unsigned int)ETH_ZLEN));

Example: making newly written instructions visible to other processors. The mnemonics are ARMv7-style cache and branch predictor maintenance operations.

P1:
    STR R11, [R1] ; R11 contains a new instruction to be stored in program memory
    DCCMVAU R1 ; clean to PoU makes visible to instruction cache
    DSB ; ensure completion of the clean on all processors
    ICIMVAU R1 ; ensure instruction cache/branch predictor discard stale data
    BPIMVA R1
    DMB ; ensure ordering of the store after the invalidation
        ; DOES NOT guarantee completion of instruction cache/branch
        ; predictor on other processors
    STR R0, [R2] ; set flag to signal completion
    DSB ; ensure completion of the invalidation on all processors
    ISB ; synchronize context on this processor
    BX R1 ; branch to new code

P2-PX:
    WAIT ([R2] == 1) ; wait for flag signalling completion
    DSB ; this DSB does not guarantee completion of P1’s
        ; ICIMVAU/BPIMVA
    ISB
    BX R1

References

Memory Barriers: a Hardware View for Software Hackers

Memory access ordering - an introduction.

Barrier Litmus Tests and Cookbook.

Next topic

Introduction to ARMv8 virtualization