from Binary to Binary:
How Qemu Works
魏禛 (@_zhenwei_) <zhenwei.tw@gmail.com>
林致民 (Doraemon) <r06944005@csie.ntu.edu.tw>
August 11, 2018 / COSCUP 2018
Who are we?
● 林致民 (Doraemon)
○ Master Student @ NTU Compiler
Optimization and Virtualization Lab
○ Interested in compiler optimization and
system performance
● 魏禛 (@_zhenwei_)
○ From Tainan, Taiwan
○ Master student @ NTU
○ Interested in Computer Architecture,
Virtual Machine and Compiler stuff
Outline
● Introduction of Qemu
● Guest binary to TCG-IR translation
● Block Chaining !
● TCG-IR to x86_64 translation
● Not covered in this talk
○ Full system emulation
○ Interrupt handling
○ Multi-thread implementation
○ Optimization ...
What is Qemu
● Created by Fabrice Bellard in 2003
● Features
○ Just-in-time (JIT) compilation support to achieve high performance
○ Cross-platform (most UNIX-like systems and MS-Windows)
○ Supports many hosts and targets (full system emulation)
■ x86, aarch32, aarch64, mips, sparc, risc-v
○ User mode emulation: Qemu can run applications compiled for another CPU (same OS)
● More excellent slides!
○ Qemu JIT Code Generator and System Emulation
○ QEMU - Binary Translation
Environment
● Guest (target) machine: RISC-V
● Host machine: Intel x86_64
● Tools
○ Qemu 2.12.0 https://www.qemu.org
○ RISC-V GNU Toolchain https://github.com/riscv/riscv-gnu-toolchain.git
Translation Block (tb)
● A translation block (tb) ends when the translator:
○ Encounters a branch (which modifies the PC)
○ Encounters a system call
○ Reaches a page boundary
The picture is referenced from “QEMU - Binary Translation” by Jiann-Fuh Liaw
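The three end conditions above can be illustrated with a small toy model. This is only a sketch, not QEMU code: the `Insn` class, the instruction categories, the fixed 4-byte instruction size, and the 4096-byte page size are all assumptions made for the example.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed guest page size for this sketch


@dataclass
class Insn:
    pc: int
    kind: str  # "alu", "branch" or "syscall" (hypothetical categories)


def split_into_tbs(insns):
    """Group a linear instruction stream into toy translation blocks."""
    blocks, current = [], []
    for insn in insns:
        current.append(insn)
        next_pc = insn.pc + 4  # assumes fixed 4-byte instructions
        ends_block = (
            insn.kind in ("branch", "syscall")  # control leaves the block
            or next_pc % PAGE_SIZE == 0         # next insn starts a new page
        )
        if ends_block:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


stream = [
    Insn(0x0FF4, "alu"),
    Insn(0x0FF8, "branch"),
    Insn(0x0FFC, "alu"),      # last instruction of the page
    Insn(0x1000, "alu"),
    Insn(0x1004, "syscall"),
]
tbs = split_into_tbs(stream)
print([[hex(i.pc) for i in tb] for tb in tbs])
```

Each inner list printed at the end is one toy tb; the real translator also limits the block size, which this sketch leaves out.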
Dynamic Binary Translation
● Translate guest ISA instructions to host ISA instructions (at runtime)
● After the translation block is executed, control comes back to Qemu
Dynamic Binary Translation
● Block Chaining - avoid the “context switching” overhead
Tiny Code Generator (TCG)
This slide is referenced from tcg/README
Qemu Execution Flow
● Find the Translation Block
● Generate the Translation Block (if it has not been generated yet)
● Chain the generated Translation Block to an existing block
● Execute the Translation Block
Qemu Execution Flow
● tb_find()
● gen_intermediate_code()
● tcg_gen_code()
● tb_add_jump()
● cpu_loop_exec_tb()
Data structures used later ...
● TCGContext
● TranslationBlock
● DisasContext
● CPURISCVState
The main loop
cpu_exec() @ accel/tcg/cpu-exec.c
● tb_find()
○ Find the desired translation block by pc
value
● cpu_loop_exec_tb()
○ Execute the native code in the
translation block
Find the desired Translation Block or create one
tb_find() @ accel/tcg/cpu-exec.c
● tb_lookup__cpu_state()
○ Find the specific tb (Translation Block)
by pc value
● tb_gen_code()
○ If the desired tb hasn’t been generated
yet, we just create one
● tb_add_jump()
○ The block chaining patch point!
○ We will talk about it later ...
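The find / generate / chain / execute loop described on the last two slides can be sketched as a toy model. Nothing here is QEMU code: `ToyTB`, the dictionary used as the tb cache, and the convention that a block body returns the next guest pc (or `None` to stop) are assumptions made for illustration. Only the function names mirror the ones on the slides.

```python
class ToyTB:
    """Stand-in for a TranslationBlock: a guest pc plus a callable body."""

    def __init__(self, pc, body):
        self.pc = pc
        self.body = body        # plays the role of the generated host code
        self.chained = set()    # pcs of blocks linked by the toy tb_add_jump()


def tb_find(pc, tb_cache, program, last_tb):
    tb = tb_cache.get(pc)               # look the block up by pc
    if tb is None:                      # not generated yet: create one
        tb = ToyTB(pc, program[pc])     # stands in for tb_gen_code()
        tb_cache[pc] = tb
    if last_tb is not None:
        last_tb.chained.add(tb.pc)      # stands in for tb_add_jump()
    return tb


def cpu_exec(program, entry_pc):
    tb_cache, trace = {}, []
    pc, last_tb = entry_pc, None
    while pc is not None:
        tb = tb_find(pc, tb_cache, program, last_tb)
        trace.append(tb.pc)
        pc = tb.body()                  # stands in for cpu_loop_exec_tb()
        last_tb = tb
    return trace, tb_cache


counter = {"n": 3}


def loop_block():
    counter["n"] -= 1
    return 0x100 if counter["n"] > 0 else 0x200


def exit_block():
    return None


trace, tb_cache = cpu_exec({0x100: loop_block, 0x200: exit_block}, 0x100)
print([hex(pc) for pc in trace], len(tb_cache))
```

The loop block runs three times but is "translated" only once, which is the point of keeping generated blocks around.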
The Translation Block Finding Algorithm
tb_lookup__cpu_state() @
include/exec/tb-lookup.h
● tb_jmp_cache_hash_func()
○ A level-1 lookup cache is implemented
(fast path)
● tb_htable_lookup()
○ A traditional hash table is used to find
the specific tb by hash value (slow path)
○ Update the level-1 lookup cache if the tb is found
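A minimal sketch of the fast path / slow path idea, assuming a direct-mapped level-1 cache indexed by a hash of the pc and a plain dictionary as the level-2 hash table. The class name, the cache size, and the hash function are hypothetical; the real lookup also compares more state than the pc.

```python
TB_JMP_CACHE_BITS = 4                        # toy size, an assumption
TB_JMP_CACHE_SIZE = 1 << TB_JMP_CACHE_BITS


class ToyTBLookup:
    def __init__(self):
        self.jmp_cache = [None] * TB_JMP_CACHE_SIZE  # level 1, direct mapped
        self.htable = {}                             # level 2, keyed by pc
        self.fast_hits = self.slow_hits = self.misses = 0

    @staticmethod
    def jmp_cache_hash(pc):
        return (pc >> 2) & (TB_JMP_CACHE_SIZE - 1)

    def insert(self, pc, tb):
        self.htable[pc] = tb

    def lookup(self, pc):
        idx = self.jmp_cache_hash(pc)
        entry = self.jmp_cache[idx]
        if entry is not None and entry[0] == pc:     # fast path
            self.fast_hits += 1
            return entry[1]
        tb = self.htable.get(pc)                     # slow path
        if tb is None:
            self.misses += 1                         # caller must generate it
            return None
        self.slow_hits += 1
        self.jmp_cache[idx] = (pc, tb)               # refill level 1
        return tb


lookup = ToyTBLookup()
lookup.insert(0x1000, "tb@0x1000")
first = lookup.lookup(0x1000)     # slow path, fills the level-1 cache
second = lookup.lookup(0x1000)    # fast path
missing = lookup.lookup(0x2000)   # not translated yet
print(lookup.fast_hits, lookup.slow_hits, lookup.misses)
```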
The Translation Block Finding Algorithm
                      bzip2       mcf
# of tb executed      32877233    8125325
tb-cache miss ratio   0.01 %      2.10 %
● Some tbs would be executed many times throughout the program
● Chart (time in sec) annotations: 40% performance loss, 20% performance loss
Start to generate the Translation Block
tb_gen_code() @ accel/tcg/translate-all.c
● tb_alloc()
○ This function allocates space from the Code Cache in TCGContext
○ If the Code Cache is full, just flush it!
● gen_intermediate_code()
○ This function produces the TCG-IR, which is stored in the TCGContext
● tcg_gen_code()
○ Generate the host machine code from the TCG-IR and store it in the tb
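The allocate / flush-when-full behaviour can be sketched with a toy code cache. This is an illustration under assumptions: `ToyCodeCache`, its slot-counting capacity, and the tuple "instructions" are hypothetical, and the sketch allocates after generating code, whereas the slide lists tb_alloc() first.

```python
class ToyCodeCache:
    def __init__(self, capacity):
        self.capacity = capacity   # total slots for host instructions
        self.used = 0
        self.tbs = {}              # pc -> (offset, host_code)
        self.flushes = 0

    def tb_alloc(self, size):
        if self.used + size > self.capacity:
            return None            # code cache is full
        offset = self.used
        self.used += size
        return offset

    def tb_flush(self):
        self.tbs.clear()           # every translated block is thrown away
        self.used = 0
        self.flushes += 1


def tb_gen_code(cache, pc, guest_insns):
    ir = [("ir", insn) for insn in guest_insns]       # gen_intermediate_code()
    host_code = [("host", insn) for _, insn in ir]    # tcg_gen_code()
    offset = cache.tb_alloc(len(host_code))
    if offset is None:                                # full: flush and retry
        cache.tb_flush()
        offset = cache.tb_alloc(len(host_code))
        if offset is None:
            raise MemoryError("block larger than the whole code cache")
    cache.tbs[pc] = (offset, host_code)
    return cache.tbs[pc]


cache = ToyCodeCache(capacity=4)
tb_gen_code(cache, 0x100, ["addi", "lw", "beq"])
tb_gen_code(cache, 0x200, ["sw", "jal"])   # does not fit: triggers a flush
print(cache.flushes, sorted(cache.tbs))
```

Flushing everything is crude but keeps the bookkeeping simple: no per-block eviction and no dangling chained jumps.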
Generate the TCG-IR first !
gen_intermediate_code() @
target/riscv/translate.c
● The while loop decodes each instruction in the guest binary until it encounters a branch or reaches the tb size limit
● decode_opc()
○ Decode the instruction in binary form
Decode the instruction in guest binary
decode_RV32_64G() @
target/riscv/translate.c
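As an illustration of what decoding means here, the sketch below extracts the fields of a RISC-V R-type instruction using the bit positions from the RISC-V base ISA encoding. The function name `decode_rtype` is hypothetical and is not the code in target/riscv/translate.c; the sample word encodes `add a5, a5, a4`.

```python
def decode_rtype(insn):
    """Split a 32-bit RISC-V R-type instruction word into its fields."""
    return {
        "opcode": insn & 0x7F,           # bits 6:0
        "rd": (insn >> 7) & 0x1F,        # bits 11:7
        "funct3": (insn >> 12) & 0x7,    # bits 14:12
        "rs1": (insn >> 15) & 0x1F,      # bits 19:15
        "rs2": (insn >> 20) & 0x1F,      # bits 24:20
        "funct7": (insn >> 25) & 0x7F,   # bits 31:25
    }


fields = decode_rtype(0x00E787B3)   # add a5, a5, a4 (a5 = x15, a4 = x14)
print(fields)
```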
Generate the TCG-IR first ! (E.g. arithmetic instr.)
gen_arith() @ target/riscv/translate.c
● Each guest instruction needs to be expressed in terms of the TCG variable system
● TCG variables are declared and can be assigned values from the architecture state or from constants
● The TCG frontend ops operate on these TCG variables.
Generate the TCG-IR first ! (E.g. arithmetic instr.)
tcg_gen_? @ tcg/tcg-op.h & tcg/tcg-op.c
● tcg_emit_op()
○ This function allocates space for an op from the TCGContext and inserts it into the ops linked list
● After the space is allocated, the tcg opcode and arguments are filled into it.
● The generated TCG instructions will be shown later ...
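A toy version of the frontend flow for an arithmetic instruction: declare temporaries, then emit ops that are appended to a list kept in the context. `ToyTCGContext`, the op names, and the `"env"` operand are assumptions made for this sketch; the real frontend uses the tcg_gen_? functions and a linked list.

```python
class ToyTCGContext:
    def __init__(self):
        self.ops = []        # stands in for the ops linked list
        self.ntemps = 0

    def temp_new(self):
        name = f"tmp{self.ntemps}"
        self.ntemps += 1
        return name

    def emit_op(self, opc, *args):
        op = {"opc": None, "args": None}   # "allocate" an empty op ...
        self.ops.append(op)                # ... link it into the list ...
        op["opc"] = opc                    # ... then fill in opcode and args
        op["args"] = list(args)
        return op


def gen_arith_add(ctx, rd, rs1, rs2):
    src1, src2 = ctx.temp_new(), ctx.temp_new()
    ctx.emit_op("ld", src1, "env", rs1)     # read rs1 from architecture state
    ctx.emit_op("ld", src2, "env", rs2)     # read rs2 from architecture state
    ctx.emit_op("add", src1, src1, src2)    # tmp0 = tmp0 + tmp1
    ctx.emit_op("st", src1, "env", rd)      # write the result back to rd


ctx = ToyTCGContext()
gen_arith_add(ctx, rd="a5", rs1="a5", rs2="a4")
for op in ctx.ops:
    print(op["opc"], *op["args"])
```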
More about tcg_emit_op()
Generate the TCG-IR first ! (E.g. branch instr.)
gen_branch() @ target/riscv/translate.c
● Labels also need to be generated in TCG-IR form
● gen_goto_tb()
○ Jump into the specific translation block!
● Encountering a branch means the end of the tb.
○ The ctx->bstate is set to break the outer while loop in gen_intermediate_code()
How Block Chaining works?
● A slot waits for patching.
○ If it is not patched yet, it just jumps to the next instruction
○ The location of this slot is recorded in tb->jmp_target_arg when generating host machine code
● When a tb is generated, QEMU patches the last executed tb to jump to it
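The patch slot can be modelled as a per-exit field that starts empty and is filled in once the successor block exists. This is a toy sketch: `ChainableTB`, `jmp_slot`, and the tuple results are hypothetical names, and the two-exit limit is an assumption of the example.

```python
class ChainableTB:
    def __init__(self, pc):
        self.pc = pc
        self.jmp_slot = [None, None]   # two exits; None means "not patched"

    def exit_via(self, n):
        target = self.jmp_slot[n]
        if target is None:
            # Unpatched slot: fall through and return to the main loop.
            return ("back_to_main_loop", self, n)
        # Patched slot: jump straight into the next block.
        return ("direct_jump", target, n)


def tb_add_jump(last_tb, n, next_tb):
    """Patch exit n of the last executed block to point at next_tb."""
    if last_tb.jmp_slot[n] is None:
        last_tb.jmp_slot[n] = next_tb


tb_a, tb_b = ChainableTB(0x100), ChainableTB(0x200)
before = tb_a.exit_via(0)
tb_add_jump(tb_a, 0, tb_b)
after = tb_a.exit_via(0)
print(before[0], after[0])
```

Once the slot is patched, leaving `tb_a` through exit 0 no longer needs a lookup in the main loop.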
The Patch Point
Translate TCG-IR to
x86_64 binary code
TCG-IR to x86_64 translation
● Before entering the backend …
How does QEMU generate binary code?
TCG-IR to x86_64 translation
● Let’s take a look at some examples: add
○ Panels shown: RISC-V 32bit, TCG-IR, x86_64
○ Architecture States Data Structure (pointer): General Purpose Registers, Floating Point Registers
○ Load data from architecture state a5 reg.
○ tmp0 = tmp0 + tmp1
○ Store tmp0 back to architecture state a5 reg
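The load / add / store pattern against the architecture state can be sketched as follows. The slide only names a5, so the exact instruction (`add a5, a5, a4`), the `ToyCPUState` layout, and the 32-bit wraparound are assumptions of this example, not the generated x86_64 code.

```python
A4, A5 = 14, 15   # RISC-V ABI names a4/a5 are registers x14/x15


class ToyCPUState:
    """Toy architecture state: register files plus a program counter."""

    def __init__(self):
        self.gpr = [0] * 32      # general purpose registers
        self.fpr = [0.0] * 32    # floating point registers
        self.pc = 0


def run_add_a5_a5_a4(env):
    tmp0 = env.gpr[A5]                    # load a5 from architecture state
    tmp1 = env.gpr[A4]                    # load a4 from architecture state
    tmp0 = (tmp0 + tmp1) & 0xFFFFFFFF     # tmp0 = tmp0 + tmp1 (32-bit wrap)
    env.gpr[A5] = tmp0                    # store tmp0 back to a5


env = ToyCPUState()
env.gpr[A5], env.gpr[A4] = 0xFFFFFFFF, 2
run_add_a5_a5_a4(env)
print(hex(env.gpr[A5]))
```

The temporaries live only inside the block; the register values that must survive between blocks always go back into the state structure.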
TCG-IR to x86_64 translation
● Let’s take a look at some examples: load
○ Panels shown: RISC-V 32bit, TCG-IR, x86_64
○ Load data from address ‘tmp0’ to tmp1
○ Store tmp1 back to archi. state a4 reg
TCG-IR to x86_64 translation
● Let’s take a look at some examples: Store
○ Panels shown: RISC-V 32bit, TCG-IR, x86_64
○ Load s2 to tmp0
○ tmp0 = tmp0 + 1156 (0x484)
○ Load s3 to tmp1
○ Store tmp1 to address tmp0
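The four store steps can be traced in a small sketch. It assumes the guest instruction is a 32-bit word store (`sw s3, 1156(s2)`), models guest memory as a little-endian `bytearray`, and uses a hypothetical function name; address translation is ignored.

```python
import struct

S2, S3 = 18, 19   # RISC-V ABI names s2/s3 are registers x18/x19


def run_sw_s3_1156_s2(gpr, mem):
    tmp0 = gpr[S2]                         # load s2 to tmp0
    tmp0 = (tmp0 + 1156) & 0xFFFFFFFF      # tmp0 = tmp0 + 1156 (0x484)
    tmp1 = gpr[S3]                         # load s3 to tmp1
    struct.pack_into("<I", mem, tmp0, tmp1 & 0xFFFFFFFF)  # store tmp1 at tmp0
    return tmp0


gpr = [0] * 32
mem = bytearray(4096)                      # toy guest memory
gpr[S2], gpr[S3] = 0x100, 0xDEADBEEF
addr = run_sw_s3_1156_s2(gpr, mem)
print(hex(addr), mem[addr:addr + 4].hex())
```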
How does QEMU handle branch instructions?
TCG-IR to x86_64 translation
● Let’s take a look at some examples: Branch
○ Direct Branch
■ Conditional Branch
beqz rs, offset / bgt rs, rt, offset / …..
■ Unconditional Branch
j offset / jal offset / call offset / ...
○ Indirect Branch
■ Switch/Case → Branch table
■ Indirect function call
■ Return Instructions (ret)
TCG-IR to x86_64 translation
● Let’s take a look at some examples: Direct Branch (Unconditional)
○ Panels shown: RISC-V 32bit, TCG-IR, x86_64
○ Reminder: patch point for block chaining
○ Synchronize program counter to architecture states
○ Prepare return value
○ Go back to QEMU to find next Translation Block
○ Block Chaining: link the current TB to the previous TB by patching the jump target address
TCG-IR to x86_64 translation
● Let’s take a look at some examples: Direct Branch (Conditional)
○ If the block is chained by QEMU, jump to target translation block
○ Go back to QEMU to find next Translation Block
TCG-IR to x86_64 translation
● Let’s take a look at some examples: Indirect Branch - return instruction
○ Panels shown: RISC-V 32bit, TCG-IR, x86_64
○ Store the value from return address register to program counter
○ Go back to QEMU to find the next Translation Block by the Program Counter
TCG-IR to x86_64 translation
● helper function call
○ QEMU provides a ‘hook’ for developers to write emulation behavior in C. Commonly used to:
■ Emulate hardware not supported on the host machine
● e.g. Hardware FP, SIMD, AES, etc.
■ Perform dynamic instrumentation
● Collect runtime information from the source program to analyze its behavior (e.g. dynamic call graph / control flow graph)
TCG-IR to x86_64 translation
● helper function call - example
○ call helper_function
○ return result (%rax) to ‘fa5’
TCG-IR to x86_64 translation
● helper - Emulate IEEE 754 floating point add with double precision
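To show the shape of such a helper, the sketch below adds two doubles passed as raw 64-bit register values and returns the raw result. It is an assumption-laden stand-in: `helper_fadd_d` and `f64_bits` are hypothetical names, the host's own floating point is used for the arithmetic, and rounding modes and exception flags are ignored.

```python
import struct


def f64_bits(x):
    """Raw 64-bit pattern of a Python float (IEEE 754 double)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def helper_fadd_d(a_bits, b_bits):
    """Toy helper: add two doubles given and returned as raw 64-bit values."""
    a = struct.unpack("<d", struct.pack("<Q", a_bits))[0]
    b = struct.unpack("<d", struct.pack("<Q", b_bits))[0]
    return f64_bits(a + b)


# Toy floating point register file holding raw bit patterns.
fpr = {"fa4": f64_bits(1.5), "fa5": f64_bits(2.25)}
fpr["fa5"] = helper_fadd_d(fpr["fa5"], fpr["fa4"])   # result goes back to fa5
print(hex(fpr["fa5"]))
```

The translated code only has to marshal the arguments, call the helper, and copy the returned value into the destination register.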
TCG-IR to x86_64 translation
● Prologue, epilogue - The entry point for each Translation Block
○ Decide whether current TB can be executed
How to execute translated binary code?
x86_64 code execution
● Entry point - qemu-riscv/accel/tcg/cpu-exec.c
○ tcg_qemu_tb_exec: a pointer to the head of the translation block
○ Put env in %rdi and tb_ptr in %rsi
○ Move %rdi to %r14, and jump to %rsi
○ Exit the code cache and go back to QEMU
○ Return the value to ret, and record the last TB we just executed
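One way a single return value can carry both "the last TB" and "which exit was taken" is to pack the exit index into the low bits of an aligned block address. The sketch below assumes a two-bit mask (`TB_EXIT_MASK = 3`) and hypothetical `pack_exit` / `unpack_exit` helpers; treat the exact layout as an assumption rather than a statement about cpu-exec.c.

```python
TB_EXIT_MASK = 3   # assumed: the two low bits carry the exit index


def pack_exit(tb_addr, exit_idx):
    """Combine an aligned block address with a small exit index."""
    assert tb_addr & TB_EXIT_MASK == 0, "block address must be 4-byte aligned"
    assert 0 <= exit_idx <= TB_EXIT_MASK
    return tb_addr | exit_idx


def unpack_exit(ret):
    """Split the value returned from the code cache into (last_tb, exit_idx)."""
    last_tb = ret & ~TB_EXIT_MASK
    exit_idx = ret & TB_EXIT_MASK
    return last_tb, exit_idx


ret = pack_exit(0x7F0000001000, 1)
last_tb, exit_idx = unpack_exit(ret)
print(hex(last_tb), exit_idx)
```

Knowing both values is what lets the main loop patch the right slot of the right block when it chains the next one.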
Reference
● Qemu JIT Code Generator and System Emulation
● QEMU - Binary Translation
● QEMU TCG Frontend Ops
● RISC-V Instruction Set Manual
● Doraemon’s Notes: QEMU Backend
○ Written in Mandarin Chinese

More Related Content

What's hot (20)

Qemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System EmulationQemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System Emulation
National Cheng Kung University
 
QEMU is an open source system emulator that uses just-in-time (JIT) compilation to achieve high performance system emulation. It works by translating target CPU instructions to simple host CPU micro-operations at runtime. These micro-operations are cached and chained together into basic blocks to reduce overhead. This approach avoids the performance issues of traditional emulators by removing interpretation overhead and leveraging CPU parallelism through pipelining of basic blocks.
Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_
Linaro
 
LCU13: Deep Dive into ARM Trusted Firmware Resource: LCU13 Name: Deep Dive into ARM Trusted Firmware Date: 31-10-2013 Speaker: Dan Handley / Charles Garcia-Tobin
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
shimosawa
 
The document provides an overview of the initialization phase of the Linux kernel. It discusses how the kernel enables paging to transition from physical to virtual memory addresses. It then describes the various initialization functions that are called by start_kernel to initialize kernel features and architecture-specific code. Some key initialization tasks discussed include creating an identity page table, clearing BSS, and reserving BIOS memory.
Embedded_Linux_Booting
Embedded_Linux_BootingEmbedded_Linux_Booting
Embedded_Linux_Booting
Rashila Rr
 
The embedded Linux boot process involves multiple stages beginning with ROM code that initializes hardware and loads the first stage bootloader, X-Loader. The X-Loader further initializes hardware and loads the second stage bootloader, U-Boot, which performs additional initialization and loads the Linux kernel. The kernel then initializes drivers and mounts the root filesystem to launch userspace processes. Booting can occur from flash memory, an eMMC/SD card, over a network using TFTP/NFS, or locally via UART/USB depending on the boot configuration and available devices.
Arm device tree and linux device drivers
Arm device tree and linux device driversArm device tree and linux device drivers
Arm device tree and linux device drivers
Houcheng Lin
 
This document discusses how the Linux kernel supports different ARM boards using a common source code base. It describes how device tree is used to describe hardware in a board-agnostic way. The kernel initializes machine-specific code via the device tree and initializes drivers by matching compatible strings. This allows a single kernel binary to support multiple boards by abstracting low-level hardware details into the device tree rather than the kernel source. The document also contrasts the ARM approach to the x86 approach, where BIOS abstraction and standardized buses allow one kernel to support most x86 hardware.
HKG18-402 - Build secure key management services in OP-TEE
HKG18-402 - Build secure key management services in OP-TEEHKG18-402 - Build secure key management services in OP-TEE
HKG18-402 - Build secure key management services in OP-TEE
Linaro
 
Session ID: HKG18-402 Session Name: HKG18-402 - Build secure key management services in OP-TEE Speaker: Etienne Carriere Track: Security ★ Session Summary ★ The session presents an initiative to build secure key management services in the OP-TEE project. Based on OP-TEE services (persistent storage, cryptography, time, etc) one could build a trusted application of store and use secure keys. An open source implementation for generic key services could be of interest. However there are many client APIs defined in the ecosystem which is a matter of concern for standardization of such services. The session will open a discussion on this and presents the current choice of the PKCS#11 Cryptoki. There can be lot of key attributes and cryptographic schemes to be supported. The session will present the current plans (starting from AES flavors) and what is currently missing in the OP-TEE (as certificate support, bootloader support). This session aims at getting feedback from the community on this topic, discuss about expected services and client APIs. --------------------------------------------------- ★ Resources ★ Event Page: http://connect.linaro.org/resource/hkg18/hkg18-402/ Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-402.pdf Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-402.mp4 --------------------------------------------------- ★ Event Details ★ Linaro Connect Hong Kong 2018 (HKG18) 19-23 March 2018 Regal Airport Hotel Hong Kong --------------------------------------------------- Keyword: Security 'http://www.linaro.org' 'http://connect.linaro.org' --------------------------------------------------- Follow us on Social Media https://www.facebook.com/LinaroOrg https://www.youtube.com/user/linaroorg?sub_confirmation=1 https://www.linkedin.com/company/1026961
LCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted Firmware
Linaro
 
Resource: LCU13 Name: An Introduction to ARM Trusted Firmware Date: 28-10-2013 Speaker: Andrew Thoelke Video: http://www.youtube.com/watch?v=q32BEMMxmfw
4章 Linuxカーネル - 割り込み・例外 3
4章 Linuxカーネル - 割り込み・例外 34章 Linuxカーネル - 割り込み・例外 3
4章 Linuxカーネル - 割り込み・例外 3
mao999
 
Linuxカーネルの割り込み・例外処理(ハードウェア寄り) ・APIC kernel ver 4.9.16
LCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platformLCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platform
Linaro
 
This document describes how to port the open source Trusted Execution Environment (OP-TEE) to a new platform. It involves cloning the existing platform code, modifying compiler and linker options, configuring platform-specific settings, updating memory mappings, and initializing platform-specific components. The document provides details on each of these porting steps and recommends OP-TEE documentation resources.
Uboot startup sequence
Uboot startup sequenceUboot startup sequence
Uboot startup sequence
Houcheng Lin
 
U-boot provides a multistage boot process that initializes the CPU and board resources incrementally at each stage. It begins execution on the CPU in a limited environment and hands off to subsequent stages that gain access to more resources like memory and devices. U-boot supports booting an operating system image from storage like SSD or over the network and offers features like secure boot and hypervisor support.
Qemu device prototyping
Qemu device prototypingQemu device prototyping
Qemu device prototyping
Yan Vugenfirer
 
The document discusses QEMU and adding a new device to it. It begins with an introduction to QEMU and its uses. It then discusses setting up a development environment, compiling QEMU, and examples of existing devices. The main part explains how to add a new "Devix" device by creating source files, registering the device type, initializing PCI configuration, and registering memory regions. It demonstrates basic functionality like interrupts and I/O access callbacks. The goal is to introduce developing new emulated devices for QEMU.
Lcu14 107- op-tee on ar mv8
Lcu14 107- op-tee on ar mv8Lcu14 107- op-tee on ar mv8
Lcu14 107- op-tee on ar mv8
Linaro
 
LCU14-107: OP-TEE on ARMv8 --------------------------------------------------- Speaker: Jens Wiklander Date: September 15, 2014 --------------------------------------------------- ★ Session Summary ★ SWG is porting OP-TEE to ARMv8 using Fixed Virtual Platform. Initially OP-TEE is running secure world in aarch32 mode, but with the normal world code running in aarch64 mode. Since ARMv8 uses ARM Trusted Firmware we have patched it with an OP-TEE dispatcher to be able to communicate between secure and normal world. --------------------------------------------------- ★ Resources ★ Zerista: http://lcu14.zerista.com/event/member/137710 Google Event: https://plus.google.com/u/0/events/c0ef114n77bhgbns9vb85g9n6ak Presentation: http://www.slideshare.net/linaroorg/lcu14-107-optee-on-ar-mv8 Video: https://www.youtube.com/watch?v=JViplz-ah9M&list=UUIVqQKxCyQLJS6xvSmfndLA Etherpad: http://pad.linaro.org/p/lcu14-107 --------------------------------------------------- ★ Event Details ★ Linaro Connect USA - #LCU14 September 15-19th, 2014 Hyatt Regency San Francisco Airport --------------------------------------------------- http://www.linaro.org http://connect.linaro.org
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
Linaro
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot) Speaker: Bill Fletcher Date: September 24, 2015 ★ Session Description ★ An introductory session of a system-level overview at Power State Coordination - Focus on ARMv8 - Goes top-down from ACPI - A demo based on the current code in qemu - The specifications are very dynamic - what’s onging for ACPI and PSCI ★ Resources ★ Video: https://www.youtube.com/watch?v=vXzPdpaZVto Presentation: http://www.slideshare.net/linaroorg/sfo15tr9-psci-acpi-and-uefi-to-boot Etherpad: pad.linaro.org/p/sfo15-tr9 Pathable: https://sfo15.pathable.com/meetings/303087 ★ Event Details ★ Linaro Connect San Francisco 2015 - #SFO15 September 21-25, 2015 Hyatt Regency Hotel http://www.linaro.org http://connect.linaro.org
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
Adrien Mahieux
 
- The document discusses Linux network stack monitoring and configuration. It begins with definitions of key concepts like RSS, RPS, RFS, LRO, GRO, DCA, XDP and BPF. - It then provides an overview of how the network stack works from the hardware interrupts and driver level up through routing, TCP/IP and to the socket level. - Monitoring tools like ethtool, ftrace and /proc/interrupts are described for viewing hardware statistics, software stack traces and interrupt information.
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMUSFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
Linaro
 
This document discusses moving QEMU's Tiny Code Generator (TCG) to a multi-threaded model to take advantage of multi-core systems. It describes the current single-threaded TCG process model and global state. Approaches considered for multi-threading include using threads/locks, processes/IPC, or rewriting TCG from scratch. Key challenges addressed are protecting code generation globals and implementing atomic memory operations and memory barriers in a multi-threaded context. Patches have been contributed to address these issues and enable multi-threaded TCG. Further work remains to fully enable it across all QEMU backends and architectures.
linux device driver
linux device driverlinux device driver
linux device driver
Rahul Batra
 
Linux device drivers act as an interface between hardware devices and user programs. They communicate with hardware devices and expose an interface to user applications through system calls. Device drivers can be loaded as kernel modules and provide access to devices through special files in the /dev directory. Common operations for drivers include handling read and write requests either through interrupt-driven or polling-based I/O.
Embedded Linux Kernel - Build your custom kernel
Embedded Linux Kernel - Build your custom kernelEmbedded Linux Kernel - Build your custom kernel
Embedded Linux Kernel - Build your custom kernel
Emertxe Information Technologies Pvt Ltd
 
Build your own Embedded Linux Kernel by understanding it source code organization, compilation ecosystem and compiling it for a custom target.
LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3
LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3
LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3
Linaro
 
LAS16-111: Raspberry Pi3, OP-TEE and JTAG debugging Speakers: Date: September 26, 2016 ★ Session Description ★ ARM TrustZone is a critical technology for securing IoT devices and systems. But awareness of TrustZone and its benefits lags within the maker community as well as among enterprises. The first step to solving this problem is lowering the cost of access. Sequitur Labs and Linaro have joined forces to address this problem by making a port of OP-TEE available on the Raspberry Pi 3. The presentation covers the value of TrustZone for securing IoT and how customers can learn more through this joint effort. Embedded systems security remains a challenge for many developers. Awareness of mature, proven technologies such as ARM TrustZone is very low among the Maker community as well as among enterprises. As a result this foundational technology is largely being ignored as a security solution. Sequitur Labs and Linaro have taken an innovative approach combining an Open Source solution – OP-TEE with Raspberry Pi 3. The Raspberry Pi 3 is one of the world’s most popular platforms among device makers. Its value as an educational tool for learning about embedded systems development is proven. Sequitur Labs have also enabled bare metal debugging via JTag on the Pi 3 enhancing the value of the Pi 3 as an educational tool for embedded systems development. The presentation will focus on ARM v8a architecture and instruction set ARM Trusted Firmware TrustZone and OP-TEE basics JTAG and bare metal debugging the Raspberry Pi 3 ★ Resources ★ Etherpad: pad.linaro.org/p/las16-111 Presentations & Videos: http://connect.linaro.org/resource/las16/las16-111/ ★ Event Details ★ Linaro Connect Las Vegas 2016 – #LAS16 September 26-30, 2016 http://www.linaro.org http://connect.linaro.org
Linux Kernel Crashdump
Linux Kernel CrashdumpLinux Kernel Crashdump
Linux Kernel Crashdump
Marian Marinov
 
The document discusses analyzing Linux kernel crash dumps. It covers various ways to gather crash data like serial console, netconsole, kmsg dumpers, Kdump, and Pstore. It then discusses analyzing the crashed kernel using tools like ksymoops, crash utility, and examining the backtrace, kernel logs, processes, and file descriptors. The document provides examples of gathering data from Pstore and using commands like bt, log, and ps with the crash utility to extract information from a crash dump.
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
ScyllaDB
 
Ftrace is the official tracer of the Linux kernel. It originated from the real-time patch (now known as PREEMPT_RT), as developing an operating system for real-time use requires deep insight and transparency of the happenings of the kernel. Not only was tracing useful for debugging, but it was critical for finding areas in the kernel that was causing unbounded latency. It's no wonder why the ftrace infrastructure has a lot of tooling for seeking out latency. Ftrace was introduced into mainline Linux in 2008, and several talks have been done on how to utilize its tracing features. But a lot has happened in the past few years that makes the tooling for finding latency much simpler. Other talks at P99 will discuss the new ftrace tracers "osnoise" and "timerlat", but this talk will focus more on the new flexible and dynamic aspects of ftrace that facilitates finding latency issues which are more specific to your needs. Some of this work may still be in a proof of concept stage, but this talk will give you the advantage of knowing what tools will be available to you in the coming year.
Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_
Linaro
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
shimosawa
 
Embedded_Linux_Booting
Embedded_Linux_BootingEmbedded_Linux_Booting
Embedded_Linux_Booting
Rashila Rr
 
Arm device tree and linux device drivers
Arm device tree and linux device driversArm device tree and linux device drivers
Arm device tree and linux device drivers
Houcheng Lin
 
HKG18-402 - Build secure key management services in OP-TEE
HKG18-402 - Build secure key management services in OP-TEEHKG18-402 - Build secure key management services in OP-TEE
HKG18-402 - Build secure key management services in OP-TEE
Linaro
 
LCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted Firmware
Linaro
 
4章 Linuxカーネル - 割り込み・例外 3
4章 Linuxカーネル - 割り込み・例外 34章 Linuxカーネル - 割り込み・例外 3
4章 Linuxカーネル - 割り込み・例外 3
mao999
 
LCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platformLCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platform
Linaro
 
Uboot startup sequence
Uboot startup sequenceUboot startup sequence
Uboot startup sequence
Houcheng Lin
 
Qemu device prototyping
Qemu device prototypingQemu device prototyping
Qemu device prototyping
Yan Vugenfirer
 
Lcu14 107- op-tee on ar mv8
Lcu14 107- op-tee on ar mv8Lcu14 107- op-tee on ar mv8
Lcu14 107- op-tee on ar mv8
Linaro
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
Linaro
 
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMUSFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
Linaro
 
linux device driver
linux device driverlinux device driver
linux device driver
Rahul Batra
 
LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3
LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3
LAS16-111: Easing Access to ARM TrustZone – OP-TEE and Raspberry Pi 3
Linaro
 
Linux Kernel Crashdump
Linux Kernel CrashdumpLinux Kernel Crashdump
Linux Kernel Crashdump
Marian Marinov
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
ScyllaDB
 

from Binary to Binary: How Qemu Works

  • 1. from Binary to Binary: How Qemu Works 魏禛 (@_zhenwei_) <zhenwei.tw@gmail.com> 林致民 (Doraemon) <r06944005@csie.ntu.edu.tw> August 11, 2018 / COSCUP 2018
  • 2. Who are we? ● 林致民 (Doraemon) ○ Master student @ NTU Compiler Optimization and Virtualization Lab ○ Interested in compiler optimization and system performance ● 魏禛 (@_zhenwei_) ○ From Tainan, Taiwan ○ Master student @ NTU ○ Interested in Computer Architecture, Virtual Machine and Compiler stuff
  • 3. Outline ● Introduction of Qemu ● Guest binary to TCG-IR translation ● Block Chaining ! ● TCG-IR to x86_64 translation ● Not covered ... ○ Full system emulation ○ Interrupt handling ○ Multi-thread implementation ○ Optimization ...
  • 4. What is Qemu ● Created by Fabrice Bellard in 2003 ● Features ○ Just-in-time (JIT) compilation support to achieve high performance ○ Cross-platform (most UNIX-like systems and MS-Windows) ○ Supports many hosts and targets (full system emulation) ■ x86, aarch32, aarch64, mips, sparc, risc-v ○ User mode emulation: Qemu can run applications compiled for another CPU (same OS) ● More excellent slides ! ○ Qemu JIT Code Generator and System Emulation ○ QEMU - Binary Translation
  • 5. Environment ● Guest (target) machine: RISC-V ● Host machine: Intel x86_64 ● Tools ○ Qemu 2.12.0 https://www.qemu.org ○ RISC-V GNU Toolchain https://github.com/riscv/riscv-gnu-toolchain.git
  • 6. Translation Block (tb) ● A translation block (tb) is a run of guest instructions that ends when translation ○ Encounters a branch (an instruction that modifies the PC) ○ Encounters a system call ○ Reaches a page boundary ● The picture is referenced from “QEMU - Binary Translation” by Jiann-Fuh Liaw
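The three termination conditions above can be sketched as a small scanning loop. This is a toy model, not Qemu code: `InsnKind`, `tb_length()`, the fixed 4-byte instruction width and the 4096-byte page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed guest page size */
#define INSN_SIZE 4u     /* assumed fixed instruction width */

/* Hypothetical classification of a guest instruction. */
typedef enum { INSN_OTHER, INSN_BRANCH, INSN_SYSCALL } InsnKind;

/*
 * Count how many instructions belong to the block that starts at pc.
 * kinds[i] describes the instruction located at pc + i * INSN_SIZE.
 */
static size_t tb_length(uint64_t pc, const InsnKind *kinds, size_t n)
{
    size_t count = 0;

    assert(pc % INSN_SIZE == 0);
    while (count < n) {
        InsnKind kind = kinds[count];

        count++;
        pc += INSN_SIZE;
        /* A branch or system call is included and then closes the block. */
        if (kind == INSN_BRANCH || kind == INSN_SYSCALL) {
            break;
        }
        /* The next instruction would start on a new page. */
        if (pc % PAGE_SIZE == 0) {
            break;
        }
    }
    return count;
}
```

For example, a block starting at 0x1ff8 that contains only plain instructions stops after two of them, because the third would sit at 0x2000 on the next page.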
  • 7. Dynamic Binary Translation ● Translate guest ISA instructions to host ISA instructions at runtime ● After the translation block is executed, control returns to Qemu
  • 8. Dynamic Binary Translation ● Block chaining: jump directly from one translation block to the next, avoiding the “context switching” overhead of returning to Qemu
  • 9. Tiny Code Generator (TCG) ● The figure is referenced from tcg/README
  • 10. Qemu Execution Flow ● Find the Translation Block ● Execute the Translation Block ● Generate the Translation Block ● Chain the generated Translation Block to an existing Block
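The four steps of the flow (find, generate on a miss, chain, execute) can be sketched with a toy dispatcher. Everything here is hypothetical: `Engine`, `find_or_generate()`, `run()` and the three-block guest program in `guest_successor()` are invented for illustration, and "executing" a block only bumps a counter. The point is that once blocks are chained, a second run needs a single lookup instead of one per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TBS 8u

typedef struct TB {
    uint64_t pc;         /* guest address this block starts at */
    uint64_t next_pc;    /* guest address reached afterwards, 0 = exit */
    struct TB *chained;  /* direct link to the successor, once patched */
} TB;

typedef struct {
    TB tbs[MAX_TBS];
    size_t nb_tbs;
    unsigned lookups;    /* times the dispatcher had to search */
    unsigned generated;  /* blocks created on a miss */
    unsigned executed;   /* blocks "executed" */
} Engine;

/* Toy guest program: 0x1000 -> 0x2000 -> 0x3000 -> exit. */
static uint64_t guest_successor(uint64_t pc)
{
    switch (pc) {
    case 0x1000: return 0x2000;
    case 0x2000: return 0x3000;
    default:     return 0;
    }
}

static void engine_init(Engine *e)
{
    e->nb_tbs = 0;
    e->lookups = e->generated = e->executed = 0;
}

/* Find the block for pc, or generate it if it does not exist yet. */
static TB *find_or_generate(Engine *e, uint64_t pc)
{
    TB *tb;

    e->lookups++;
    for (size_t i = 0; i < e->nb_tbs; i++) {
        if (e->tbs[i].pc == pc) {
            return &e->tbs[i];
        }
    }
    assert(e->nb_tbs < MAX_TBS);
    tb = &e->tbs[e->nb_tbs++];
    tb->pc = pc;
    tb->next_pc = guest_successor(pc);
    tb->chained = NULL;
    e->generated++;
    return tb;
}

static void run(Engine *e, uint64_t pc)
{
    TB *prev = NULL;

    while (pc != 0) {
        TB *tb = find_or_generate(e, pc);

        if (prev != NULL) {
            prev->chained = tb;        /* chain the previous block to this one */
        }
        e->executed++;                 /* execute tb */
        while (tb->chained != NULL) {  /* follow links, no dispatcher involved */
            tb = tb->chained;
            e->executed++;
        }
        prev = tb;
        pc = tb->next_pc;
    }
}
```

Calling `run(&e, 0x1000)` twice on a freshly initialised engine performs three lookups on the first pass and only one on the second, while executing three blocks each time.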
  • 12. Data structures used later ... ● TCGContext ● TranslationBlock ● DisasContext ● CPURISCVState
  • 13. The main loop cpu_exec() @ accel/tcg/cpu-exec.c ● tb_find() ○ Find the desired translation block by pc value ● cpu_loop_exec_tb() ○ Execute the native code in the translation block
  • 14. Find the desired Translation Block or create one tb_find() @ accel/tcg/cpu-exec.c ● tb_lookup__cpu_state() ○ Find the specific tb (Translation Block) by pc value ● tb_gen_code() ○ If the desired tb hasn’t been generated yet, we just create one ● tb_add_jump() ○ The block chaining patch point! ○ We will talk about it later ...
  • 15. The Translation Block Finding Algorithm tb_lookup__cpu_state() @ include/exec/tb-lookup.h ● tb_jmp_cache_hash_func() ○ A level-1 lookup cache is implemented (fast path) ● tb_htable_lookup() ○ A traditional hash table is used to find the specific tb by hash value (slow path) ○ The level-1 lookup cache is updated if the tb is found
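The fast path / slow path split can be sketched as a direct-mapped cache in front of a chained hash table. This is a simplified model under stated assumptions: `Cpu`, `tb_insert()`, `tb_lookup()`, both hash functions and the table sizes are hypothetical, and the key is the pc alone, as on the slide; Qemu's own lookup may compare more state than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_CACHE_SIZE 16u  /* assumed size of the level-1 cache */
#define HTABLE_SIZE    64u  /* assumed number of hash buckets */

typedef struct TB {
    uint64_t pc;
    struct TB *hash_next;   /* next block in the same bucket */
} TB;

typedef struct {
    TB *jmp_cache[JMP_CACHE_SIZE];  /* level-1, direct-mapped */
    TB *htable[HTABLE_SIZE];        /* level-2, chained buckets */
    unsigned fast_hits;
    unsigned slow_hits;
} Cpu;

static void cpu_init(Cpu *cpu)
{
    memset(cpu, 0, sizeof(*cpu));
}

static unsigned jmp_cache_hash(uint64_t pc)
{
    return (unsigned)((pc >> 2) % JMP_CACHE_SIZE);
}

static unsigned htable_hash(uint64_t pc)
{
    return (unsigned)((pc >> 2) % HTABLE_SIZE);
}

/* Register a freshly generated block in the hash table. */
static void tb_insert(Cpu *cpu, TB *tb)
{
    unsigned h;

    assert(tb != NULL);
    h = htable_hash(tb->pc);
    tb->hash_next = cpu->htable[h];
    cpu->htable[h] = tb;
}

/* Return the block for pc, or NULL if none has been generated. */
static TB *tb_lookup(Cpu *cpu, uint64_t pc)
{
    unsigned idx = jmp_cache_hash(pc);
    TB *tb = cpu->jmp_cache[idx];

    /* Fast path: one array access and one comparison. */
    if (tb != NULL && tb->pc == pc) {
        cpu->fast_hits++;
        return tb;
    }
    /* Slow path: walk the bucket, then refill the level-1 cache. */
    for (tb = cpu->htable[htable_hash(pc)]; tb != NULL; tb = tb->hash_next) {
        if (tb->pc == pc) {
            cpu->jmp_cache[idx] = tb;
            cpu->slow_hits++;
            return tb;
        }
    }
    return NULL;
}
```

The first lookup of a given pc takes the slow path and fills the cache slot; repeated lookups of the same pc then hit the fast path until another pc with the same cache index evicts it.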
  • 16. The Translation Block Finding Algorithm bzip2 mcf # of tb executed 32877233 8125325 tb-cache miss ratio 0.01 % 2.10 % Some tbs would be executed many times throughout the program 40% perfmance loss 20% perfmance loss 16 time (sec)
  • 17. Start to generate the Translation Block tb_gen_code() @ accel/tcg/translate-all.c ● tb_alloc() ○ This function allocates space from the Code Cache in TCGContext ○ If the Code Cache is full, just flush it! ● gen_intermediate_code() ○ This function produces the TCG-IR, which is stored in the TCGContext ● tcg_gen_code() ○ Generates the host machine code from the TCG-IR and stores it in the tb
  • 18. Generate the TCG-IR first! gen_intermediate_code() @ target/riscv/translate.c ● The while loop decodes each instruction in the guest binary until it encounters a branch or reaches the tb size limit ● decode_opc() ○ Decode the instruction in binary form
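The "allocate from the code cache, flush when full" policy can be illustrated with a small bump allocator. This is a sketch under assumptions: `CodeCache`, `cache_alloc()` and the tiny 256-byte buffer are hypothetical names and sizes chosen for the example, and a real flush would also have to invalidate every tb and lookup structure that points into the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CODE_CACHE_SIZE 256u     /* deliberately tiny, for illustration */

typedef struct {
    uint8_t buf[CODE_CACHE_SIZE];   /* where host code would be written */
    size_t used;                    /* bytes handed out so far */
    unsigned flush_count;           /* how many times the cache was flushed */
} CodeCache;

/* Hand out `size` bytes; if they do not fit, flush everything and restart. */
static uint8_t *cache_alloc(CodeCache *c, size_t size)
{
    assert(size <= CODE_CACHE_SIZE);
    if (c->used + size > CODE_CACHE_SIZE) {
        c->used = 0;                /* "just flush it": drop all old code */
        c->flush_count++;
    }
    uint8_t *p = c->buf + c->used;
    c->used += size;
    return p;
}
```

Flushing the whole cache is simpler than evicting individual blocks, because chained blocks may jump into each other and would otherwise need to be unlinked one by one.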
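A minimal sketch of such a decode loop, combining the stop conditions from the translation block definition (branch, system call, page boundary) with a size limit. The function names, `PAGE_SIZE` and `MAX_INSNS` are assumptions for illustration; only the RISC-V major opcode values are taken from the ISA, and compressed (16-bit) instructions are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed guest page size */
#define MAX_INSNS 512u    /* assumed per-block instruction limit */

/* The RISC-V major opcode lives in bits [6:0] of a 32-bit instruction. */
static bool ends_block(uint32_t insn)
{
    uint32_t opc = insn & 0x7f;
    return opc == 0x63      /* BRANCH */
        || opc == 0x6f      /* JAL */
        || opc == 0x67      /* JALR */
        || opc == 0x73;     /* SYSTEM (ecall and friends) */
}

/* Count how many 4-byte instructions form the block starting at pc. */
static size_t tb_length(const uint32_t *code, size_t n, uint64_t pc)
{
    assert(code != NULL);
    size_t count = 0;
    while (count < n && count < MAX_INSNS) {
        uint32_t insn = code[count++];
        pc += 4;
        if (ends_block(insn)) {
            break;                          /* control flow may change here */
        }
        if ((pc & (PAGE_SIZE - 1)) == 0) {
            break;                          /* next instruction is on a new page */
        }
    }
    return count;
}
```

For example, two `addi` instructions followed by a `beq` form a block of three instructions, while a block that starts four bytes before a page boundary contains only one.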
  • 19. Decode the instruction in guest binary decode_RV32_64G() @ target/riscv/translate.c 19
  • 20. Generate the TCG-IR first! (E.g. arithmetic instr.) gen_arith() @ target/riscv/translate.c ● The guest instruction needs to be implemented in terms of TCG variables ● TCG variables are declared and can be assigned values from the architecture state or from constants ● The TCG frontend ops operate on these TCG variables.
  • 21. Generate the TCG-IR first! (E.g. arithmetic instr.) tcg_gen_? @ tcg/tcg-op.h & tcg/tcg-op.c ● tcg_emit_op() ○ This function allocates space for an op from TCGContext and inserts it into the ops linked list ● After the space is allocated, the TCG opcode and arguments are filled into it. ● The generated TCG instructions will be shown later ...
  • 23. Generate the TCG-IR first! (E.g. branch instr.) gen_branch() @ target/riscv/translate.c ● The label also needs to be generated in TCG-IR form ● gen_goto_tb() ○ Jump into the specific translation block! ● Encountering a branch means the end of the tb. ○ ctx->bstate is set to break the outer while loop in gen_intermediate_code()
  • 24. How Block Chaining works? ● The Patch Point: a slot that waits for patching. If not patched yet, it just jumps to the next instruction ● The location of this slot is recorded in tb->jmp_target_arg when generating host machine code ● The tb being generated patches the last executed tb
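The "allocate an op, link it into the list, then fill in opcode and arguments" pattern might look roughly like this. It is a simplified sketch: `Ctx`, `Op`, `emit_op()`, `gen_op3()` and the three opcodes are hypothetical stand-ins, not the real TCGContext or TCG op set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_LD, OP_ADD, OP_ST };   /* made-up opcodes for the sketch */
#define MAX_OPS 64

typedef struct Op {
    int opc;
    int64_t args[3];
    struct Op *next;             /* ops form a singly linked list */
} Op;

typedef struct {
    Op pool[MAX_OPS];            /* storage owned by the context */
    size_t nb_ops;
    Op *first, *last;
} Ctx;

/* Allocate one op from the context and append it to the ops list. */
static Op *emit_op(Ctx *s, int opc)
{
    assert(s->nb_ops < MAX_OPS);
    Op *op = &s->pool[s->nb_ops++];
    op->opc = opc;
    op->next = NULL;
    if (s->last != NULL) {
        s->last->next = op;
    } else {
        s->first = op;
    }
    s->last = op;
    return op;
}

/* Front-end style wrapper: emit the op, then fill in its arguments. */
static void gen_op3(Ctx *s, int opc, int64_t a0, int64_t a1, int64_t a2)
{
    Op *op = emit_op(s, opc);
    op->args[0] = a0;
    op->args[1] = a1;
    op->args[2] = a2;
}
```

Keeping the ops in a list rather than emitting host code directly leaves room for later passes to walk, rewrite, or delete ops before code generation.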
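Block chaining can be modelled without real machine code by treating the patchable slot as a pointer that is either empty or points at the next block. The sketch below is an assumption-laden model: `jmp_dest`, `tb_add_jump_sketch()` and `run_chain()` are illustrative names, whereas Qemu patches the target of a jump instruction inside the generated host code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct TB {
    uint64_t pc;
    struct TB *jmp_dest[2];   /* exit slots; NULL means "not patched yet" */
} TB;

/* Patch exit slot n of `from` so that it jumps straight to `to`. */
static void tb_add_jump_sketch(TB *from, int n, TB *to)
{
    assert(n == 0 || n == 1);
    if (from->jmp_dest[n] == NULL) {
        from->jmp_dest[n] = to;
    }
}

/*
 * Follow exit slot 0 from block to block. Returns the block whose slot is
 * unpatched, i.e. the point where control would go back to the dispatcher
 * (or the next block to run if the budget of max_blocks is used up).
 * *executed counts the blocks that ran without a trip back to the dispatcher.
 */
static TB *run_chain(TB *tb, unsigned max_blocks, unsigned *executed)
{
    *executed = 0;
    while (*executed < max_blocks) {
        (*executed)++;
        if (tb->jmp_dest[0] == NULL) {
            return tb;            /* unpatched slot: leave the code cache */
        }
        tb = tb->jmp_dest[0];     /* patched slot: direct jump */
    }
    return tb;
}
```

Before patching, every block costs one round trip through the dispatcher; after patching, a whole chain runs with a single entry and exit.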
  • 25. Translate TCG-IR to x86_64 binary code 25
  • 26. TCG-IR to x86_64 translation ● Before entering the backend … 26
  • 27. TCG-IR to x86_64 translation ● How does QEMU generate binary code? RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: Store tmp0 = tmp0 + 1156(0x484) Load s2 to tmp0 Load s3 to tmp1 Store tmp1 back to address tmp0
  • 28. TCG-IR to x86_64 translation ● Let’s take a look at some examples: add RISC-V 32bit TCG-IR x86_64 28
  • 29. TCG-IR to x86_64 translation ● Let’s take a look at some examples: add RISC-V 32bit TCG-IR x86_64 Architecture States Data Structure 29 Pointer
  • 30. TCG-IR to x86_64 translation ● Let’s take a look at some examples: add RISC-V 32bit TCG-IR x86_64 Architecture States Data Structure 30 General Purpose Registers Floating Point Registers
  • 31. TCG-IR to x86_64 translation ● Let’s take a look at some examples: add RISC-V 32bit TCG-IR x86_64 Load data from architecture state a5 reg. 31
  • 32. TCG-IR to x86_64 translation ● Let’s take a look at some examples: add RISC-V 32bit TCG-IR x86_64 tmp0 = tmp0 + tmp1 Load data from architecture state a5 32
  • 33. TCG-IR to x86_64 translation ● Let’s take a look at some examples: add RISC-V 32bit TCG-IR x86_64 tmp0 = tmp0 + tmp1 Load data from architecture state a5 Store tmp0 back to architecture state a5 reg 33
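The three annotated steps (load from the architecture state, add, store back) correspond to the following C sketch. `GuestState` and `emulate_add()` are hypothetical names; the generated x86_64 code does the same thing with loads and stores relative to a pointer to the architecture state.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, heavily simplified architecture state. */
typedef struct {
    uint64_t gpr[32];   /* general purpose registers x0..x31 */
    uint64_t pc;
} GuestState;

/* Emulate "add rd, rs1, rs2" the way the TCG-IR sequence does. */
static void emulate_add(GuestState *env, int rd, int rs1, int rs2)
{
    assert(rd >= 0 && rd < 32 && rs1 >= 0 && rs1 < 32 && rs2 >= 0 && rs2 < 32);
    uint64_t tmp0 = env->gpr[rs1];   /* load data from architecture state */
    uint64_t tmp1 = env->gpr[rs2];
    tmp0 = tmp0 + tmp1;              /* tmp0 = tmp0 + tmp1 */
    if (rd != 0) {                   /* x0 is hard-wired to zero */
        env->gpr[rd] = tmp0;         /* store tmp0 back to architecture state */
    }
}
```

With a5 being x15 and a4 being x14, `emulate_add(&env, 15, 15, 14)` models `add a5, a5, a4`.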
  • 34. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: load Load data from address ‘tmp0’ to tmp1 34
  • 35. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: load Store tmp1 back to archi. state a4 reg Load data from address ‘tmp0’ to tmp1 35
  • 36. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: Store Load s2 to tmp0 36
  • 37. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: Store tmp0 = tmp0 + 1156(0x484) Load s2 to tmp0 37
  • 38. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: Store tmp0 = tmp0 + 1156(0x484) Load s2 to tmp0 Load from s3 to tmp1 38
  • 39. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: Store tmp0 = tmp0 + 1156(0x484) Load s2 to tmp0 Load s3 to tmp1 Store tmp1 to address tmp0
  • 40. TCG-IR to x86_64 translation RISC-V 32bit TCG-IR x86_64 ● Let’s take a look at some examples: Store tmp0 = tmp0 + 1156(0x484) Load s2 to tmp0 Load s3 to tmp1 Store tmp1 back to address tmp0 ● How does QEMU handle branch instructions?
  • 41. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Branch ○ Direct Branch ■ Conditional Branch beqz rs, offset / bgt rs, rt, offset / ….. ■ Unconditional Branch j offset / jal offset / call offset / ... ○ Indirect Branch ■ Switch/Case → Branch table ■ Indirect function call ■ Return Instructions (ret) 41
  • 42. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Unconditional) RISC-V 32bit TCG-IR x86_64 42
  • 43. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Unconditional) RISC-V 32bit TCG-IR x86_64 Remind: Patch point for block chaining 43
  • 44. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Unconditional) RISC-V 32bit TCG-IR x86_64 Synchronize program counter to architecture states 44
  • 45. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Unconditional) RISC-V 32bit TCG-IR x86_64 Synchronize program counter Prepare return value 45
  • 46. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Unconditional) RISC-V 32bit TCG-IR x86_64 Synchronize program counter Prepare return value Go back to QEMU to find next Translation Block 46
  • 47. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Unconditional) RISC-V 32bit TCG-IR x86_64 Reminder: patch point ● Block Chaining: link the current TB to the previous TB by patching the jump target address
  • 48. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Conditional) 48
  • 49. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Conditional) 49
  • 50. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Conditional) 50
  • 51. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Conditional) If the block is chained by QEMU, jump to target translation block 51
  • 52. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Direct Branch (Conditional) Go back to QEMU to find next Translation Block 52
  • 53. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Indirect Branch - return instruction RISC-V 32bit TCG-IR x86_64 53
  • 54. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Indirect Branch - return instruction RISC-V 32bit TCG-IR x86_64 54 Store the value from return address register to program counter
  • 55. TCG-IR to x86_64 translation ● Let’s take a look at some examples: Indirect Branch - return instruction RISC-V 32bit TCG-IR x86_64 Go back to QEMU to find next Translation Block in Program Counter 55
  • 56. TCG-IR to x86_64 translation ● helper function call ○ QEMU provides a ‘hook’ for developers to write emulation behavior in C language. Commonly used in: ■ Emulate hardware not supported in host machine ● e.g. Hardware FP, SIMD, AES, etc. ■ Dynamic Instrumentation ● Collect runtime information from source program to analyze program’s behavior (e.g. Dynamic call graph / control flow graph) 56
  • 57. TCG-IR to x86_64 translation ● helper function call - example call helper_function return result (%rax) to ‘fa5’ 57
  • 58. TCG-IR to x86_64 translation ● helper - Emulate IEEE 754 floating point add with double precision 58
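A helper is an ordinary C function that the translated code calls with guest register values as arguments. The sketch below shows the shape of such a helper for a double-precision add. It is an assumption-based simplification: the names are hypothetical, and it uses the host's `double` arithmetic instead of a software floating point library, so rounding modes and exception flags are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Guest FP registers are treated as raw 64-bit patterns. */
static uint64_t f64_to_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

static double bits_to_f64(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Hypothetical helper: the generated code would call this and store
 * the returned bit pattern into the destination FP register. */
static uint64_t helper_fadd_d_sketch(uint64_t a, uint64_t b)
{
    assert(sizeof(double) == sizeof(uint64_t));
    return f64_to_bits(bits_to_f64(a) + bits_to_f64(b));
}
```

Passing bit patterns rather than `double` values keeps the helper's interface independent of how the host passes floating point arguments.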
  • 59. TCG-IR to x86_64 translation ● Prologue, epilogue - The entry point for each Translation Block 59
  • 60. TCG-IR to x86_64 translation ● Prologue, epilogue ○ Decide whether current TB can be executed 60
  • 61. TCG-IR to x86_64 translation ● Prologue, epilogue How to execute translated binary code? 61
  • 62. x86_64 code execution ● Entry point - qemu-riscv/accel/tcg/cpu-exec.c ● tcg_qemu_tb_exec: a pointer to the head of the translation block
  • 63. x86_64 code execution ● Entry point - qemu-riscv/accel/tcg/cpu-exec.c 63 Put env to %rdi, tb_ptr to %rsi
  • 64. x86_64 code execution ● Entry point - qemu-riscv/accel/tcg/cpu-exec.c 64 Move %rdi to %r14, and jump %rsi
  • 65. x86_64 code execution ● Entry point - qemu-riscv/accel/tcg/cpu-exec.c 65 Move %rdi to %r14, and jump %rsi
  • 66. x86_64 code execution ● Entry point - qemu-riscv/accel/tcg/cpu-exec.c 66 Exit code cache and go back to QEMU
  • 67. x86_64 code execution ● Entry point - qemu-riscv/accel/tcg/cpu-exec.c ● Return the value to ret and record the last TB we just executed.
  • 68. Reference ● Qemu JIT Code Generator and System Emulation ● QEMU - Binary Translation ● QEMU TCG Frontend Ops ● RISC-V Instruction Set Manual ● Doraemon’s Notes: QEMU Backend ○ Written in Mandarin Chinese
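One way a single return value can carry both "which TB was executed last" and "through which exit it left" is to pack the exit slot number into the low bits of the TB pointer. The sketch below illustrates that packing; the names `EXIT_MASK`, `encode_exit()`, `exit_tb()` and `exit_slot()` are assumptions for the example, and it relies on `TB` objects being at least 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

#define EXIT_MASK 3u           /* low two bits hold the exit slot */

typedef struct TB {
    uint64_t pc;               /* uint64_t member gives 8-byte alignment */
} TB;

/* What the end of a translated block would place in the return register. */
static uintptr_t encode_exit(const TB *tb, unsigned slot)
{
    assert(((uintptr_t)tb & EXIT_MASK) == 0);   /* pointer must be aligned */
    assert(slot <= EXIT_MASK);
    return (uintptr_t)tb | slot;
}

/* What the dispatcher would do with the returned value. */
static TB *exit_tb(uintptr_t ret)
{
    return (TB *)(ret & ~(uintptr_t)EXIT_MASK);
}

static unsigned exit_slot(uintptr_t ret)
{
    return (unsigned)(ret & EXIT_MASK);
}
```

Knowing the last TB and its exit slot is exactly what the dispatcher needs in order to patch that slot once the next TB has been found or generated.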