In this article, we will learn about the EVM, short for Ethereum Virtual Machine, which you have probably heard of if you have ever developed a smart contract on the Ethereum blockchain network. A virtual machine is a layer of abstraction between the executing code and the machine that executes it. This layer improves software portability and keeps applications isolated from each other and from the underlying host.

## Prerequisites

Some basic familiarity with common computer science terms, such as bytes[1], memory[2], and stacks[3], is necessary to understand the EVM.

It will also be helpful if you have a basic knowledge of the blockchain[4] and the Ethereum network[5].

## What is the Ethereum Virtual Machine

The term “distributed ledger” is often used to describe blockchains like Bitcoin, which enable a decentralized currency using fundamental tools of cryptography. While Ethereum has its own native cryptocurrency (Ether) that follows almost exactly the same intuitive rules, it also provides something much more powerful: the “Ethereum Virtual Machine,” a state machine that can execute arbitrary machine code.

### Ethereum Virtual Machine architecture

The EVM is often described as a quasi-Turing-complete state machine: its instruction set can express arbitrary computation, but every execution is limited to a finite number of computational steps by the gas available to it. This differs from Bitcoin, whose stack-based Script language is deliberately Turing-incomplete.

The code in Ethereum contracts is written in a low-level, stack-based bytecode language. The code consists of a series of bytes, where each byte represents an operation (apart from the immediate data bytes that follow PUSH instructions). The operations have access to three types of space in which to store data:

• The stack, a last-in-first-out container onto which values can be pushed and from which they can be popped
• Memory, an infinitely expandable byte array that only lives for the duration of the execution
• The contract's long-term storage, a key/value store. Unlike the stack and memory, which reset after computation ends, storage persists for the long term.
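The three data areas above can be modeled in a few lines of Python. This is a minimal sketch, not how any real client is implemented: the `DataAreas` class and its method names are made up for illustration, and gas accounting and word-aligned memory growth are left out.

```python
from dataclasses import dataclass, field

WORD_BYTES = 32         # the EVM works on 256-bit (32-byte) words
MAX_STACK_DEPTH = 1024  # maximum number of items on the EVM stack


@dataclass
class DataAreas:
    """Toy model (hypothetical) of the places EVM code can keep data."""
    stack: list = field(default_factory=list)            # LIFO, per execution
    memory: bytearray = field(default_factory=bytearray)  # grows on demand, per execution
    storage: dict = field(default_factory=dict)           # key/value, persistent

    def push(self, value: int) -> None:
        if len(self.stack) >= MAX_STACK_DEPTH:
            raise OverflowError("stack limit reached")
        self.stack.append(value % 2**256)

    def pop(self) -> int:
        return self.stack.pop()

    def _grow(self, end: int) -> None:
        # Memory expands with zero bytes whenever an access goes past its end.
        if end > len(self.memory):
            self.memory.extend(bytes(end - len(self.memory)))

    def mstore(self, offset: int, value: int) -> None:
        self._grow(offset + WORD_BYTES)
        word = (value % 2**256).to_bytes(WORD_BYTES, "big")
        self.memory[offset:offset + WORD_BYTES] = word

    def mload(self, offset: int) -> int:
        self._grow(offset + WORD_BYTES)
        return int.from_bytes(self.memory[offset:offset + WORD_BYTES], "big")

    def end_of_execution(self) -> None:
        # Stack and memory are discarded; storage survives.
        self.stack.clear()
        self.memory.clear()
```

After `end_of_execution()` is called, only `storage` still holds data, which mirrors the difference between short-term and long-term space described in the list.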

### Opcode

Smart contract languages like Solidity cannot be executed by the EVM directly. Instead, they are compiled into low-level machine instructions. Under the hood, the EVM uses opcodes, a set of instructions that each perform a specific task. At the time of writing, there are around 140 unique opcodes. Together, these opcodes make the EVM Turing-complete, which means it can compute anything that is computable, given enough gas. Most opcodes fall into categories such as the following:

• Stack-manipulating opcodes: PUSH, POP, DUP, SWAP
• Arithmetic/comparison/bitwise opcodes: ADD, SUB, GT, LT, AND, OR
• Environmental opcodes: CALLER, CALLVALUE, NUMBER
• Memory-manipulating opcodes: MLOAD, MSTORE, MSTORE8, MSIZE
• Program counter related opcodes: JUMP, JUMPI, PC, JUMPDEST
• Halting opcodes: STOP, RETURN, REVERT, INVALID, SELFDESTRUCT
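To see how bytes map to the opcodes listed above, here is a small disassembler sketch. The opcode table is deliberately partial (it covers the opcodes named in this article plus a few more), and the helper names `mnemonic` and `disassemble` are illustrative rather than part of any library.

```python
from typing import List

# Partial opcode table: byte value -> mnemonic.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x03: "SUB", 0x10: "LT", 0x11: "GT",
    0x15: "ISZERO", 0x16: "AND", 0x17: "OR",
    0x33: "CALLER", 0x34: "CALLVALUE", 0x39: "CODECOPY", 0x43: "NUMBER",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE", 0x53: "MSTORE8",
    0x54: "SLOAD", 0x55: "SSTORE", 0x56: "JUMP", 0x57: "JUMPI",
    0x58: "PC", 0x59: "MSIZE", 0x5B: "JUMPDEST",
    0xF3: "RETURN", 0xFD: "REVERT", 0xFE: "INVALID", 0xFF: "SELFDESTRUCT",
}


def mnemonic(op: int) -> str:
    """Return the name of an opcode byte; PUSH/DUP/SWAP are numbered ranges."""
    if 0x60 <= op <= 0x7F:
        return f"PUSH{op - 0x5F}"
    if 0x80 <= op <= 0x8F:
        return f"DUP{op - 0x7F}"
    if 0x90 <= op <= 0x9F:
        return f"SWAP{op - 0x8F}"
    return OPCODES.get(op, f"UNKNOWN_0x{op:02x}")


def disassemble(code: bytes) -> List[str]:
    """Turn raw bytecode into a list of readable instructions."""
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if 0x60 <= op <= 0x7F:
            # PUSHn is followed by n bytes of immediate data, not opcodes.
            n = op - 0x5F
            out.append(f"{mnemonic(op)} 0x{code[pc:pc + n].hex()}")
            pc += n
        else:
            out.append(mnemonic(op))
    return out


print(disassemble(bytes.fromhex("6009600055")))
# → ['PUSH1 0x09', 'PUSH1 0x00', 'SSTORE']
```

The five example bytes push the value 9 and the key 0 onto the stack and then write them to storage.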

## How the EVM works

### EVM execution

The formal execution model of EVM code is simple. While the Ethereum Virtual Machine is running, its full computational state can be defined by the tuple (block_state, transaction, message, code, memory, stack, pc, gas), where block_state is the global state containing all accounts, including their balances and storage. At the start of every round of execution, the current instruction is found by taking the pc-th byte of code (or 0 if pc >= len(code)), and each instruction has its own definition in terms of how it affects the tuple. Although there are many ways to optimize Ethereum Virtual Machine execution via just-in-time compilation, a basic implementation of Ethereum can be done in a few hundred lines of code.
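The fetch-and-execute round described above can be sketched as a small loop. This is a simplified illustration under several assumptions: only four opcodes are handled, a flat placeholder cost of 1 gas per instruction replaces the real gas schedule, a plain `storage` dict stands in for block_state, and the transaction and message fields are omitted. The names `Machine`, `step` and `run` are made up for this example.

```python
from dataclasses import dataclass, field

GAS_PER_STEP = 1  # placeholder: real gas costs differ per opcode


class OutOfGas(Exception):
    pass


@dataclass
class Machine:
    code: bytes
    gas: int
    pc: int = 0
    stack: list = field(default_factory=list)
    memory: bytearray = field(default_factory=bytearray)
    storage: dict = field(default_factory=dict)  # stands in for block_state


def step(m: Machine) -> bool:
    """Execute one instruction; return False once execution halts."""
    # Fetch the pc-th byte of code, or 0 (STOP) when pc is past the end.
    op = m.code[m.pc] if m.pc < len(m.code) else 0x00
    if op == 0x00:  # STOP
        return False
    if m.gas < GAS_PER_STEP:
        raise OutOfGas()
    m.gas -= GAS_PER_STEP
    m.pc += 1
    if op == 0x01:  # ADD
        a, b = m.stack.pop(), m.stack.pop()
        m.stack.append((a + b) % 2**256)
    elif op == 0x60:  # PUSH1: the next byte is immediate data
        m.stack.append(m.code[m.pc] if m.pc < len(m.code) else 0)
        m.pc += 1
    elif op == 0x55:  # SSTORE: the key is on top, the value below it
        key, value = m.stack.pop(), m.stack.pop()
        m.storage[key] = value
    else:
        raise NotImplementedError(f"opcode 0x{op:02x} is not in this sketch")
    return True


def run(code: bytes, gas: int) -> Machine:
    m = Machine(code=code, gas=gas)
    while step(m):
        pass
    return m


result = run(bytes.fromhex("6009600055"), gas=10)
print(result.storage, result.gas)  # → {0: 9} 7
```

Because every instruction consumes gas, a program can never run forever: calling `run` with too little gas raises `OutOfGas` instead of finishing.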

### Compile Solidity to EVM Bytecode

In this section, we will look in detail at how Solidity source code is compiled to bytecode. First, we need to install the Solidity compiler.

```shell
$ npm install -g solc
```

Then, we write a simple smart contract in Solidity.

```solidity
// EVMExample.sol
pragma solidity ^0.8.3;

contract EVMExample {
    uint256 number = 9;
}
```

EVMExample is a simple smart contract that declares a uint256 variable with a value of 9. We compile it by executing the command:

```shell
$ solcjs -o bytescode --bin ./EVMExample.sol
```

```
// bytescode/EVM_sol_EVMExample.bin
60806040526009600055348015601457600080fd5b50603f8060226000396000f3fe6080604052600080fdfea26469706673582212201fdff2ede0793f052591ede6217c413130c4545276017d660c5c9b1c6642343e64736f6c634300080d0033
```

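We can split the resulting bytecode into three parts:

1. Constructor

60806040526009600055348015601457600080fd5b50603f8060226000396000f3fe

2. Runtime

6080604052600080fdfe

3. Metadata

a26469706673582212201fdff2ede0793f052591ede6217c413130c4545276017d660c5c9b1c6642343e64736f6c634300080d0033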
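The split above can be reproduced programmatically. The sketch below relies on two observations about this particular bytecode: the constructor pushes the offset 0x22 (the bytes 60 22) just before CODECOPY, which is where the runtime code starts, and the last two bytes of the output give the length of the trailing metadata. The function name `split_bytecode` is hypothetical, and the approach is not a general-purpose bytecode analyzer.

```python
BYTECODE = bytes.fromhex(
    "60806040526009600055348015601457600080fd5b50603f8060226000396000f3fe"
    "6080604052600080fdfe"
    "a26469706673582212201fdff2ede0793f052591ede6217c413130c4545276017d660c5c9b1c6642343e64736f6c634300080d0033"
)

# Offset passed to CODECOPY by the constructor (PUSH1 0x22) for this contract.
RUNTIME_OFFSET = 0x22


def split_bytecode(code: bytes, runtime_offset: int):
    """Return (constructor, runtime, metadata) for bytecode laid out as above."""
    # The final two bytes hold the big-endian length of the encoded metadata.
    metadata_length = int.from_bytes(code[-2:], "big")
    metadata_start = len(code) - 2 - metadata_length
    constructor = code[:runtime_offset]
    runtime = code[runtime_offset:metadata_start]
    metadata = code[metadata_start:]
    return constructor, runtime, metadata


constructor, runtime, metadata = split_bytecode(BYTECODE, RUNTIME_OFFSET)
print(constructor.hex())
print(runtime.hex())      # → 6080604052600080fdfe
print(metadata.hex())
```

The runtime code and the metadata together are 0x3f bytes long, which matches the length the constructor pushes (60 3f) before copying the code that will be stored on chain.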

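At the end of the bytecode is a CBOR-encoded metadata section created by the Solidity compiler. It contains a hash of the contract's metadata file. Older compiler versions embedded a Swarm hash here, while the bytecode above, produced by solc 0.8.13, carries an IPFS hash instead. Both Swarm and IPFS are decentralized file storage systems. Although the metadata section is part of the runtime bytecode, the EVM never interprets it as opcodes because execution can never reach its location. For this compiler version, Solidity uses the following format:

0xa2 0x64 ‘i’ ‘p’ ‘f’ ‘s’ 0x58 0x22 [34 bytes IPFS hash] 0x64 ‘s’ ‘o’ ‘l’ ‘c’ 0x43 [3 bytes version encoding] 0x00 0x33

In this case, the IPFS hash is:

12201fdff2ede0793f052591ede6217c413130c4545276017d660c5c9b1c6642343e

The bytes that follow, 64736f6c634300080d, encode the key solc and the compiler version 0.8.13. The final 0033 is the length in bytes (0x33 = 51) of the encoded metadata that precedes it.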
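The trailing metadata bytes can be decoded by hand to check what they contain. The following is a minimal sketch of a CBOR reader that handles only what appears in this bytecode: a small map whose keys are short text strings and whose values are byte strings. It is not a general CBOR decoder, and `read_item` and `decode_metadata` are illustrative names.

```python
METADATA = bytes.fromhex(
    "a26469706673582212201fdff2ede0793f052591ede6217c413130c4545276017d660c5c9b1c6642343e64736f6c634300080d0033"
)


def read_item(data: bytes, pos: int):
    """Decode one CBOR text or byte string; return (value, next_position)."""
    head = data[pos]
    major, info = head >> 5, head & 0x1F
    pos += 1
    if info < 24:        # length stored directly in the head byte
        length = info
    elif info == 24:     # length stored in the following byte (e.g. 0x58 0x22)
        length = data[pos]
        pos += 1
    else:
        raise ValueError("length encoding not handled in this sketch")
    raw = data[pos:pos + length]
    if major == 3:       # text string
        return raw.decode("utf-8"), pos + length
    if major == 2:       # byte string
        return raw, pos + length
    raise ValueError("item type not handled in this sketch")


def decode_metadata(blob: bytes) -> dict:
    """Decode the CBOR map that sits just before the 2-byte length suffix."""
    length = int.from_bytes(blob[-2:], "big")
    cbor = blob[-2 - length:-2]
    if cbor[0] >> 5 != 5:
        raise ValueError("expected a CBOR map")
    entries = cbor[0] & 0x1F
    pos = 1
    result = {}
    for _ in range(entries):
        key, pos = read_item(cbor, pos)
        value, pos = read_item(cbor, pos)
        result[key] = value
    return result


meta = decode_metadata(METADATA)
print(list(meta))                               # → ['ipfs', 'solc']
print(".".join(str(b) for b in meta["solc"]))   # → 0.8.13
print(meta["ipfs"].hex())
```

Decoding the example yields two keys, `ipfs` and `solc`. The `ipfs` value starts with 1220, the multihash prefix for a 32-byte SHA-256 digest.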

The metadata file contains information about the contract, such as the compiler version and settings and the contract’s ABI, which describes its functions.

## Conclusion

Ethereum provides a decentralized ecosystem in which developers can deploy and execute smart contracts on the EVM and build decentralized applications on top of them. Although executing a smart contract on the EVM can be far more expensive than running a program on a traditional server, the EVM has a clear advantage: the same bytecode runs deterministically on every node in the network.

## References

[1] Byte, Wikipedia, accessed March 13th, 2022.

[2] Computer memory, Wikipedia, accessed March 13th, 2022.

[3] Stack (abstract data type), Wikipedia, accessed March 13th, 2022.

[4] Blockchain, Wikipedia, accessed March 13th, 2022.

[5] Ethereum, Wikipedia, accessed March 13th, 2022.

[6] Ethereum Virtual Machine (EVM), Ethereum documents, accessed March 13th, 2022.

[7] Ethereum EVM illustrated, Ethereum documents, accessed March 13th, 2022.

[8] The Ethereum Virtual Machine — How does it work?, accessed March 13th, 2022.