cancelOrder() function. You'll learn more about smart contracts and transaction details in later chapters, but for now, be aware that blockchain technology keeps a record of every change to blockchain data, which makes it a great source of analytics data.
FIGURE 3-4: Exploring additional transaction details in Etherscan.
Dissecting the parts of a block
Before you can start extracting data from a blockchain for analysis, you need to learn a little more about how the data you want is stored. Most of that data is stored in blocks, so that's what I discuss next.
Yes, that’s right! Most but not all data you’ll want for analytics is stored in blocks. Some blockchains, including Ethereum, store some data in an external, or off-chain, database. Don’t worry; I describe off-chain data too.
I describe only the basic Ethereum block and chain details. The authoritative reference for Ethereum internals is the Ethereum yellow paper, at https://ethereum.github.io/yellowpaper/paper.pdf. You can also find a good third-party detailed discussion of Ethereum block structure internals at https://ethereum.stackexchange.com/questions/268/ethereum-block-architecture.
A block is a data structure that contains two main sections: a header and a body. Transactions are added to the body, and the block is then submitted to the blockchain network. Miners take the blocks and try to solve a mathematical puzzle to win a prize. Miners are just nodes, or pools of nodes, with enough computational power to calculate block hashes many times to solve the puzzle.
In Ethereum, the mining process uses the submitted block header and an arbitrary number called a nonce (number used once). The miner chooses a value for the nonce, which is part of the block header, and calculates a hash value using a hash function on the block header. The result has to match an agreed-upon pattern. The network adjusts how hard that pattern is to match, making it more difficult as miners get faster at mining blocks. If the first mining result doesn’t match the pattern, the miner chooses another nonce and calculates a hash on the new block header. This process continues until a miner finds a nonce that results in a hash that matches the pattern.
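The nonce search described above can be sketched in a few lines of Python. This is only a simplified illustration, not Ethereum's actual mining algorithm: I assume that "matching the pattern" means the hash, read as a number, falls below a target, and I use Python's built-in SHA3-256 as a stand-in hash function. The function name mine and the difficulty_bits parameter are hypothetical.

```python
import hashlib


def mine(header_without_nonce: bytes, difficulty_bits: int, max_tries: int = 1_000_000):
    """Try nonces in order until the header hash falls below the target.

    Assumption: the "pattern" is modeled as "hash value < target".
    A larger difficulty_bits means a smaller target and more tries on average.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_tries):
        # The nonce is part of the data being hashed, so each new nonce
        # produces a completely different hash value.
        candidate = header_without_nonce + nonce.to_bytes(8, "big")
        digest = hashlib.sha3_256(candidate).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # gave up without finding a matching nonce


result = mine(b"example block header", difficulty_bits=12)
if result is not None:
    nonce, digest_hex = result
    print(f"nonce={nonce} hash={digest_hex}")
```

Raising difficulty_bits by 1 roughly doubles the average number of nonces the loop has to try, which is the knob a network can turn to keep mining from getting too fast.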
The miner that finds the solution broadcasts that solution to the rest of the network. That miner collects a reward, in ETH (ether), for doing the hard work to validate the block. Because many miners work on blocks at the same time, it's common for several miners to solve the hash puzzle at almost the same time, but only one of their blocks is accepted. In other blockchains, the remaining blocks are discarded as orphans. In Ethereum, these blocks are called uncles. An uncle block is any successfully mined block that arrives after another block for the same position has already been accepted. Ethereum accepts uncle blocks and even provides a reward to their miners, but that reward is smaller than the reward for the accepted block.
Ethereum rewards miners that solve uncle blocks to reduce mining centralization and to increase the security of the blockchain. Uncle rewards provide an incentive for smaller miners to participate. Otherwise, mining would be profitable only for large pools that could eventually take over all mining. Encouraging more miners to participate also increases security by increasing the overall work carried out on the entire blockchain.
The header of a block contains data that describes the block, and the body contains all transactions stored in a block. Figure 3-5 shows the contents of an Ethereum block header.
Ethereum uses the Keccak-256 algorithm to produce all hash values. The National Institute of Standards and Technology (NIST) Secure Hash Algorithm 3 (SHA-3) is a subset of the Keccak family of algorithms. Ethereum was introduced before the SHA-3 standard was finalized, so Ethereum's Keccak-256 does not follow the official SHA-3 standard and produces different hash values for the same input.
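You can see the difference for yourself with a quick check. The following sketch hashes an empty input with Python's built-in SHA3-256 (the finalized NIST standard) and compares the result to the widely published Keccak-256 hash of the same empty input. Python's standard library doesn't include the original Keccak-256, so I hard-code that value as a constant; the constant name is my own.

```python
import hashlib

# Widely published Keccak-256 hash of the empty input (the variant Ethereum uses).
KECCAK_256_EMPTY = "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"

# hashlib implements the finalized NIST SHA-3 standard, not the original Keccak.
sha3_256_empty = hashlib.sha3_256(b"").hexdigest()

print(sha3_256_empty)
print(sha3_256_empty == KECCAK_256_EMPTY)  # False: same input, different hash
```

The practical lesson for analytics work: if you need to reproduce an Ethereum hash, make sure your library offers Keccak-256 specifically, not just SHA3-256.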
FIGURE 3-5: Ethereum block header.
Each Ethereum block header contains information that defines and describes the block, and records its place in the blockchain. The block header contains the following fields:
Previous hash: The hash value of the previous block’s header, where the previous block is the last block on the blockchain when the current block gets added.
Nonce: A number that causes the hash value of the current block’s header to adhere to a specific pattern. If you change this value (or any header value), the hash of the header changes.
Timestamp: The date and time the current block was created.
Uncles hash: The hash value of the current block’s list of uncle blocks, which are stale blocks that were successfully mined but arrived just after the accepted block was added to the blockchain.
Beneficiary: The miner’s account that receives the reward for mining this block.
Logs bloom: Logging information, stored in a Bloom filter (a data structure useful for quickly finding out if some element is a member of a set).
Difficulty: The difficulty level for mining this block.
Extra data: Any extra data used to describe this block. Miners can put any data here they want, or they can leave it blank. For example, some miners write data that they can use to identify blocks they mined.
Block number: The unique number for this block (assigned sequentially).
Gas limit: The limit of gas for this block. (You learn about gas later in this chapter.)
Gas used: The amount of gas used by transactions in this block.
Mix hash: A hash value combined with the nonce value to show that the mined nonce meets the difficulty requirements. This hash makes it more difficult for attackers to modify the block.
State root: The hash value of the root node of the state trie after the block's transactions have been applied. A trie is a data structure that efficiently stores data for quick retrieval. The state trie records the state of every account (such as its balance), so the state root summarizes the result of the block's transactions without your having to look at the transactions themselves.
Transaction root: The hash value of the root node of the trie that stores all transactions for this block.
Receipt root: The hash value of the root node of the trie that stores all transaction receipts for this block.
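To make the header fields a little more concrete, here's a minimal Python sketch that models the fields in the preceding list and shows how the previous hash links one block to the next. Treat it as an illustration only: the class and function names are hypothetical, every field value is a placeholder, and I assume JSON plus SHA3-256 as stand-ins for Ethereum's real header encoding and Keccak-256.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class BlockHeader:
    previous_hash: str
    nonce: int
    timestamp: int
    uncles_hash: str
    beneficiary: str
    logs_bloom: str
    difficulty: int
    extra_data: str
    block_number: int
    gas_limit: int
    gas_used: int
    mix_hash: str
    state_root: str
    transaction_root: str
    receipt_root: str


def header_hash(header: BlockHeader) -> str:
    # Assumption: JSON + SHA3-256 stand in for Ethereum's own encoding and Keccak-256.
    encoded = json.dumps(asdict(header), sort_keys=True).encode("utf-8")
    return hashlib.sha3_256(encoded).hexdigest()


def links_to(child: BlockHeader, parent: BlockHeader) -> bool:
    """A child block points at its parent by storing the parent's header hash."""
    return child.previous_hash == header_hash(parent)


EMPTY = "0" * 64  # placeholder hash value
parent = BlockHeader(
    previous_hash=EMPTY, nonce=42, timestamp=1_600_000_000, uncles_hash=EMPTY,
    beneficiary="0x" + "0" * 40, logs_bloom="0" * 512, difficulty=1,
    extra_data="", block_number=1, gas_limit=8_000_000, gas_used=21_000,
    mix_hash=EMPTY, state_root=EMPTY, transaction_root=EMPTY, receipt_root=EMPTY,
)
child = replace(parent, previous_hash=header_hash(parent),
                block_number=2, timestamp=1_600_000_015)

print(links_to(child, parent))    # True

# Changing any header value changes the hash, which breaks the link.
tampered = replace(parent, gas_used=0)
print(links_to(child, tampered))  # False
```

Because each header includes the hash of the one before it, changing a single field in an old block would break the link for every block that follows.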
The body of an Ethereum block is essentially a list of transactions (along with the headers of any uncle blocks). Unlike other blockchain implementations, the number of transactions, and as a result the size of the blocks, isn’t fixed. Every transaction has a processing cost associated with it, and each block has a limited budget for those costs. Ethereum blocks can contain lots of transactions that don’t cost much or just a few expensive ones or anything in between. Ethereum's designers built a lot of flexibility into what blocks can contain. Figure 3-6 shows the contents of an Ethereum transaction.
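Here's a small Python sketch of that budget idea. It's a simplified illustration under my own assumptions: the function name fill_block, the transaction names, and the cost figures are all made up for the example, and real nodes use more sophisticated rules to choose transactions.

```python
def fill_block(pending, gas_limit):
    """Include each pending transaction, in order, that still fits in the budget."""
    included, gas_used = [], 0
    for name, gas in pending:
        if gas_used + gas <= gas_limit:
            included.append(name)
            gas_used += gas
    return included, gas_used


GAS_LIMIT = 100_000  # illustrative block budget, not a real network value

# Many inexpensive transactions versus a couple of expensive ones.
cheap = [(f"transfer-{i}", 21_000) for i in range(10)]
expensive = [("contract-call-a", 90_000), ("contract-call-b", 60_000)]

cheap_names, cheap_gas = fill_block(cheap, GAS_LIMIT)
costly_names, costly_gas = fill_block(expensive, GAS_LIMIT)

print(len(cheap_names), cheap_gas)    # 4 84000
print(len(costly_names), costly_gas)  # 1 90000
```

The same budget holds four cheap transactions or a single expensive one, which is why the number of transactions in an Ethereum block varies so much.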