COMP3221
Assignment 2: Blockchain
Due: Friday, Week 11 by 23:59
1 Introduction
In this assignment, you will implement a peer-to-peer (P2P) blockchain system. A blockchain is a chain of blocks (like pages in a ledger), distributed across multiple nodes. Each block contains a set of transactions, which represent changes in ownership of digital assets or other logged information. By completing this project, you will create a network of blockchain nodes that communicate and cooperate to maintain a consistent distributed ledger.
1.1 Learning Objectives
• Understand key blockchain concepts and terminology by implementing them in code.
• Learn how a P2P network model operates and how to design communication between distributed nodes.
• Gain experience in developing a concurrent networked application (with sockets and threads).
• Apply principles of crash fault-tolerant consensus in a practical setting.
• Improve programming skills in networking and distributed system design, with a focus on blockchain mechanics.
2 Program Structure and Implementation Requirements
You may implement your solution in any programming language of your choice. However, your submission must include a wrapper script named Run .sh that serves as the single entry point. For example, if you choose Python, your wrapper script. might look as follows:
1 #!/bin/bash
2 python3 your_solution.py "$@"
You are not permitted to use any libraries outside the core libraries provided for your pro- gramming language. For example, if you were to use Python, you may use libraries such as socket, but not networkx. Using any disallowed libraries will result in an immediate 0 for this assignment. If you are unsure if a specific library is allowed, please ask on Ed.
You are also permitted to use a build script that compiles your program. This script must be called Build .sh, and will execute before any of the test cases are run on your program.
You are permitted to use any libraries that come built into your chosen language, as well as a select range of extra libraries. For the list of allowed libraries, please see Appendix A.
3 Assignment Task
You will implement the software for a single blockchain node (peer). Running multiple instances of your node program will form a P2P network. Each node must act both as a server (listening for incoming requests from peers) and as a client (initiating requests to other peers). Nodes communicate by exchanging structured network messages over TCP connections. The ultimate goal is for all non-faulty nodes in the network to reach consensus on a sequence of blocks, thereby maintaining a consistent blockchain.
Every node will maintain a TCP server on a specified port to accept peer connections. It will also initiate outgoing connections to all other known peer nodes. The network topology is fully connected: each node should have an independent, long-lived TCP connection to every other node in the peer list. Nodes must handle connectivity issues gracefully – for example, if a connection drops or a peer crashes, the node should detect this (via timeouts) and continue operating with the remaining peers.
Each node operates concurrently with multiple threads, for example:
• Server Thread(s): Accept incoming TCP connections from peers. For each connected peer, use a dedicated handler (thread) to receive and process messages from that peer.
• Consensus (Pipeline) Thread: Continuously monitor the node’s transaction pool and coordinate the block proposal and consensus process (detailed below in Section 8).
All nodes in the network will run the same consensus protocol in synchronous rounds to agree on the next block to append. By the end of each consensus round, every correct node should have decided on the same new block to add to its blockchain (assuming not too many nodes have crashed).
3.1 Transaction Pool and Validation
Incoming transactions (from clients or other nodes) are collected in a transaction pool (mempool). The node will store transactions in this pool temporarily until they are included in a new block via the consensus process. Each transaction is a JSON object containing spe- cific fields (see Section 5.2). Upon receiving a transaction (via a peer message), the node must validate it before adding it to its pool. Invalid transactions must be explicitly re- fused. The validation rules are:
• Correct Nonce Sequence: Each transaction has a nonce field, which must equal the number of transactions from the same sender that have already been confirmed on the blockchain (i.e. the sender’s transaction count in the current chain state). For example, if a sender has 5 transactions in the blockchain (nonce 0 through 4), the next transaction’s nonce must be 5. If a received transaction’s nonce is not exactly one greater than the sender’s confirmed nonce (or 0 if the sender has none yet), then the transaction is out-of-order or duplicate and must be rejected. There must be no two different transactions with the same sender and same nonce in the pool. (The nonce is monotonically increasing per sender, starting from 0.)
• Signature Verification: Every transaction is signed by its sender. The signature field must be a valid Ed25519 signature over the transaction’s contents, and it must correspond to the given sender public key. The node must verify the signature using the sender’s public key. If the signature is invalid, the transaction is invalid and should be rejected.
• Field Formats: All transaction fields must adhere to the specified format. If any field is malformed (e.g. wrong length or containing non-alphanumeric characters), the trans- action is invalid.
When a transaction is rejected due to any of the above rules, the node should not include it in its pool or any future block. Only transactions that pass all validation steps are pooled.
3.2 Block Creation and Addition
Periodically, the node will create a new block containing a batch of pending transactions from its pool and attempt to append this block to the blockchain via the consensus protocol. A node proposes a new block when its transaction pool becomes non-empty (triggering an attempt to form. the next block).
Each block is a JSON object with the following keys (see Section 5.2 for full schema):
• index: The block’s position in the chain (1 for the genesis block, 2 for the first block after genesis, etc.).
• transactions: A list of transaction objects, each containing the fields sender, message, nonce and signature.
• previous_hash: The current_hash of the previous block in the chain (for the genesis block, a predefined value is used – see Appendix B).
• current_hash: The SHA-256 hash of this block’s content (computed over the string representation of the block’s other fields, with keys sorted lexicographically for consis- tency).
When a node creates a block proposal, it must set the index to (current blockchain length + 1), include a selection of transactions from its pool, and set previous_hash to the hash of its latest block. It then computes the current_hash for the new block. This block will then enter the consensus protocol to decide whether it will be the next block appended to the chain.
After a consensus round (see Section 8), the nodes will have agreed upon one block to com- mit. When a node decides on a block for a given round, it appends that block to its local blockchain. The node should then update its state accordingly:
• Execute/confirm the block’s transactions (e.g. increment nonce counters for senders).
• Remove the block’s transactions from the transaction pool (since they are now con- firmed on-chain).
• If any other pending transactions in the pool became invalid due to this new block, those should be removed as well. For example, if multiple transactions from the same sender were in the pool and one of them was included in the block, any other transaction from that sender with the same nonce or an out-of-order nonce should be dropped.
The blockchain should always remain valid: a decided block’s previous_hash must match the hash of the previously decided block. All nodes start with the same genesis block, so if the consensus protocol works correctly, every node’s chain will evolve identically over time (aside from any blocks that crashed nodes failed to append).
4 Network Protocols and Message Formats
All network communication between nodes uses a simple message protocol over TCP. Nodes exchange messages that are length-prefixed and encoded in JSON. Each message sent over the network is preceded by a 2-byte integer length field (unsigned, big-endian byte order) that specifies the number of bytes in the message body. The message body immediately follows the length field and consists of a JSON-encoded object. The maximum message length is 65535 bytes (0xFFFF), as the length is 16-bit. This framing allows the receiver to know how many bytes to read for each message from the continuous TCP stream.
The message body is a JSON object with two top-level keys: "type" and "payload". The "type" field is a string indicating the kind of message, and "payload" contains the data relevant to that message. There are two types of messages in this protocol:
• "transaction" — used to submit a transaction to a node.
• "values" — used during consensus to exchange proposed block values (hence “val- ues” refers to proposed blocks).
4.1 Transaction Message Format
A message with "type": "transaction" carries a single transaction to be processed. The payload in this case is a JSON object representing the transaction. The transaction JSON has the following keys:
• sender (string): The hex-encoded 32-byte Ed25519 public key of the sender (i.e. the account initiating the transaction). This is a 64-character lowercase hexadecimal string. This public key uniquely identifies the user/account.
• message (string): Arbitrary UTF-8 text authored by the sender. The consensus protocol treats this as opaque data. The message contains up to 70 alphanumeric characters and spaces.
• nonce (integer): A number indicating how many transactions the sender has already had confirmed in the blockchain before this one. The nonce starts at 0 for each sender’s first transaction, and increments by 1 with each confirmed transaction. This field is used to ensure each transaction is unique, and for a given sender (preventing replay attacks or double-spending).
• signature (string): The hex-encoded 64-byte Ed25519 signature of the transaction. This signature is generated by the sender’s private key over the contents of the transac- tion (the transaction fields sender, message and nonce concatenated in this order). It is 128 characters in hexadecimal. The signature proves the authenticity of the trans- action (that it was indeed created by the holder of the private key corresponding to sender).
When a node receives a "transaction" message, it should parse the payload as a trans- action dictionary and run the validation rules (nonce check, signature verification, etc.) as described in Section 2.1. After processing, the node should return a response to the sender (outside the scope of this message format, a simple True or False as mentioned). The original message does not itself demand a response in JSON format, but the sending function will expect a boolean acknowledgement.
Example — Transaction Message: Suppose a user with public key "a578 . . .5363b" wants to log the message "Never Gonna Give You Up" with nonce 0. The user signs the trans- action, producing signature "142e . . . 60c7fd0c". The network message sent to the node would have the following JSON body (length prefix omitted here for clarity of the output):
1 {
2 "type " : "transaction " ,
3 "payload " : {
4 "sender " : "
a57819938feb51bb3f923496c9dacde3e9f667b214a0fb1653b6bfc0f185363b " ,
5 "message " : "Never Gonna Give You Up " ,
6 "nonce " : 0 ,
7 "signature " : " 142e395895e0bf4e4a3a7c3aabf2 ..... (and so on) "
8 }
9 }
|
The first two bytes of the actual transmission would be the length of this JSON string in bytes (in big-endian). The node, upon receiving these bytes, will reconstruct the JSON and interpret it accordingly.
The node’s response to this message (over the same TCP connection) should be a standalone JSON value true (if accepted) or false (if rejected), or simply the literal True/False as a Python boolean serialised.
4.2 Block (Values) Message Format
During the consensus protocol, nodes exchange messages of "type": "values". These messages carry block proposals (or requests for them). The "payload" for a "values" message is a JSON structure that can vary depending on context:
• In our implementation, every "values" message carries the sender’s own current block proposal (a list containing its block, or an empty list if it has none yet). Send- ing your proposal thus implicitly requests that the peer reply with its own "values" message containing that peer’s proposal.
• When responding or broadcasting values, the payload is a list of block objects. Each block object in the list is a JSON dictionary with the fields described above (index, transactions, previous_hash, current_hash). In our consensus algorithm (a single round to decide one block), each node will have exactly one block proposal per round. Thus, you may often send or receive a "values" message where the payload list contains either one block (the peer’s proposal) or multiple blocks (if a node bundles all proposals it knows). In a fully synchronous broadcast approach, a node might send its own proposal to everyone (payload list of one), and also later send a list of all proposals it received (payload of many) — but our simplified approach mainly uses one exchange of single blocks.
Every block in the payload list should be formatted as valid JSON with its keys and values as described. For completeness, here’s a breakdown of a block object (these apply whether the block is sent in a values message or printed to stdout):
• index: (integer) Block’s position in chain.
• transactions: (array) List of transaction objects included in the block.
• previous_hash: (string) The hash of the previous block in hex.
• current_hash: (string) The hash of this block in hex.
Example — Values Request: Node A wants to request Node B’s proposed block for the current round. Node A sends a message to B:
1 { "type " : "values " , "payload " : [ Block_A ] }
|
and receives the following response:
1 { "type " : "values " , "payload " : [ Block_B ] }
|
This indicates a request. When a node receives a values request (empty list), it must respond with a values message whose payload is its own current proposal (or [] if it has none).
Example — Values Response (Single Block): Node B has a block proposal (say Block_B) ready. It responds to A with:
1 {
2 "type " : "values " ,
3 "payload " : [
4 {
5 "index " : 2 ,
6 "transactions " : [
7 {
8 "sender " : "b1c22f...5d8e " ,
9 "message " : "Touch grass " ,
10 "nonce " : 0 ,
11 "signature " : " 8f23ab...45d1 "
12 }
13 ] ,
14 "previous_hash " : " " ,
15 "current_hash " : " "
16 }
17 ]
18 }
Here, the payload list contains one block object (Node B’s proposal for block #2).
Example — Values Broadcast (Multiple Blocks): If Node A, after collecting proposals from B (and maybe others), decides to broadcast the full set, it might send:
1 {
2 "type " : "values " ,
3 "payload " : [
4 { ... Block_A ... } ,
5 { ... Block_B ... } ,
6 { ... Block_C ... } 7 ]
8 }
|
with each block from a different node. In all messages, JSON syntax must be strictly followed. That means:
• Keys and string values in quotes " ".
• Lists [ . . . ] and objects { . . . } with commas between elements.
• No trailing commas.
• True/False in JSON should be lowercase true/false if they appear.
• Numeric values (nonce, index) appear as numbers (no quotes). In summary, design your networking code such that:
• Before sending, you prepend the 2-byte length.
• You then send the JSON text of the message.
• When receiving, you first read 2 bytes to get N, then read N bytes to get the JSON message.