Distributed file system coursework
Course: COMP2207
1 Introduction
In this coursework you will build a distributed storage system, drawing on your knowledge
of Java, networking and distributed systems. The system has one Controller and N Data
Stores (Dstores). It supports multiple concurrent clients sending store, load, list and
remove requests. Each file is replicated R times over different Dstores. The Dstores store
the files; the Controller orchestrates client requests and maintains an index recording
which Dstores hold each file and the size of each stored file. Clients transfer file
content directly to and from the Dstores rather than through the Controller, which keeps
the Controller from becoming a bottleneck and lets the system scale. For simplicity all
these processes will run on the same machine, but the principles are the same as for a
system distributed over several servers. Files in the distributed storage are not organised
in folders and sub-folders. Filenames do not contain spaces.
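As an illustration only, the index kept by the Controller could be sketched as below. The class name FileIndex, its Entry type and its method names are assumptions made for this example and are not part of the specification; a real index would also need to track states such as “store in progress”.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical index: filename -> (file size, ports of the Dstores holding the file).
// All names here are illustrative assumptions, not taken from the specification.
class FileIndex {
    static final class Entry {
        final long size;
        final List<Integer> dstorePorts;

        Entry(long size, List<Integer> dstorePorts) {
            this.size = size;
            // Defensive copy so later changes by the caller do not affect the index.
            this.dstorePorts = Collections.unmodifiableList(new ArrayList<>(dstorePorts));
        }
    }

    // ConcurrentHashMap because several client connections may touch the index at once.
    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    // Returns false if the filename is already present, so the caller can report an error.
    boolean add(String filename, long size, List<Integer> dstorePorts) {
        return entries.putIfAbsent(filename, new Entry(size, dstorePorts)) == null;
    }

    // Returns the entry for the file, or null if the file is not in the index.
    Entry lookup(String filename) {
        return entries.get(filename);
    }

    // Returns true if the file was present and has been removed.
    boolean remove(String filename) {
        return entries.remove(filename) != null;
    }
}
```

Using putIfAbsent makes the "does it already exist?" check and the insertion a single atomic step, which matters when two clients try to store the same filename concurrently.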
The Controller is started first, with R as an argument. It waits for Dstores to join the
datastore (see Rebalance operation). The Controller does not serve any client request
until at least R Dstores have joined the system.
As Dstores may fail and be restarted, and new Dstores can join the datastore at runtime,
rebalance operations are required to make sure each file is replicated R times and files
are distributed evenly over the Dstores.
2 Networking
The Controller, Dstores and clients communicate with each other via TCP connections.
Because they may all run on the same machine for testing, each Dstore listens on a
different port.
3 Code development
Only use Java openjdk-14-jdk-headless, on Linux/Unix. Do not use Windows. The code
must be testable and not depend on any IDE directory structure/config files.
Command line parameters to start up the system:
Controller: java Controller cport R timeout rebalance_period
A Dstore: java Dstore port cport timeout file_folder
A client: java Client cport timeout
The Controller is given a port to listen on (cport), a replication factor (R), a
timeout in milliseconds (timeout) and how long to wait, in milliseconds, before starting
the next rebalance operation (rebalance_period).
The Dstore is started with the port to listen on (port), the Controller’s port to talk to
(cport), a timeout in milliseconds (timeout) and the local folder in which to store its data
(file_folder). Each Dstore should use a different folder and port so that they do not clash.
The client is started with the controller port to communicate with it (cport) and a timeout
in milliseconds (timeout).
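As a minimal sketch, the Controller's four parameters could be read from the command line as shown below. The class name ControllerConfig and its field names are assumptions made for this example; only the order of the arguments comes from the usage line above.

```java
// Hypothetical holder for the Controller's command-line parameters.
// The class and field names are illustrative assumptions.
final class ControllerConfig {
    final int cport;
    final int replicationFactor;
    final int timeoutMs;
    final int rebalancePeriodMs;

    private ControllerConfig(int cport, int replicationFactor, int timeoutMs, int rebalancePeriodMs) {
        this.cport = cport;
        this.replicationFactor = replicationFactor;
        this.timeoutMs = timeoutMs;
        this.rebalancePeriodMs = rebalancePeriodMs;
    }

    // Expects the arguments in the order: cport R timeout rebalance_period
    static ControllerConfig parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                    "usage: java Controller cport R timeout rebalance_period");
        }
        try {
            return new ControllerConfig(
                    Integer.parseInt(args[0]),
                    Integer.parseInt(args[1]),
                    Integer.parseInt(args[2]),
                    Integer.parseInt(args[3]));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("all Controller arguments must be integers", e);
        }
    }
}
```

A Controller's main method could call ControllerConfig.parse(args) first and exit with the usage message if it throws. The Dstore and client parameters can be handled in the same way.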
Store operation
• Client -> Controller: STORE filename
• Controller
  o Updates index, “store in progress”
  o Selects R Dstores, their endpoints are port1, port2, …, portR
  o Controller -> Client: STORE_TO port1 port2 … portR
• For each Dstore i
  o Client -> Dstore i: STORE filename filesize
  o Dstore i -> Client: ACK
  o Client -> Dstore i: file_content
  o Once Dstore i finishes storing the file,
    Dstore i -> Controller: STORE_ACK filename
• Once the Controller has received all acks
  o Updates index, “store complete”
  o Controller -> Client: STORE_COMPLETE
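The Controller's part of the exchange above could be sketched as follows. The class name StoreRequests and its methods are assumptions made for this example. The specification does not say how the R Dstores are selected; choosing those that currently hold the fewest files is an assumed policy, picked here because it helps keep files evenly distributed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for the Controller's side of a store operation.
// Names and the selection policy are illustrative assumptions.
class StoreRequests {
    // Extracts the filename from "STORE filename"; returns null if the message is malformed.
    static String parseStore(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2 || !parts[0].equals("STORE")) {
            return null;
        }
        return parts[1];
    }

    // Picks the r Dstores holding the fewest files (assumed policy); ties go to the lower port.
    static List<Integer> selectDstores(Map<Integer, Integer> fileCountByPort, int r) {
        if (fileCountByPort.size() < r) {
            throw new IllegalStateException("fewer than R Dstores have joined");
        }
        List<Integer> ports = new ArrayList<>(fileCountByPort.keySet());
        ports.sort(Comparator.comparingInt((Integer p) -> fileCountByPort.get(p))
                .thenComparingInt(p -> p));
        return new ArrayList<>(ports.subList(0, r));
    }

    // Builds the reply "STORE_TO port1 port2 ... portR".
    static String storeTo(List<Integer> ports) {
        StringBuilder sb = new StringBuilder("STORE_TO");
        for (int port : ports) {
            sb.append(' ').append(port);
        }
        return sb.toString();
    }
}
```

For example, with Dstores on ports 4001, 4002 and 4003 holding 5, 0 and 2 files and R = 2, this sketch selects 4002 and 4003 and replies STORE_TO 4002 4003.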
Failure Handling
• Malformed message received by Controller
  o Log the error and continue
• Malformed message received by Client
  o Log the error and continue
• Malformed message received by Dstore
  o Log the error and continue
• If filename already exists in the index
  o Controller -> Client: ERROR ALREADY_EXISTS filename
• Client cannot connect or send data to all R Dstores, or the Controller does not
  receive R acks within a timeout
  o Log the error and continue; future rebalances will sort things out
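One possible way for the Controller to wait for R acks within the timeout is a CountDownLatch, sketched below. The class name PendingStore and its methods are assumptions made for this example. The sketch only counts acks and does not check which Dstore sent each one, which a fuller implementation would probably want to do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical tracker for a single in-progress store; names are illustrative assumptions.
class PendingStore {
    private final CountDownLatch acks;

    PendingStore(int r) {
        this.acks = new CountDownLatch(r);
    }

    // Called by the thread handling a Dstore connection when STORE_ACK arrives for this file.
    void onStoreAck() {
        acks.countDown();
    }

    // Returns true if all R acks arrived within timeoutMs.
    // A false result corresponds to the "log the error and continue" case.
    boolean awaitAll(long timeoutMs) {
        try {
            return acks.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status and treat the store as not completed.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The thread serving the client would create a PendingStore after sending STORE_TO, call awaitAll with the configured timeout, and send STORE_COMPLETE only if it returns true.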