Overview
This assignment has multiple goals:
• Implement a priority queue backed by a binary Max-heap
• Practice using Sets
• Implement a program using multiple interacting data structures
• Learn about performance optimization strategies including parallelization.
• Implement memoization for computationally expensive function calls.
• See the importance of the above performance optimization techniques.
You will implement a program that finds a Wikipedia link ladder between two given pages.
A Wikipedia link ladder is a list of Wikipedia links that one can follow to get from the start
page to the end page. It is based on the popular game WikiRace, which you can play online
here. The objective is to race to get to a target Wikipedia page by using links to travel from
page to page. The start and end page can be anything but are usually unrelated to make the
game harder.
Assignment
This assignment is broken up into several parts. Before looking at each part separately, it
is helpful to get a broad overview of the whole assignment. High-level pseudocode for the
algorithm is as follows:
To find a ladder from startPage to endPage:
    Make startPage the currentPage being processed.
    Get set of links on the currentPage.
    If endPage is one of the links on the currentPage:
        We are done! Return the path of links followed to get here.
    Otherwise visit each link on currentPage in an intelligent way and search
    each of those pages in a similar manner.
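
The pseudocode above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the required solution: it visits links in plain breadth-first order rather than in the "intelligent" order the assignment asks for, and the names LadderSketch, findLadder, and linksOf are hypothetical. The linksOf function stands in for "fetch the page and scrape its links", so the sketch runs without any network access.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;

class LadderSketch {
    // Hypothetical helper: linksOf maps a page name to the set of links on that page.
    // Returns the ladder from startPage to endPage, or an empty list if none is found.
    static List<String> findLadder(String startPage, String endPage,
                                   Function<String, Set<String>> linksOf) {
        // cameFrom remembers which page led us to each discovered page,
        // so the path can be rebuilt once endPage is found.
        Map<String, String> cameFrom = new HashMap<>();
        cameFrom.put(startPage, null);

        // Assumption: a plain FIFO queue (breadth-first order). The assignment
        // itself calls for a smarter ordering than this.
        Queue<String> toVisit = new ArrayDeque<>();
        toVisit.add(startPage);

        while (!toVisit.isEmpty()) {
            String currentPage = toVisit.remove();
            Set<String> links = linksOf.apply(currentPage);

            if (links.contains(endPage)) {
                // We are done: walk backwards from currentPage to startPage.
                List<String> path = new ArrayList<>();
                path.add(endPage);
                for (String p = currentPage; p != null; p = cameFrom.get(p)) {
                    path.add(p);
                }
                Collections.reverse(path);
                return path;
            }

            // Otherwise queue every link that has not been seen yet.
            for (String link : links) {
                if (!cameFrom.containsKey(link)) {
                    cameFrom.put(link, currentPage);
                    toVisit.add(link);
                }
            }
        }
        return List.of(); // no ladder exists in the pages we could reach
    }
}
```

Passing the link lookup in as a function keeps the search logic separate from page fetching, which makes it easy to try the search on a small hand-built map of pages before touching real Wikipedia pages.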
Part One: Internet Connectivity
For this assignment we have provided some starter code. Part of this assignment requires
fetching web pages in order to look at the HTML and find the various links on a page.
You are responsible for testing this on your machine by Wednesday 11/13. We will not help,
debug, or even look at issues with this code after that date. One of the benefits of Java is that
it should work regardless of what kind of computer it is running on, but as this is the first
time this assignment has been given in Java, it is important to test this crucial aspect of the
assignment. I will not detail how to test this part of the assignment. That is your job. Read
through the starter code to understand which method is responsible for fetching the HTML, and
test the inputs and outputs of this method to make sure they work as expected. This part tests
code reading, understanding, and testing. The only hint I would like to mention is that most, if
not all, web browsers allow you to see the HTML of any webpage.
Part Two: Scraping the HTML to find links
Congratulations! You have already finished this part. This is what you did for the drill.
But note the function name has changed, since you need to do a few other tasks before
finding the actual wikiLinks. Fill in the private static Set<String> scrapeHTML(String
html) method with the solution you came up with for the drill. As a reminder, this function
goes through a String of the HTML of a webpage and returns a Set<String> which contains
the names of all of the valid wikiLinks found in that HTML. For our purposes, a valid link is
any link that:
• is of the form "/wiki/PAGENAME"
• has a PAGENAME that does not contain either of the disallowed characters '#' or ':'
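
As a minimal sketch of the two rules above, the check on a single link string could look like the following. The class name WikiLinkRules and the method isValidWikiLink are hypothetical and are not part of the starter code. The sketch also assumes that an empty PAGENAME is invalid, which the rules above do not state. It deliberately leaves out the work of locating links inside the HTML.

```java
class WikiLinkRules {
    private static final String PREFIX = "/wiki/";

    // Hypothetical helper: returns true if link has the form "/wiki/PAGENAME"
    // and PAGENAME contains neither '#' nor ':'.
    static boolean isValidWikiLink(String link) {
        if (!link.startsWith(PREFIX)) {
            return false; // not of the form "/wiki/PAGENAME"
        }
        String pageName = link.substring(PREFIX.length());
        // Assumption: an empty page name is treated as invalid.
        return !pageName.isEmpty()
                && !pageName.contains("#")
                && !pageName.contains(":");
    }
}
```

For instance, under these rules "/wiki/Java" is accepted, while "/wiki/Help:Contents" and "/wiki/Java#History" are rejected because of the disallowed characters.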
You have examples in the drill testing code. Another example follows the brief HTML description
below.