Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
Interfaces and Exceptions
The Caesar cipher
The Caesar cipher is an ancient method of encrypting text, i.e. attempting to transform text into a
format that is unreadable by others.
The Caesar cipher is a ‘rotation cipher’ and operates by translating each letter into the one that is
shied along the alphabet by a fixed distance. This distance is called the shi. It is the same for all
letters in the alphabet and therefore can be seen as the secret key to encrypt and decrypt: To encrypt
your text using a given shi, you translate letters by that many places later in the alphabet. A Caesar
cipher with shi 3 can be illustrated as follows.
For example, if your text to encrypt is ‘Meet me at midnight under the bridge’ and your shi is 3, the
encrypted text is ‘Phhw ph dw plgqljkw xqghu wkh eulgjh’, as the letter ‘b’ gets translated into an ‘e’,
and ‘e’ gets translated into ‘h’ and so on. We ‘wrap around’ at the end of the alphabet, so a ‘z’ gets
changed to a ‘c’ given a shi of 3. We can interpret a negative value for the shi as translating letters
backwards (e.g. an ‘f’ gets encrypted as the letter ‘b’ if the shi is 4).
It is believed that Julius Caesar actually used such a cipher for his correspondence. Unfortunately for
him, this type of cipher is easily broken using a frequency analysis method. This method is outlined
below and your assignment is to implement it in Java.
Cracking the Caesar cipher.
Suppose we are given a cipher text, i.e. text that has already been encrypted with some (unknown)
shi, and we want to determine the original unencrypted text (typically referred to as the plaintext).
To reconstruct the original text we could decode with each of the 26 possible shis, and take the result
that looks ‘closest’ to an English sentence.
How do we measure ‘closeness’? This is where letter frequencies and a small statistical trick comes
in. First of all, we know how oen each letter occurs on average in English text. For instance, ‘e’ is the
most common letter, then ‘t’, and so on. To decode our cipher text, we can compute the frequencies of
each letter as they appear in the text. To measure how ‘close’ these are to the known English letter
frequencies, we use the χ2-score (that is the Greek letter ‘chi’, so it’s the ‘chi-squared score’). This score
is defined as
χ2 =Xz
α=a
(freqα Englishα)2
Englishα
where freqα denotes the frequency of a letter α (the number of occurrences divided by the total number
of letters in the text), and Englishα
is the corresponding English letter frequency. In other words, we
sum the fraction over all 26 possible letters to determine this score.
The χ2
score will be lower when the frequencies are closer to English. Note that when we do this, we
are ignoring the case of letters (we want to treat upper and lower case equally for our purposes).
We will be using the following known frequencies for the letters in English, given here in a Java-style
array arranged in the usual alphabetical order (see frequencies.java, you’re welcome).
1 double[] freq = {0.0855, 0.0160, 0.0316, 0.0387, 0.1210, 0.0218,
2 0.0209, 0.0496, 0.0733, 0.0022, 0.0081, 0.0421,
3 0.0253, 0.0717, 0.0747, 0.0207, 0.0010, 0.0633,
4 0.0673, 0.0894, 0.0268, 0.0106, 0.0183, 0.0019,
5 0.0172, 0.0011};
Requirements
For this exercise you will develop two programs. The first one rotates a given plain text for a given shi.
The second one will crack a given cipher text and display the original plain text on the screen.
a. (15%)Write a Java Interfacefor rotation ciphers. The interface should be calledRotationCipher
and declare the following public methods.
rotate, which should take a string s and an integer n as parameters and return the string
s rotated by shi n.
decipher, that translates a (cipher text) string to a (plain text) string.
frequencies, which computes the letter frequencies in a given string (as size-26 arrays of
double, as displayed above).
chiSquared, which returns the χ2
-score (a double) for two given sets of frequencies
b. (25%) Implement your interface in a class called Caesar. Note the hints and comments below
to help you in case you’re lost.
You may assume that the original plain text is in English and that the cipher text has been
produced using a Caesar cipher by some (unknown) shi. This shi has only been applied to
letters, not anything else that may appear in the text, so spaces are spaces, any punctuation is
le untouched, etc. Lower case letters are transformed to lower case letters. When computing
letter frequencies you need not distinguish between lower case and capital letters.
c. (15%) Write an application called Rotate, which rotates a given (plain text) string by a given
shi. This should make use of your Caesar class from part b). The shi as well as the input
string should be read as the first and second command line parameters, respectively. The (only!)
output should be the rotated string in case all goes well. See below for sample outputs.
$ java Rotate 3 "The ships hung in the sky in much the same way that bricks don't."
Wkh vklsv kxqj lq wkh vnb lq pxfk wkh vdph zdb wkdw eulfnv grq'w.
$ java Rotate -13 "The ships hung in the sky in much the same way that bricks don't."
Gur fuvcf uhat va gur fxl va zhpu gur fnzr jnl gung oevpxf qba'g
$ java Rotate 13 The ships hung in the sky in much the same way that bricks don't.
Too many parameters!
Usage: java Rotate n "cipher text"
$ java Rotate 13
Too few parameters!
Usage: java Rotate n "cipher text"
d. (15%) Write an application called BreakCaesar, which finds out and prints a plain text message
for a given cipher text string. As in part c), the input string should be read as (the only) command
line parameter. Sample output below.
$ java BreakCaesar "Vg vf n zvfgnxr gb guvax lbh pna fbyir nal znwbe ceboyrzf whfg jvgu
cbgngbrf."
It is a mistake to think you can solve any major problems just with potatoes.
$ java BreakCaesar
Too few parameters!
Usage: java BreakCaesar "cipher text"
$ java BreakCaesar Too Many Parameters
Too many parameters!
Usage: java BreakCaesar "cipher text"
e. (10%) Draw an UML class diagram that includes all classes and interfaces that you’ve written.
This should be part of your report.
f. (10%) Document your code (i.e., all interfaces, classes and methods) using javadoc comments.
Generate HTML documentation and put the generated files in a separate directory called ‘docs’.
g. (5%) Caesar would have written in Latin instead of English. What would we do dierently if we
know the language we’re examining isn’t English but some other language?
h. (5%) Suppose we (somehow) know that the person doing the encryption uses one shi value
for lower case letters, and a dierent shi value for upper case letters. What would we have
to do dierently? How would that aect our calculations, or how would we have to alter our
program/calculations to account for this?
Parts g) and h) are bonus questions and require no code to be written. Just comment on these questions
in your report.