CS 179: Introduction to Graphical Models
Homework 5 (Spring 2020)

As a simple model of noise, suppose that whenever we transmit (or store and retrieve) a bit, there is an independent probability p of an error, i.e., if we stored a zero we read a one, and vice versa. (This is called the “binary symmetric channel” model.)

Repetition codes. The simplest possible error correcting code is a repetition code, in which we simply transmit multiple copies of the same bit. If we mean to send a 0, we send “000”, and for a 1 we send “111”. Then, if one bit is transmitted incorrectly, we can still recover the correct value. The original “0” is called the data bit(s), while the transmitted “000” is called the codeword. Note that any error correction forces us to send extra bits; the ratio of the data length to the codeword length is called the rate, so this is a rate-1/3 repetition code. It is also easy to see that we can correct one error (e.g., decode 010 → 0) by a simple majority vote, but cannot correct two errors (e.g., 110 → 1, since it is more probable that this is 111 with a single bit error).

Hamming codes. A more sophisticated approach uses more carefully chosen parity bits to add redundancy. The (7,4) Hamming code (4 data bits encoded into 7 code bits) maps, for example,

    0000 → 0000000
    1000 → 1110000
    0001 → 1101001
    1001 → 0011001

where the digits in positions 1, 2, and 4 (reading left to right) are the redundant parity-check bits. See the table at https://en.wikipedia.org/wiki/Hamming(7,4) for the full code, and an elegant Venn diagram illustration of how the data bits and parity-check bits are related. For the purposes of our problem, it is sufficient to know that (a) we transmit 7 bits for any block of 4 data bits, and (b) if zero or one bits are in error, we can recover from (correct for) the error, while errors in two or more bits will cause at least one error in the four data bits.

Problem 1: Probability

Let us analytically evaluate the probability of making an error using either no code, a repetition code, or a Hamming code under various settings. We will send “blocks” of size B = 4 bits, and wish to recover these blocks error-free. For the purposes of this problem, let the per-bit error probability be p = 0.05.

(a) Compute the probability that, with no encoding at all, we will make at least one mistake in sending a block of four bits.

(b) Now compute the probability that, with a rate-1/3 repetition code, we will make at least one error in the block of four bits. Note that we are now sending 12 bits: 3 copies of each of the four data bits. A single error within any of these four sets of 3 bits can be recovered from, but two errors cannot.

(c) Now compute the probability that, with a Hamming (7,4) code, we will make at least one error in the four bits. (Again, we send 7 bits, and decode all 4 correctly if we have at most one error in the 7.)

You should find that the repetition code is slightly better in terms of the probability of error; however, this ignores the fact that it must transmit almost twice as many bits as the Hamming code. To make the comparison fairer, let us suppose that we are willing to use rate 1/2, but no lower: we may use 8 bits to send a block of size 4, but no more. Then our repetition code can only afford to repeat two of the bits, and must send the other two unencoded.
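Before continuing, if you want to sanity-check your answers to (a)-(c), the binomial reasoning above can be evaluated directly. The following is a minimal Python sketch (the variable names are ours, not the template's):

    from math import comb

    p = 0.05  # per-bit error probability on the binary symmetric channel

    # (a) No encoding: the 4-bit block is correct only if no bit flips.
    p_err_none = 1 - (1 - p)**4

    # (b) Rate-1/3 repetition: each data bit decodes correctly if at most
    #     1 of its 3 copies flips; the block needs all four bits correct.
    p_bit_ok = sum(comb(3, k) * p**k * (1 - p)**(3 - k) for k in range(2))
    p_err_rep = 1 - p_bit_ok**4

    # (c) Hamming (7,4): all 4 data bits decode correctly if at most 1 of
    #     the 7 transmitted bits flips.
    p_block_ok = sum(comb(7, k) * p**k * (1 - p)**(7 - k) for k in range(2))
    p_err_ham = 1 - p_block_ok

    print(p_err_none, p_err_rep, p_err_ham)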
(d) Compute the probability of making at least one error if we use a repetition code for two of the bits, and send the other two unencoded. (So: a single error within the first 3 bits or within the second 3 bits is recoverable, but any error in the last two bits is not.)

The difference between these two codes, intuitively, is that the repetition code introduces redundancy for individual bits, while the Hamming code introduces redundancy that spans several bits. By distributing the redundancy better across the data bits, we can improve our ability to correct for (rare) errors.

Problem 2: Simulations

Let us next set up a basic framework for encoding a data block, simulating bit errors, and then decoding and checking for correctness. First, we will use it to confirm the Hamming result from the previous problem.

In the homework template, you will find a function encode(D, checks) that takes a data block D and a set of parity checks checks, and returns an encoded block T. Parity checks for a (7,4) Hamming code are also provided; for example, T4 equals the parity (XOR) of D0, D1, and D2, and so on. Finally, a simple greedy_decode function can decode the codeword block. (It is written slightly more generally than necessary.)

(a) Using the provided functions, encode the data [0,1,1,0], and verify that the resulting codeword decodes correctly.

(b) Try decoding after introducing a single bit error, and again after introducing a second bit error. Comment on your results.

(c) Finally, a small simulation loop (also provided) will estimate the probability of an error in blocks of size B = 4. Try this, and verify that it matches your analytic solution from Problem 1. (A self-contained sketch of such a loop appears at the end of this assignment.)

Problem 3: Encoding Larger Blocks

Elegant, hand-designed codes like the Hamming code are used in many systems; compact discs, for example, use a class of codes called Reed-Solomon codes. But for many systems, graphical models provide the best error correction techniques. Convolutional codes operate as finite state machines and are decoded using dynamic programming on a Markov chain; here, though, we will look at one of the best types of codes for large blocks: LDPC (low-density parity check) codes and loopy graphical models.

(a) Suppose that we have a large block (say, B = 100) that we wish to transmit without error. If we encode it with 25 Hamming (7,4) codes, we will have an error if any of these 25 codes introduces an error. Making use of your result from Problem 1(c), compute (analytically) the probability of this happening.

(b) Code provided in the template will generate a parity check array for this encoding. Run your simulation procedure on this (larger) code and verify that your estimate matches your analytic solution.

LDPC codes use many parity checks randomly distributed through the data block in order to create a large amount of interconnected redundancy. To ensure that every bit has some redundancy, we will generate one check bit for each data bit, and connect it to two other randomly selected data bits. You can use the provided parity check generator to create a random LDPC code.
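For reference, here is the minimal, self-contained simulation sketch mentioned in Problem 2(c). It uses a rate-1/3 repetition code with majority-vote decoding rather than the template's encode and greedy_decode (whose exact interfaces are not reproduced here), so the estimate it prints should match your answer to Problem 1(b):

    import numpy as np

    rng = np.random.default_rng(0)
    p, B, n_trials = 0.05, 4, 100_000   # bit error rate, block size, trials

    n_errors = 0
    for _ in range(n_trials):
        data = rng.integers(0, 2, size=B)        # random 4-bit data block
        codeword = np.repeat(data, 3)            # rate-1/3 repetition code
        flips = rng.random(codeword.size) < p    # binary symmetric channel
        received = codeword ^ flips.astype(codeword.dtype)
        # majority vote over each group of 3 copies
        decoded = (received.reshape(B, 3).sum(axis=1) >= 2).astype(codeword.dtype)
        n_errors += int(not np.array_equal(decoded, data))  # any bit wrong?

    print("estimated block error probability:", n_errors / n_trials)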
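Similarly, the analytic computation in Problem 3(a) follows the same pattern as Problem 1: the 100-bit block is correct only if all 25 independent Hamming blocks decode correctly. A short check, reusing the Problem 1(c) quantity (again a sketch, with our own variable names):

    p = 0.05
    p_err_ham = 1 - ((1 - p)**7 + 7 * p * (1 - p)**6)  # Problem 1(c): one 7-bit block
    p_err_100 = 1 - (1 - p_err_ham)**25                # any of 25 blocks failing
    print("P(error in 100-bit block):", p_err_100)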