Recall the guidance on plagiarism in the course introduction: it applies to this assignment, and any evidence
of plagiarism will result in penalties ranging from loss of marks to suspension.
The dataset and breast cancer domain description in the Background section are from the assignment developed by
Peter Lucas, Institute for Computing and Information Sciences, Radboud Universiteit.
Introduction
In this assignment, you will develop some sub-routines in Python to implement operations on Bayesian Networks.
You will code an efficient independence test, learn parameters from complete data, and classify examples.
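As a rough illustration of what learning parameters from complete data can look like, the sketch below estimates one conditional probability table by relative frequency (maximum likelihood). It is not the required solution or the assignment's interface: the function name `learn_cpt`, the variable names `Age` and `BC`, and the four toy records are all hypothetical and made up for the example.

```python
from collections import Counter


def learn_cpt(records, child, parents):
    """Estimate P(child | parents) by relative frequency.

    Assumes complete data: every record has a value for every variable.
    `records` is a list of dicts mapping variable name -> observed value.
    """
    joint = Counter()   # counts of (parent values, child value)
    margin = Counter()  # counts of parent values alone
    for row in records:
        key = tuple(row[p] for p in parents)
        joint[(key, row[child])] += 1
        margin[key] += 1
    # Divide each joint count by the count of its parent configuration.
    return {(key, value): n / margin[key] for (key, value), n in joint.items()}


# Hypothetical toy data, NOT taken from the assignment's dataset.
records = [
    {"Age": ">=50", "BC": "yes"},
    {"Age": ">=50", "BC": "no"},
    {"Age": "<50", "BC": "no"},
    {"Age": "<50", "BC": "no"},
]

cpt = learn_cpt(records, child="BC", parents=["Age"])
for (parent_values, value), prob in sorted(cpt.items()):
    print(parent_values, value, prob)
```

Combinations that never occur in the data are simply absent from the returned dictionary; a real implementation would probably need to decide how to handle them, for example by smoothing the counts.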
We will use a Bayesian Network for diagnosis of breast cancer. We start with some background information about
the problem.
Background
Breast cancer is the most common form of cancer and the second leading cause of cancer death in women. One in
nine women will develop breast cancer in her lifetime. Although it is not possible to say exactly what causes
breast cancer, some factors may increase or change the risk of developing it. These include
age, genetic predisposition, history of breast cancer, breast density and lifestyle factors. Age, for example, is the
most significant risk factor for non-hereditary breast cancer: women aged 50 or older have a higher chance of
developing breast cancer than younger women. The presence of BRCA1/2 gene mutations leads to an increased risk
of developing breast cancer irrespective of other risk factors. Furthermore, breast characteristics, such as
high breast density, are determining factors for breast cancer.
The primary technique currently used to detect breast cancer is mammography, an X-ray image of the breast.
It is based on the differential absorption of X-rays between the various tissue components of the breast such as fat,
connective tissue, tumour tissue and calcifications. On a mammogram, radiologists can recognise breast cancer by
the presence of a focal mass, architectural distortion or microcalcifications. Masses are localised findings, generally
asymmetrical with respect to the other breast and distinct from the surrounding tissues. Masses on a mammogram are
characterised by several features, such as size, margin and shape, which help distinguish between malignant and
benign (non-cancerous) masses. For example, a mass with irregular shape and ill-defined margin is highly suspicious for
cancer, whereas a mass with round shape and well-defined margin is likely to be benign. Architectural distortion is
a focal disruption of the normal breast tissue pattern, which appears on a mammogram as a distortion in which
surrounding breast tissues appear to be “pulled inward” into a focal point, often leading to spiculation (star-like
structures).