KDD Cup 2008: Challenges

 

We propose two distinct but closely related challenges based on this data. For the test data, participants must return two files, one for each challenge.

 

1. The prevalence of malignant patients in a screening environment is extremely low: on average, only about 5-10 of every 1000 screening patients have breast cancer. Therefore, in the first challenge, entries will be judged by the area under the FROC curve in the clinically relevant region of 0.2-0.3 false positives per image. To support this, participants must return a file containing a confidence score (from -infinity to +infinity) for every candidate in the test set, indicating their classifier's confidence that the candidate is malignant. A score of +infinity corresponds to absolute confidence that the candidate is malignant, and a score of -infinity to absolute confidence that it is benign.

 

2. In the second challenge, the aim is to reduce radiologists' workload by asking them to read only the subset of cases that the algorithm deems at least somewhat unclear or suspicious. This challenge is therefore evaluated by the fraction of patients labeled as completely normal (not requiring radiologist review of their images), subject to the CAD algorithm achieving 100% sensitivity for malignant patients. CAD systems that fail to achieve 100% sensitivity will be disqualified from the challenge. To support this challenge, participants must return a second file with a binary decision for each patient in the test set: whether or not that patient should be reviewed by a radiologist.
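
As a rough illustration of how the two criteria above might be computed on labeled data, the following Python sketch may help participants sanity-check their own submissions. It is not the official evaluation code. The function names, argument layout, and return conventions are hypothetical. The sketch also assumes that sensitivity in the FROC curve is measured per patient (a malignant patient counts as detected once any one of her malignant candidates is accepted), that tied scores are processed in input order, that at least one malignant patient is present, and that the fraction in the second challenge is taken over all patients in the set.

```python
import numpy as np


def froc_partial_area(scores, is_malignant, patient_ids, n_images,
                      fp_lo=0.2, fp_hi=0.3):
    """Area under a FROC curve between fp_lo and fp_hi false positives per image.

    Assumptions (not from the challenge text): sensitivity is per patient,
    and tied scores are processed in input order.
    """
    scores = np.asarray(scores, dtype=float)
    is_malignant = np.asarray(is_malignant, dtype=bool)
    patient_ids = np.asarray(patient_ids)
    n_malignant_patients = len(set(patient_ids[is_malignant].tolist()))

    # Sweep the threshold from high to low, accepting one candidate at a time.
    order = np.argsort(-scores, kind="stable")
    fp_rates, sens = [0.0], [0.0]
    detected, false_pos = set(), 0
    for i in order:
        if is_malignant[i]:
            detected.add(patient_ids[i])
        else:
            false_pos += 1
        fp_rates.append(false_pos / n_images)
        sens.append(len(detected) / n_malignant_patients)
    fp_rates, sens = np.array(fp_rates), np.array(sens)

    # Integrate the resulting step function exactly over [fp_lo, fp_hi].
    inside = fp_rates[(fp_rates > fp_lo) & (fp_rates < fp_hi)]
    edges = np.unique(np.concatenate(([fp_lo, fp_hi], inside)))
    idx = np.searchsorted(fp_rates, edges[:-1], side="right") - 1
    return float(np.sum(np.diff(edges) * sens[idx]))


def rule_out_fraction(needs_review, patient_is_malignant):
    """Fraction of patients labeled normal, or None if a malignant patient is missed.

    Both arguments are dicts keyed by patient id. Returning None stands in for
    disqualification (sensitivity below 100%); this convention is an assumption.
    """
    for patient, malignant in patient_is_malignant.items():
        if malignant and not needs_review[patient]:
            return None
    ruled_out = sum(1 for p in patient_is_malignant if not needs_review[p])
    return ruled_out / len(patient_is_malignant)
```

Under these assumptions, the largest attainable value of `froc_partial_area` with the default bounds is 0.1 (sensitivity 1.0 across the whole 0.2-0.3 interval), so dividing the result by `fp_hi - fp_lo` gives a score between 0 and 1. For the second criterion, one simple way to produce the per-patient decisions would be to flag a patient for review whenever her highest candidate score exceeds a chosen threshold; the challenge text does not prescribe any particular method.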