Welcome to M.A.S.P.
(Morbus Alzheimerii Sistēma Prognōstica)


A state-of-the-art prediction system that determines if a person has Alzheimer's disease


Date Published: June 30, 2023

Why Predict Alzheimer's Disease?

This is a good place to start learning about your situation with respect to Alzheimer's disease. The system uses several highly accurate AI models to estimate whether or not you have Alzheimer's. It was trained on this Kaggle dataset. It takes inputs about your demographics and brain features (age, estimated total intracranial volume, atlas scaling factor, etc.) and outputs a prediction that you either have Alzheimer's or do not.


Alzheimer's disease is a progressive disease in which brain cell connections and the cells themselves degenerate and die, eventually destroying memory and other important mental functions. Memory loss and confusion are the main symptoms. No cure exists, but medications and management strategies may temporarily improve symptoms. When Alzheimer's is diagnosed early on, treatments are more likely to be effective!

Some notable statistics about Alzheimer's disease:

  • More than 6 million Americans of all ages have Alzheimer's.
  • About 1 in 9 people age 65 and older has Alzheimer's.
  • Almost 2 out of 3 Americans with Alzheimer's are women.
  • Deaths from Alzheimer’s more than doubled between 2000 and 2019.
  • Source: Alzheimer's Association

Resources for Additional Information:

Brain

Exploration


Pie Chart

This pie chart shows that the dataset contains more nondemented persons than demented ones.


MMSE vs. Age Scatter Plot

This scatter plot shows that people with an MMSE score of 25 or below are more likely to have Alzheimer's, whereas patients scoring above 25 are more likely to be Nondemented.


Age and Group Histogram

This histogram shows that most Alzheimer's disease cases appear in people aged 65 through 85.


3D Scatter Plot

This 3D scatter plot shows that, as estimated total intracranial volume (eTIV) goes up, atlas scaling factor (ASF) goes down, and vice versa. Also, normalized whole brain volume (nWBV) ranges from approximately 0.65 to 0.83.


Violin Graph

This violin graph shows that, in this dataset, every person with an MMSE value lower than 25 has Alzheimer's disease. In addition, every person with a CDR value of 1 or more has Alzheimer's disease, although a person can have a lower value and still be diagnosed with the disease. Conversely, no person with a CDR score of 0 has Alzheimer's disease. A CDR value of 0.5 is ambiguous: some people with that value are diagnosed with the disease and others are not.


Heatmap

This heatmap shows the correlations between all the variables used by the ML models to predict whether a patient has Alzheimer's disease or not. The closer a number is to 1, the more strongly the two variables rise and fall together. The closer a number is to -1, the more strongly one variable falls as the other rises. Readers should pay special attention to the 'Outcome' variable, which records whether a patient has Alzheimer's disease or not. Clinical dementia rating (CDR) and mini mental state examination (MMSE) seem to be the features most strongly related to whether or not a person has Alzheimer's.
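To make the numbers in the heatmap more concrete, here is a minimal sketch of how one cell could be computed. It assumes the heatmap uses the Pearson correlation coefficient, which the page does not state, and the values below are made-up toy data, not rows from the Kaggle dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy values for illustration only.
cdr = [0.0, 0.0, 0.5, 0.5, 1.0, 2.0]
mmse = [30, 29, 27, 25, 21, 16]
outcome = [0, 0, 0, 1, 1, 1]  # 1 = has Alzheimer's, 0 = does not

r_cdr = pearson(cdr, outcome)    # positive: higher CDR goes with Outcome = 1
r_mmse = pearson(mmse, outcome)  # negative: higher MMSE goes with Outcome = 0
print(round(r_cdr, 2), round(r_mmse, 2))
```

In this toy example CDR correlates positively with Outcome and MMSE correlates negatively, which is the kind of pattern the heatmap is meant to reveal.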

Machine Learning Models

We tried several machine learning models, with varying results:

    K-Nearest Neighbors (KNN):
    How It Works

    KNN works by finding the distances between a new data point and old data points. Depending on what the closest old data points are classified as, the new point will get classified accordingly. The number of old data points it looks at depends on the specified number K. This model had 64% accuracy.

    KNN

    KNN Results
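The KNN procedure described above can be sketched in a few lines. This is an illustrative sketch, not the team's code: the function name `knn_predict`, the choice of Euclidean distance, and the toy (age, MMSE) values are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points."""
    # Pair each training point's distance to the new point with its label.
    distances = sorted(
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: (age, MMSE) pairs, not taken from the dataset.
train_points = [(70, 29), (68, 30), (75, 28), (80, 20), (77, 18), (82, 22)]
train_labels = ["Nondemented"] * 3 + ["Demented"] * 3

print(knn_predict(train_points, train_labels, (79, 21), k=3))
print(knn_predict(train_points, train_labels, (69, 29), k=3))
```

In practice the features would usually be scaled first, because a feature with a large numeric range can otherwise dominate the distance.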


    Support Vector Classifiers (SVC):
    How It Works

    SVC maps the data to a higher dimensional space and then finds the hyperplane with the largest margin, that is, the greatest distance to the nearest data points of each class. This model had 98% accuracy.

    SVC

    SVC Results
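The two ideas behind SVC, mapping to a higher dimensional space and choosing the boundary with the largest margin, can be illustrated with a toy example. This is a deliberately simplified sketch: a real SVC solves an optimization problem over all hyperplanes, whereas this code only searches a small grid of horizontal lines in the mapped space. The feature map, the data, and all names are assumptions made for illustration.

```python
def feature_map(x):
    """Map a 1-D value to 2-D: (x, x squared)."""
    return (x, x * x)

def margin(threshold, points, labels):
    """Smallest signed distance from any point to the line x2 = threshold.

    A negative result means at least one point lies on the wrong side.
    """
    return min(label * (p[1] - threshold) for p, label in zip(points, labels))

# Toy 1-D data: class +1 sits on both sides of class -1,
# so no single cut point on the number line separates them.
xs = [-3.0, -1.0, 0.0, 1.0, 3.0]
labels = [1, -1, -1, -1, 1]
mapped = [feature_map(x) for x in xs]

# Search candidate boundaries x2 = c and keep the one with the largest margin.
candidates = [c / 2 for c in range(0, 21)]  # 0.0, 0.5, ..., 10.0
best_threshold = max(candidates, key=lambda c: margin(c, mapped, labels))
best_margin = margin(best_threshold, mapped, labels)

def predict(x):
    """Classify a new 1-D value using the boundary found in the mapped space."""
    return 1 if feature_map(x)[1] > best_threshold else -1

print(best_threshold, best_margin, predict(2.5), predict(0.5))
```

After the mapping, the two classes can be separated by a straight line even though they could not be separated in the original one-dimensional space.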


    Random Forest Classifier (RFC):
    How It Works

    RFC creates many new datasets by randomly sampling from the training data. It creates a decision tree from each new dataset. It then collects votes from each decision tree for which category the new data point should belong in. The data point is placed in whichever category receives the most votes. This model had 100% accuracy.

    RFC

    RFC Results
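The sample, train, and vote steps described above can be sketched as follows. To keep the example short, each "tree" here is a one-split decision stump trained on a single randomly chosen feature, which is a simplification of the full decision trees a real random forest uses. The resampling uses bootstrap sampling (drawing with replacement), which is an assumption about how the datasets are generated. All names and the toy (MMSE, nWBV) values are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Fit a one-split decision tree (a stump) on a single feature."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        feature = rng.randrange(len(rows[0]))  # random feature for this stump
        forest.append(train_stump(sample_rows, sample_labels, feature))
    return forest

def forest_predict(forest, row):
    """Collect one vote per stump and return the majority category."""
    votes = Counter()
    for feature, threshold, left_label, right_label in forest:
        votes[left_label if row[feature] <= threshold else right_label] += 1
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (MMSE, nWBV) pairs, not taken from the dataset.
rows = [(29, 0.78), (30, 0.80), (28, 0.76), (29, 0.79),
        (20, 0.68), (18, 0.66), (22, 0.70), (21, 0.67)]
labels = ["Nondemented"] * 4 + ["Demented"] * 4

forest = train_forest(rows, labels)
print(forest_predict(forest, (30, 0.81)))
print(forest_predict(forest, (17, 0.65)))
```

Because each stump sees a slightly different sample, individual stumps may disagree, and the majority vote smooths out those differences.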


    Logistic Regression Classifier (LRC):
    How It Works

    LRC makes predictions based on the sigmoid function, which is an S-shaped curve that maps any number to a value between 0 and 1. Although the model returns a probability, the final output is a label assigned by comparing that probability with a threshold, which makes it a classification algorithm. This model had 98% accuracy.


    LRC Results
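The probability-then-threshold step described above can be sketched as follows. The weight and bias below are hypothetical numbers picked so that the probability crosses 0.5 at an MMSE score of 25. They are not the values learned by the team's model, and the single-feature setup is an assumption made for brevity.

```python
import math

def sigmoid(z):
    """S-shaped curve that maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one data point."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label by comparing it with a threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical parameters: one feature (MMSE), probability 0.5 at MMSE = 25.
weights = [-0.8]
bias = 20.0

for mmse in (29, 25, 20):
    print(mmse, round(predict_proba([mmse], weights, bias), 3),
          predict_label([mmse], weights, bias))
```

Raising the threshold above 0.5 would make the classifier more conservative about assigning the positive label, and lowering it would do the opposite.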


Conclusions

We were not expecting our models to be so accurate when predicting whether or not a person had Alzheimer's based on our training data. We were pleasantly surprised to see such high accuracies, and further hyperparameter tuning helped make our models more generally applicable to other data. RFC ended up being our best machine learning model for this classification system, so we used that model for the backend of our Alzheimer's Prediction Form. However, it was interesting that SVC and LRC were nearly as accurate. These models may have ended up being highly accurate because our dataset was small (only 317 instances to train and test on). Demographic and brain features (such as estimated total intracranial volume and normalized whole brain volume) proved to be highly effective in determining whether people have Alzheimer's disease.

Such AI detection systems are easy to use and can help patients seek treatments early -- before Alzheimer's disease causes serious harm.

Conclusion Scores

Meet the Team!

Each M.A.S.P. team member brought a diverse skill set and actively contributed to every role: product manager, data scientist, machine learning specialist, and web designer. Responsibilities were shared, so there are no specified roles. The team ultimately used effective collaboration and communication skills to reach a final product that satisfies them all.

Chetan Khairnar

A 22-year-old master's student from Texas, currently pursuing a master's in Computer Science with the aim of contributing to the field of Data Science

Jacob Hanson

A 17-year-old high school student from Wisconsin who is currently dedicated to his studies with the aim of pursuing a future career as an Information Security and Data Analytics professional

Isaiah Johnson

A 15-year-old high school student from Wisconsin whose main interests reside in computer science, including data analytics and game design

Ashni Kumar

A 17-year-old high school student from Pennsylvania who is interested in exploring computer science fields and their applications

Heath Fry

A 17-year-old high school student from North Carolina who is diligently pursuing his studies with the aim of securing a career in the field of data science

Francisco Apraiz

A 16-year-old Argentinian high school student now living in Miami, Florida. Francisco aspires to study computer science and finance and loves to play tennis