Colton Blackwell

March 2024

scRNA-seq Analysis

We implemented the K-Means clustering algorithm and applied it to an scRNA-seq dataset of human pancreas tissue after first reducing its dimensionality with PCA. Using both random and K-Means++ initialization, we explored different numbers of clusters (k from 2 to 10) and used silhouette coefficients to identify the best clustering configuration. Finally, we visualized the clusters in two dimensions with scatter plots to aid interpretation of the results.
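The silhouette coefficient used to compare values of k can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `mean_silhouette` is hypothetical, Euclidean distance is assumed, and the full pairwise distance array it builds is only practical for small inputs.

```python
import numpy as np

def mean_silhouette(X, labels):
    """Mean silhouette coefficient over all points (hypothetical helper).

    Assumes Euclidean distance and at least two distinct cluster labels.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n, n). Memory grows as n^2 * d.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # common convention: a singleton cluster scores 0
        # a: mean distance to the other members of the point's own cluster
        a = dists[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(dists[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        if denom > 0:
            scores[i] = (b - a) / denom
    return scores.mean()

# Toy data: two tight pairs of points far apart.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = mean_silhouette(X_demo, [0, 0, 1, 1])  # pairs grouped correctly
bad = mean_silhouette(X_demo, [0, 1, 0, 1])   # pairs split across clusters
```

Values close to 1 indicate compact, well-separated clusters, so under this scheme the k with the highest mean score would be the one kept.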

Responsibilities

Implemented the K-Means clustering algorithm from scratch in Python, incorporating methods for centroid initialization, updating, and silhouette coefficient computation.
Processed and reduced the dimensionality of a single-cell RNA sequencing (scRNA-seq) dataset using PCA before applying the clustering algorithm.
Utilized K-Means++ initialization to improve clustering quality and evaluated clustering results using silhouette coefficients for different values of k.
Visualized clustering outcomes with scatter plots showing the clusters for the best k found under random initialization and under K-Means++ initialization.
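The centroid initialization and update steps listed above might look something like the sketch below. It is an illustrative NumPy version under stated assumptions, not the original implementation: the names `init_centroids` and `kmeans`, the `max_iter` and `seed` parameters, and the choice to leave an empty cluster's centroid unchanged are all assumptions.

```python
import numpy as np

def init_centroids(X, k, method="random", rng=None):
    """Pick k starting centroids from the rows of X (hypothetical helper)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    if method == "random":
        # Random initialization: k distinct data points chosen uniformly.
        return X[rng.choice(n, size=k, replace=False)].copy()
    # K-Means++: first centroid uniform at random, each later one sampled
    # with probability proportional to squared distance to the nearest
    # centroid chosen so far.
    centroids = [X[rng.integers(n)]]
    for _ in range(1, k):
        diffs = X[:, None, :] - np.array(centroids)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        total = d2.sum()
        probs = d2 / total if total > 0 else np.full(n, 1.0 / n)
        centroids.append(X[rng.choice(n, p=probs)])
    return np.array(centroids)

def assign(X, centroids):
    """Label each point with the index of its nearest centroid (Euclidean)."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, method="random", max_iter=100, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = init_centroids(X, k, method, rng)
    for _ in range(max_iter):
        labels = assign(X, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # assumption: empty clusters keep their centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return assign(X, centroids), centroids

# Toy data: two well-separated groups of four points each.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [100, 100], [100, 101], [101, 100], [101, 101]])
labels_pp, centers_pp = kmeans(X_demo, k=2, method="kmeans++")
labels_rand, centers_rand = kmeans(X_demo, k=2, method="random")
```

Running both initializations on the same data, as in the last two lines, mirrors the comparison described above; on real data the two can converge to different clusterings.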

Skills Developed

Enhanced Python proficiency covering syntax, data structures, and NumPy for efficient numerical operations.
Acquired hands-on experience in implementing K-Means clustering, focusing on centroid initialization, updates, and Euclidean distances.
Developed data preprocessing skills for scRNA-seq datasets.
Learned PCA for dimensionality reduction and applied clustering techniques to the reduced data.
Used Matplotlib for visualization and learned to interpret silhouette coefficients.
Improved code documentation and report preparation.
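For readers unfamiliar with the PCA step mentioned above, here is a minimal sketch using NumPy's SVD. The project does not say how PCA was computed (it may well have used Scanpy or scikit-learn), so the function name `pca_reduce` and the SVD-based approach are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (hypothetical helper).

    Returns the projected data and the variance explained by each
    retained component.
    """
    X = np.asarray(X, dtype=float)
    # Center each feature; PCA is defined on mean-centered data.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S[:n_components] ** 2) / (X.shape[0] - 1)
    return Xc @ components.T, explained_variance

# Synthetic stand-in for an expression matrix: 50 cells by 5 features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
X_2d, var_2d = pca_reduce(X_demo, n_components=2)
```

The reduced matrix (`X_2d` here) is what a clustering step would then operate on, and a two-component projection like this one is also convenient for scatter plots.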

Technologies

Python
- Matplotlib
- NumPy
- Scanpy
- scikit-learn
