My name is Richard Truong-Chau, and I am currently a data science consultant at Maven Wave. I am passionate about applying machine learning to uncover meaningful insights for businesses and society.
This repository tracks my progress in data science and helps me organize my goals and projects. Feel free to get in touch if you want to collaborate or to discuss my qualifications.
| Description | Datasets | Notes | Start Date | Last Update |
|---|---|---|---|---|
| Aircraft Wildlife Strikes | Wildlife Strikes from 1990-2015 | Explore Pandas and visualization techniques | 05.20.2018 | 05.22.2018 |
| Analyzing 911 Calls | 911 CSV | Pivot Tables and Timestamps | 06.11.2018 | 06.13.2018 |
| Financial Crisis | Data from Jan. 2006 to Jan. 2016 | Pandas’ DataReader | 06.13.2018 | 06.15.2018 |
| Data Science For Good: NYC Specialized High School Exam | Census Data & SHSAT Registration | Interactive Plots (Plotly) | 06.27.2018 | 07.05.2018 |
This section focuses on machine learning projects that help me put new techniques and theories into practice.
| Algorithm | Description | Datasets | Notes | Start Date | Last Update |
|---|---|---|---|---|---|
| Logistic Regression & Random Forest | Fatal Encounters with Police | US Census Data (4 files) and Fatal Encounters | Feature Engineering and visualizations | 05.31.2018 | 06.27.2018 |
| Linear Regression | Website or Mobile Apps | Ecommerce data | - | 06.20.2018 | 06.20.2018 |
| Logistic Regression | Ad click | Generated Advertisement Dataset | Heatmaps for Null Values and Get Dummies | 06.21.2018 | 06.22.2018 |
| K Nearest Neighbors | Predict Target Class | Generated Dataset | StandardScaler, Elbow Method | 06.22.2018 | 06.23.2018 |
| Decision Tree & Random Forest | LendingClub Loans | LendingClub data from 2007-2010 | - | 06.25.2018 | 06.27.2018 |
| Natural Language Processing | Yelp Review | Yelp Review Ratings | - | 06.28.2018 | 07.05.2018 |
| PCA & Random Forest | Predict Costa Rican Poverty Levels | Kaggle | - | 07.20.2018 | 07.26.2018 |
| Random Forest & Logistic Regression | Predict Monetary Damage | NOAA Severe Earthquake Dataset | - | 12.06.2018 | 12.12.2018 |
There is no general theme for the projects in this section. It is a showcase of my research and of new tools that I wanted to test.
| Goal | Description | Datasets | Notes | Start Date | Last Update |
|---|---|---|---|---|---|
| Characterize Mutations | Bioinformatics Research | Entire Salmonella Genome | Regular Expressions | 08.23.2018 | 05.03.2018 |
| Design, create, and implement a database | Biological Database | Files in PDF format | - | 04.20.2018 | 05.09.2018 |
| Web Scraping | Scrape IMDb Top 250 Movies | - | BeautifulSoup, requests | 05.25.2018 | 06.02.2018 |
| Clothing Recommendation | Clothing classification and recommendation | Scraped data | Flask, Docker, Word Embeddings and Image Processing | 12.26.2019 | 12.30.2019 |
Interesting sources that I used along the journey.
I welcome any inquiries or collaborations.