CatBoost vs XGBoost - Quick Intro and Modeling Basics
- Some Python and Modeling experience/interest
XGBoost has long been one of the most powerful boosted models available, and now CatBoost has arrived as a challenger. Let’s explore how CatBoost compares to XGBoost using Python, and try it on both a classification dataset and a regression one. Let’s have some fun!
We’re going to start by unleashing XGBoost and CatBoost on an independent version of the Titanic data set: the ship’s manifest of those who did and didn’t survive the sinking in the North Atlantic Ocean. The Titanic sank in 1912 after hitting an iceberg on its maiden voyage to New York. You have probably already used this data set, as it is extremely predictive: basically, women, children and the rich mostly survived, while men and the poor mostly didn’t.
In the second part, we’ll model both classification and regression, using the Titanic data for classification and the Boston housing data for regression. I’ll also introduce you to a cool tool for quick EDAs: Pandas Profiling.
Please go out and use this model in a Kaggle competition. Get an account if you haven’t already, and experiment: sometimes follow the rules, sometimes don’t. Remember that data science is very new, so we’re still inventing things as we go, and these new models let us explore a little further each time!
Who this course is for:
- Beginning data scientists