# Valentine's Titanic EDA

As today is Valentine's Day and the Titanic movie will be watched thousands of times by couples around the world, I decided to do some Exploratory Data Analysis (EDA) for you, so that you can throw around fun facts while watching it. Have a great day, and happy Valentine's!

## Setting up the project

First of all, we need to do some imports:

```python
import matplotlib.pyplot as plt
import seaborn as sns  # really great module for graphs
import pandas as pd
```

Then we load the dataset and set up `sns`. As you can see, I already set the `index_col` and read some columns as categorical (and `Survived` as boolean), which mainly keeps the data compact in memory.

```python
# Shortened link, the data is taken from github datasciencedojo/datasets
csv_url = 'https://bit.ly/2U487zA'

# Load the CSV into a DataFrame, indexed by passenger
df = pd.read_csv(
    csv_url,
    index_col='PassengerId',
    dtype={'Sex': 'category', 'Pclass': 'category', 'Survived': 'bool'},
)

sns.set(color_codes=True)
```

Ok, now we are ready to explore the Titanic data!

## Exploring the data

1. What percentage of passengers survived the catastrophe, depending on sex and cabin class?

From the above picture we can see that:

• As a woman in first or second class, you had a greater chance of surviving than in third class,
• As a man, you were more likely to drown than to survive in any class,
• This is probably connected to the fact that women had “first go” at the lifeboats.
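If you want the numbers behind a plot like this, here is a minimal sketch of how they could be computed. The tiny `sample` table is made up purely for illustration (it is not the real data); it only borrows the `Sex`, `Pclass` and `Survived` column names from the dataset, and you would swap in the loaded DataFrame.

```python
import pandas as pd

# Made-up rows standing in for the real dataset (illustrative values only).
sample = pd.DataFrame({
    "Sex": ["female", "female", "female", "male", "male", "male"],
    "Pclass": [1, 3, 3, 1, 3, 3],
    "Survived": [True, True, False, False, False, True],
})

# The mean of a boolean column is the share of True values,
# so multiplying by 100 gives the survival percentage per group.
survival_pct = (
    sample.groupby(["Sex", "Pclass"])["Survived"]
    .mean()
    .mul(100)
    .unstack("Pclass")  # rows: sex, columns: cabin class
)
print(survival_pct)
```

A table shaped like this (sex in rows, class in columns) is also a convenient input for a grouped bar chart.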

2. Distribution of passenger status depending on their age and cabin class.

From the graph we can see a few things. First of all, the third class had the most passengers in it. Moreover, most of the passengers there were so-called “young adults”, which didn’t end too well for them, since this class had the highest mortality rate. Most children were in this class as well, in contrast to the first class, where only a handful of children were present.
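To put counts on statements like these, one option is to bucket ages into bands and count passengers per class. This is only a sketch: the rows are made up, the `Age` column name is assumed to match the dataset, and the cut points (12 and 35) are my own arbitrary choice of what counts as a “child” or a “young adult”.

```python
import pandas as pd

# Made-up rows standing in for the real dataset (illustrative values only).
sample = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3, 3],
    "Age": [45, 8, 30, 22, 25, 4, 6],
})

# Hypothetical age bands: (0, 12], (12, 35], (35, 120]
age_band = pd.cut(
    sample["Age"],
    bins=[0, 12, 35, 120],
    labels=["child", "young adult", "older adult"],
)

# Number of passengers in each class / age-band combination
counts = pd.crosstab(sample["Pclass"], age_band)
print(counts)
```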

3. Survival distribution with age.

The survival rate wasn’t really dependent on age, as the median is about the same in both cases. The whiskers aren’t that much different either… So no matter how old you were, you could end up surviving… or the opposite.
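The numbers a box plot draws (quartiles and the middle line) can also be read off directly with `describe()`. Again a rough sketch on made-up rows, assuming the `Age` and `Survived` column names; the values are chosen for illustration and say nothing about the real passengers.

```python
import pandas as pd

# Made-up rows standing in for the real dataset (illustrative values only).
sample = pd.DataFrame({
    "Survived": [True, True, True, False, False, False],
    "Age": [4, 28, 36, 20, 28, 60],
})

# One row per outcome (False first, then True) with count, mean, std,
# min, quartiles (25%, 50%, 75%) and max of the ages in that group.
age_summary = sample.groupby("Survived")["Age"].describe()
print(age_summary[["25%", "50%", "75%"]])
```

The `50%` column is the median, which is the line drawn inside each box.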

4. Did the price of the ticket matter?

Generally, the higher the ticket price, the higher the chances you had to survive, although it wasn’t a rule. It is worth noticing that the passengers who paid the two highest prices survived, as did the oldest person on the ship. Your chances were also higher if you were a child.
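A quick way to check a “pricier ticket, better odds” pattern without a plot is to split passengers into a cheaper and a pricier half and compare survival rates. The sketch below uses made-up rows and assumes the ticket price lives in a `Fare` column; splitting at the median is just one simple choice.

```python
import pandas as pd

# Made-up rows standing in for the real dataset (illustrative values only).
sample = pd.DataFrame({
    "Fare": [7, 8, 10, 13, 30, 80, 150, 500],
    "Survived": [False, False, True, False, True, False, True, True],
})

# Split the fares at the median into two equally sized groups
fare_half = pd.qcut(sample["Fare"], q=2, labels=["cheaper half", "pricier half"])

# Share of survivors in each half
survival_by_fare = sample.groupby(fare_half, observed=True)["Survived"].mean()
print(survival_by_fare)
```

Raising `q` (for example to 4 for quartiles) gives a finer picture, as long as the labels are adjusted to match.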

5. Correlations!
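A correlation matrix is the usual input for a heatmap like this. Here is a minimal sketch of computing one, on made-up rows with assumed `Age` and `Fare` columns. One thing to watch for: only numeric columns take part, so a boolean `Survived` has to be converted to 0/1 first, and a `Pclass` loaded as a category would need converting to a number as well (in this sample it is already an integer).

```python
import pandas as pd

# Made-up rows standing in for the real dataset (illustrative values only).
sample = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Survived": [True, True, False, False, True],
    "Pclass": [1, 1, 2, 3, 3],
    "Age": [40, 35, 30, 22, 5],
    "Fare": [80, 60, 20, 8, 7],
})

# Keep the numeric columns (this drops Name and the boolean Survived),
# then add Survived back as 0/1 so it is included in the matrix.
numeric = sample.select_dtypes(include="number").assign(
    Survived=sample["Survived"].astype(int)
)

# Pairwise Pearson correlation between all remaining columns
corr = numeric.corr()
print(corr.round(2))
```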

6. Kernel Density Estimation (KDE)
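In case KDE is new to you: it smooths a set of observations into a continuous curve by placing a small bell curve on every data point and averaging them. Plotting libraries do this for you, but a hand-rolled sketch shows the idea. The function name `gaussian_kde_1d`, the ages and the bandwidth of 5 are all made up for illustration.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Standard normal density at each distance
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average over samples and rescale so the curve integrates to 1
    return kernels.mean(axis=1) / bandwidth

ages = [22, 24, 25, 30, 31, 58]        # made-up ages
grid = np.linspace(-20, 100, 601)      # points where the curve is evaluated
density = gaussian_kde_1d(ages, grid, bandwidth=5.0)

# Sanity check: the area under a density curve should be close to 1
area = float((density * (grid[1] - grid[0])).sum())
print(round(area, 3))
```

A smaller bandwidth gives a spikier curve, a larger one a smoother curve.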

KDE for survivability of passengers:

KDE for cabin class of passengers:

KDE for sex of passengers:

7. What was the average ticket price for each class?
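The averages themselves are a one-liner with `groupby`. A minimal sketch on made-up rows, assuming the ticket price is stored in a `Fare` column:

```python
import pandas as pd

# Made-up rows standing in for the real dataset (illustrative values only).
sample = pd.DataFrame({
    "Pclass": [1, 1, 2, 2, 3, 3],
    "Fare": [80, 100, 20, 30, 7, 9],
})

# Mean ticket price within each cabin class
avg_fare = sample.groupby("Pclass")["Fare"].mean()
print(avg_fare)
```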

## Happy Valentine's!

I hope the EDA I have shown you will be useful, or at least that you will go “ooh” when thinking about the Titanic catastrophe. Have fun tonight, all of you!