Photo by Morgan Housel on Unsplash

Analyzing continuous data points using a box plot.

Suppose we take the example of fruit frequencies.

The frequency distribution is:

Frequency Distribution

First of all, arrange the values in sorted order.

Sorted List

1- Find the median / middle value [the median is always found on the sorted data]

Median of List

The median of the fruit data distribution above is 33.5.
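The sorting and median steps above can be sketched in a few lines of Python. The `frequencies` list is hypothetical: the article's actual table is only shown as an image, so these values were simply chosen to give the same median of 33.5.

```python
from statistics import median

# Hypothetical fruit frequencies (NOT the article's actual table),
# chosen so that the median matches the 33.5 quoted above.
frequencies = [57, 21, 34, 15, 66, 33, 18, 60, 27, 45]

# The median is defined on sorted data, so sort first.
sorted_freq = sorted(frequencies)
n = len(sorted_freq)

# With an even number of values, the median is the mean of the two middle values.
middle_left = sorted_freq[n // 2 - 1]
middle_right = sorted_freq[n // 2]
manual_median = (middle_left + middle_right) / 2

print(sorted_freq)
print(manual_median, median(frequencies))  # both approaches should agree
```

The standard library's `statistics.median` sorts internally, so the manual version is only there to show the "middle value" idea explicitly.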

data Distribution

50% of the data lies to the left of the median and 50% lies to the right of it.

Median is 33.5

data Distribution

2- Now, find the first quartile (Q1) and the third quartile (Q3).

quartiles

Above, 21 is the first quartile, Q1.

57 is the third quartile, Q3.

This means:

quartiles

25% of the data in the distribution lies to the left of 21.

75% of the data in the distribution lies to the right of 21.

In other words:

25% of the data lies to the left of Q1 (Q1 = 21).

75% of the data lies to the right of Q1 (Q1 = 21).

75% of the data lies to the left of Q3 (Q3 = 57).

25% of the data lies to the right of Q3 (Q3 = 57).
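One common way to get the quartiles, sketched below, is to split the sorted data at the median and take the median of each half. This is an assumption about the convention: other conventions (for example the interpolation used by default in many libraries) can give slightly different values. The helper name `quartiles` and the data are hypothetical, chosen to reproduce Q1 = 21 and Q3 = 57.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # values to the left of the median
    upper = data[(n + 1) // 2 :]  # values to the right (middle value skipped when n is odd)
    return median(lower), median(data), median(upper)

# Hypothetical frequencies, not the article's actual table.
frequencies = [15, 18, 21, 27, 33, 34, 45, 57, 60, 66]

q1, q2, q3 = quartiles(frequencies)
print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3)
```

With small samples the "25% / 75%" split is only approximate, which is why the choice of convention matters more for short lists than for large datasets.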

Data Distribution

3- Find the min and max

According to the box plot analysis:

MIN=15

Q1=21

Q2=33.5 [Median]

Q3=57

IQR = Q3 - Q1 = 57 - 21 = 36
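The values listed above make up the usual five-number summary behind a box plot. A minimal sketch of collecting them in one place follows; the `FiveNumberSummary` type, the `summarize` helper, and the data (including the maximum of 66) are hypothetical and only illustrate the idea, again assuming the median-of-halves convention for the quartiles.

```python
from statistics import median
from typing import NamedTuple

class FiveNumberSummary(NamedTuple):
    minimum: float
    q1: float
    q2: float  # the median
    q3: float
    maximum: float

    @property
    def iqr(self):
        # Interquartile range: the height of the box in a box plot.
        return self.q3 - self.q1

def summarize(values):
    """Build the five-number summary using median-of-halves quartiles."""
    data = sorted(values)
    n = len(data)
    lower, upper = data[: n // 2], data[(n + 1) // 2 :]
    return FiveNumberSummary(data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical frequencies, not the article's actual table.
summary = summarize([15, 18, 21, 27, 33, 34, 45, 57, 60, 66])
print(summary)
print("IQR =", summary.iqr)
```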

A rough estimate:

Roughly estimate

4- Analyze the plot and the data.

  • The median is 28. This means 50% of people have a height less than 28 and 50% have a height greater than 28.
  • Q3 (quartile 3) is 33. This means 75% of people have a height less than 33 and 25% have a height greater than 33.
  • Q1 (quartile 1) is 19. This means 25% of people have a height less than 19 and 75% have a height greater than 19.
  • The size of the box is called the IQR (interquartile range).
  • Points above the max value (the upper whisker) are considered outliers.
  • Points below the min value (the lower whisker) are considered outliers.

5- How to find the outliers and the max and min values.

To do this, we calculate an upper fence and a lower fence. Anything within the fences is not considered an outlier, and anything beyond a fence is an outlier.

Calculate the upper fence and the lower fence.

Q1=19

Q3=33

IQR=Q3-Q1

IQR=33-19=14

Upper fence = Q3+1.5(IQR)

33+1.5(14)=54

Lower fence = Q1-1.5(IQR)

19-1.5(14)=-2

This means anything above 54 is considered an outlier and anything below -2 is considered an outlier.
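The fence calculation can be written as a small helper. This is a sketch under stated assumptions: the function name `iqr_fences` and the `observations` list are hypothetical, while Q1 = 19, Q3 = 33 and the 1.5 multiplier come from the text above.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles taken from the example above.
lower_fence, upper_fence = iqr_fences(19, 33)
print("lower fence:", lower_fence, "upper fence:", upper_fence)

# Hypothetical observations, only to show how the fences are applied.
observations = [-5, 3, 19, 25, 28, 31, 33, 40, 54, 58]

# A point is an outlier only if it lies strictly beyond a fence.
outliers = [x for x in observations if x < lower_fence or x > upper_fence]
print("outliers:", outliers)
```

A value sitting exactly on a fence is treated as inside it here; the text does not say how ties should be handled, so that choice is an assumption.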

If you have any doubts about this, please feel free to ask in the comments below.

You can also ask about this tutorial on LinkedIn: http://linkedin.com/in/puneet166

GitHub workspace link: https://github.com/puneet166?tab=repositories

Data Science, Machine Learning, BlockChain Developer

Puneet Singh
