Improve Machine Learning Model Performance with Hyper-parameter Tuning

Photo by Arvin Mantilla on Unsplash

Are you worried that your logistic regression is not performing well on your data?

Or that you are missing something when you implement logistic regression?

Or do you simply want to improve the performance of your logistic regression?

Don't worry, you are in the right place.

We will cover the following topics:

  1. Implement logistic regression with some arbitrary parameters.
  2. Check the model's accuracy with those parameters.
  3. Try to improve the accuracy of the logistic regression using hyper-parameter tuning.
  4. After applying hyper-parameter tuning, check the accuracy once again.

First of all, download the dataset from this link: https://github.com/puneet166/ML_project/blob/master/titanic/FileName.csv

This dataset is already cleaned, so there is no need for preprocessing, feature engineering, or feature extraction.

We will take a quick look at the dataset right after loading it in Step 1.

So here we go -

Step 1-

import pandas as pd
import numpy as np

da = pd.read_csv('FileName.csv')

Import the necessary libraries and load the dataset.
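A quick way to inspect the loaded data (a small sketch using the da DataFrame from Step 1):

print(da.shape)    # number of rows and columns in the cleaned Titanic data
print(da.head())   # first five rows, with 'Survived' as the target column
print(da.columns)  # all column names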

Step 2-

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
da[['Age', 'Fare']] = scaler.fit_transform(da[['Age', 'Fare']].values)

Perform min-max feature scaling on the numeric columns Age and Fare.

After scaling, both columns lie in the [0, 1] range.
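A quick sanity check (not part of the original steps) to confirm that the scaling worked:

# After MinMaxScaler, each scaled column should have minimum 0 and maximum 1
print(da[['Age', 'Fare']].min())
print(da[['Age', 'Fare']].max())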

Step 3-

x = da['Survived']   # target: whether the passenger survived
y = da.iloc[:, 1:8]  # features: the remaining columns

Divide the dataset into the dependent variable (the target) and the independent variables (the features) for further processing. Note that here x holds the target and y holds the features; the same naming is used in the train/test split later.

Step 4-

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

Import all the libraries needed to implement logistic regression, split the data into train and test sets, and check the accuracy of the model.

Step 5-

ourmodel = LogisticRegression(C=0.01, solver='liblinear')

Initialize logistic regression with some arbitrary parameters: C = 0.01 and solver = 'liblinear'. (C is the inverse of the regularization strength, so a small C means strong regularization.)

Step 6-

X_train, X_test, y_train, y_test = train_test_split(y, x, test_size=0.2, random_state=0)  # y holds the features and x the target (see Step 3)

Split the dataset into train and test sets: 80% for training and 20% for testing, with random_state=0 for reproducibility.

Step 7-

ourmodel.fit(X_train, y_train)

Fit the model on the training data.

Step 8-

y_pred = ourmodel.predict(X_test)  # predictions on the held-out test set

Generate predictions for the test dataset; next we measure the accuracy of the model.

Step 9-

accuracy = metrics.accuracy_score(y_test, y_pred)
print('Accuracy: {:.2f}'.format(accuracy))

Our model gives about 66% accuracy, which is not good.

So how can we improve the performance of our model?

To improve the model's performance, we will apply hyper-parameter tuning to the logistic regression, this time using grid search.

Step 10-

If you are not familiar with grid search, see this link: https://towardsdatascience.com/random-search-vs-grid-search-for-hyperparameter-optimization-345e1422899d

from sklearn.model_selection import GridSearchCV

param_grid = {'C': [10, 100, 1.0, 0.1, 0.01],
              'solver': ['newton-cg', 'lbfgs', 'liblinear'],
              'penalty': ['l2']}

grid = GridSearchCV(LogisticRegression(), param_grid, refit=True, verbose=3)
grid.fit(X_train, y_train)

Import grid search to find the best parameters for our model.

Initialize the grid search with:

The first argument: the machine learning model (estimator) you want to tune.

The second argument: a dictionary of candidate parameter values. Grid search trains and evaluates the model on every combination of these values and then reports the best-scoring combination.

Setting refit=True means that, after the search, the model is refit on the whole training set using the best parameters found.

After initializing the grid search, we fit it on the training data.
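If you want to see how every combination scored, the search results are available through GridSearchCV's cv_results_ attribute; a small sketch:

# Summarize the grid search results, best-ranked combination first
results = pd.DataFrame(grid.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']]
      .sort_values('rank_test_score'))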

Step 11-

print(grid.best_params_)

grid.best_params_ gives the best parameter combination found for our model.

When we initialize and train the model once again with these parameters (as sketched below), our model gives about 83% accuracy.
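A minimal sketch of that final step, assuming the parameters are the ones reported by grid.best_params_ on your run; since refit=True, the refit model is also available directly as grid.best_estimator_:

# Retrain logistic regression with the parameters found by the grid search
best_model = LogisticRegression(**grid.best_params_)
best_model.fit(X_train, y_train)

y_pred_tuned = best_model.predict(X_test)
print('Tuned accuracy: {:.2f}'.format(metrics.accuracy_score(y_test, y_pred_tuned)))

# Equivalent shortcut: grid.best_estimator_ is already refit on the training data
print('Grid accuracy: {:.2f}'.format(grid.best_estimator_.score(X_test, y_test)))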
