Diabetes Prediction using Machine Learning from Kaggle


This video walks through implementing diabetes prediction with machine learning, using a dataset taken from Kaggle.

Please subscribe and support the channel.

github url: https://github.com/krishnaik06/Diabetes-Prediction

Data Science Projects playlist: https://www.youtube.com/watch?v=5Txi0nHIe0o&list=PLZoTAELRMXVNUcr7osiU7CCm8hcaqSzGw

NLP playlist: https://www.youtube.com/watch?v=6ZVf1jnEKGI&list=PLZoTAELRMXVMdJ5sqbCK2LiM0HhQVWNzm

Statistics Playlist: https://www.youtube.com/watch?v=GGZfVeZs_v4&list=PLZoTAELRMXVMhVyr3Ri9IQ-t5QPBtxzJO

Feature Engineering playlist: https://www.youtube.com/watch?v=NgoLMsaZ4HU&list=PLZoTAELRMXVPwYGE2PXD3x0bfKnR0cJjN

Computer Vision playlist: https://www.youtube.com/watch?v=mT34_yu5pbg&list=PLZoTAELRMXVOIBRx0andphYJ7iakSg3Lk

Data Science Interview Question playlist: https://www.youtube.com/watch?v=820Qr4BH0YM&list=PLZoTAELRMXVPkl7oRvzyNnyj1HS4wt2K-

You can buy my book on Finance with Machine Learning and Deep Learning at the URL below.

amazon url: https://www.amazon.in/Hands-Python-Finance-implementing-strategies/dp/1789346371/ref=sr_1_1?keywords=krish+naik&qid=1560943725&s=gateway&sr=8-1


louer le seigneur says:

Thanks Buddy

Binod Pratap Singh says:

Hey Krish, why is my accuracy better than yours even though I used the same code line by line? (13:08)
Score on imputed data:

[[129  15]
 [ 33  54]]
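
As a rough sketch of how an accuracy figure follows from a confusion matrix like the one quoted above, the snippet below derives accuracy, precision and recall from the four counts. It assumes the scikit-learn layout (rows are true classes, columns are predicted classes), which the comment does not state.

```python
# Confusion matrix quoted in the comment above.
# Assumption: scikit-learn layout, rows = true class, columns = predicted class.
cm = [[129, 15],
      [33, 54]]

tn, fp = cm[0]
fn, tp = cm[1]
total = tn + fp + fn + tp

accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # of the predicted positives, how many were correct
recall = tp / (tp + fn)     # of the actual positives, how many were found

print(f"samples={total} accuracy={accuracy:.4f} "
      f"precision={precision:.4f} recall={recall:.4f}")
```

Small differences in accuracy between two runs of the same code can come from a different random train/test split or a different library version, so a gap of a few points is not unusual.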


Binod Pratap Singh says:

With Random Forest I got 74% accuracy compared to your 71%. I tuned the hyperparameters with grid search.
How do we know when we have reached peak performance, i.e. that we can't predict any more accurately than our current result?

RandomForestClassifier(max_depth=25, max_leaf_nodes=5,
                       min_samples_leaf=2, n_estimators=10))])

The mean accuracy of the model is: 0.7402597402597403
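
A minimal sketch of the kind of grid search described above. The data here is a synthetic stand-in with the same shape as the Pima dataset (768 rows, 8 features), and the pipeline step names and parameter grid are illustrative assumptions, not the commenter's actual setup. There is no direct signal that a model has hit its ceiling; one common check is whether the cross-validated and held-out scores stop improving as the grid is widened.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): same shape as the Pima dataset,
# but NOT the real data.
X, y = make_classification(n_samples=768, n_features=8,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Hypothetical pipeline: mean imputation followed by a random forest.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("rf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the "rf__" prefix routes each parameter to the "rf" step.
param_grid = {
    "rf__n_estimators": [10, 50],
    "rf__max_depth": [5, 25],
    "rf__min_samples_leaf": [1, 2],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print("held-out accuracy:", search.score(X_test, y_test))
```

If the held-out accuracy is well below the cross-validated one, the search has probably overfit the training folds rather than found a genuinely better model.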

Susanth Susanth says:

Can I use it for my project? (Probability Paper)

Ayoush Das says:

How can we differentiate between diabetes types using the Pima Indians dataset?

Zaveria Mutwalli says:

Why did you choose the random forest algorithm for this? I am new to this, so I want to know the reason for using this algorithm rather than another one.
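
One hedged way to approach the question of which algorithm to use is to compare several candidates under the same cross-validation. The sketch below uses synthetic stand-in data (an assumption, not the Kaggle file) and two arbitrary candidates; on real data the ranking may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 768 rows, 8 features.
X, y = make_classification(n_samples=768, n_features=8,
                           n_informative=5, random_state=0)

# Candidate models, chosen only for illustration.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score every candidate with the same 5-fold cross-validation.
mean_scores = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    mean_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

A simpler model that scores about the same is often preferable because it trains faster and is easier to explain.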

Pawan Jakke says:

When I run this code:

import seaborn as sns
import matplotlib.pyplot as plt

# get correlations of each feature in the dataset
corrmat = data.corr()
top_corr_features = corrmat.index

# plot heat map

I get this error:
OSError                                   Traceback (most recent call last)
<ipython-input-5-dfccaed977f4> in <module>
----> 1 import seaborn as sns
      2 import matplotlib.pyplot as plt
      4 #get correlations of each features in dataset
      5 corrmat = data.corr()

E:\Anaconda\lib\site-packages\seaborn\__init__.py in <module>
      1 # Import seaborn objects
----> 2 from .rcmod import *  # noqa: F401,F403
      3 from .utils import *  # noqa: F401,F403
      4 from .palettes import *  # noqa: F401,F403
      5 from .relational import *  # noqa: F401,F403

E:\Anaconda\lib\site-packages\seaborn\rcmod.py in <module>
      5 import matplotlib as mpl
      6 from cycler import cycler
----> 7 from . import palettes

E:\Anaconda\lib\site-packages\seaborn\palettes.py in <module>
      7 from .external import husl
----> 9 from .utils import desaturate, get_color_cycle
     10 from .colors import xkcd_rgb, crayons

E:\Anaconda\lib\site-packages\seaborn\utils.py in <module>
      9 import numpy as np
---> 10 from scipy import stats
     11 import pandas as pd
     12 import matplotlib as mpl

~\AppData\Roaming\Python\Python38\site-packages\scipy\__init__.py in <module>
    105 # Allow distributors to run custom init code
--> 106 from . import _distributor_init
    108 _all_ += num.__all_

~\AppData\Roaming\Python\Python38\site-packages\scipy\_distributor_init.py in <module>
     24 if os.path.isdir(libs_path):
     25 for filename in glob.glob(os.path.join(libs_path, '*dll')):
---> 26 WinDLL(os.path.abspath(filename))

E:\Anaconda\lib\ctypes\__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    380 if handle is None:
--> 381 self._handle = _dlopen(self._name, mode)
    382 else:
    383 self._handle = handle

OSError: [WinError 193] %1 is not a valid Win32 application

Kindly help me
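
In the traceback above, seaborn is loaded from the Anaconda install while scipy is loaded from a per-user Python38 site-packages folder. WinError 193 often (though not always) means a DLL built for a different architecture than the running interpreter is being loaded, so a mixed 32-bit/64-bit install is one possible cause. The stdlib-only sketch below reports the interpreter's bitness and where scipy would be imported from; it is a diagnostic aid under that assumption, not a confirmed fix.

```python
import importlib.util
import platform
import struct
import sys

# Pointer size in bits: 32 or 64 for the running interpreter.
bits = struct.calcsize("P") * 8
print("interpreter:", sys.executable)
print("interpreter bitness:", bits, platform.architecture()[0])

# Locate scipy without importing it, so this works even when the import fails.
spec = importlib.util.find_spec("scipy")
scipy_location = spec.origin if spec is not None else None
print("scipy would be loaded from:", scipy_location)
```

If the reported scipy location is outside the active environment, reinstalling scipy inside that environment (and removing the stray per-user copy) is a reasonable thing to try.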

Tuna Özateş says:

For those who are watching in 2021, use "from sklearn.impute import SimpleImputer" instead of "from sklearn.preprocessing import Imputer".
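
A small sketch of the replacement import in use. The array values are made up, and treating 0 as the missing-value marker is an assumption about how the video handles the Pima columns. `SimpleImputer` always imputes column-wise, so the old `axis` argument has no equivalent.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy rows (hypothetical values): columns stand in for glucose and blood pressure.
X = np.array([
    [148.0, 72.0],
    [0.0, 66.0],
    [183.0, 0.0],
])

# Assumption: a 0 marks a missing measurement and is replaced by the column mean
# of the non-missing values.
imputer = SimpleImputer(missing_values=0, strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)
```

To avoid leaking test information, the imputer would normally be fitted on the training split and then applied to the test split with `transform`.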

Orator Adda says:

I have my own dataset with approximately 15 parameters and want to write a paper, but I am not familiar with data science. Please suggest how to proceed.

Niraj Chaudhari says:

I used logistic regression for this and got an accuracy of 0.76663, and fitting takes almost no time.

Shervin the prodigy says:

I have a doubt: the video has an independent variable named skin, but the actual dataset does not contain it. Why is that? Am I looking at the wrong dataset?

dc09kaa says:

How do I find the diabetes prediction function?


4:17 I think that if the label or class is in the form of a number, it is called regression. Is that so, sir?
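
On the question above: a numeric label does not by itself make a task regression. What matters is whether the number is a category code (classification) or a measured quantity (regression). As a hedged illustration with made-up values, scikit-learn's `type_of_target` helper reports how it interprets a label array:

```python
from sklearn.utils.multiclass import type_of_target

# Made-up example labels.
outcome = [0, 1, 1, 0, 1]        # diabetic or not: numbers used as category codes
glucose = [148.5, 85.2, 183.1]   # a measured quantity

outcome_kind = type_of_target(outcome)
glucose_kind = type_of_target(glucose)
print(outcome_kind, glucose_kind)
```

The diabetes outcome column is the first kind, so predicting it is a classification problem even though the labels are 0 and 1.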

ganesh basalel says:

Could you please teach Hadoop MapReduce k-means clustering (H-KC)?


Requesting a few more projects in the healthcare domain.

Code2Create - C2C says:

Hi sir, I am Likhitha, currently 11 years old.
I am getting an error that says "No module named 'xgboost'". Please help me solve this error.
Thank you.
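
"No module named 'xgboost'" usually means the package is not installed in the environment the notebook is running in. The sketch below checks for it without importing it and prints a pip command tied to the current interpreter; `is_installed` is a hypothetical helper name, and conda users may prefer their own installer.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("xgboost"):
    print("xgboost is available")
else:
    print("xgboost is missing; try running:")
    print(f'  "{sys.executable}" -m pip install xgboost')
```

Using the interpreter path from `sys.executable` helps make sure the package is installed into the same environment that the notebook uses.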

dilip yadav says:

Sir, can you give any reason why we used the xgboost algorithm to improve accuracy, and not another algorithm?

Mayur Johri says:

Can you make a video on handling high cardinality in a feature?

Ritesh Jain says:

Is there anyone who imported SimpleImputer instead of Imputer?
