### Supervised Learning

Let’s look at the two task of supervised Learning

- Regression
- Classification

*SUPERVISED LEARNING = DATA SET CONTAINING TRAINING EXAMPLES WITH ASSOCIATED CORRECT LABELS*

Supervised Learning is a function that maps an input to an output based on example input-output pairs. It infers a function from labeled **training** data consisting of a set of **training** examples.

For example, learning to classify handwritten digits.

- Train your data set to such extent that it becomes easy to find out the result of unknown inputs.

### How supervised Learning Works?

let’s examine the problem of predicting annual income of person based on the number of years of higher education he/she has completed.

Inshort,

Build a model that approximates the relationship *f* between the number of years of higher education X and corresponding annual incomeY

#### Solution

There are two ways you can find a solution

**Engineering Solution**This solution would go by simple formula

income = ($5,000 * years_of_education) + baseline_income

**Learning Solution**Get information about some basic parameters like Degree Type, Years of work experience, School tiers etc. This will helps use to find the income easily.

For example: *“If they completed a Bachelor’s degree or higher, give the income estimate a 1.5x multiplier.”*

### Goal of supervised Learning

- Learns the relationship between income & education from
*scratch*, by running labeled training data through a learning algorithm. - This learned function can be used to estimate the income of people whose income Y is unknown, as long as we have years of education X as inputs.
- Inshort, we can apply our model to the
**unlabelled**test data to estimate

Predict Y as accurately as possible when given new examples where X is known and Y is unknown.

**Regression — predicting continuous values**

— Predicts a continuous target variable Y.

— Estimate a value, such as housing prices or human lifespan, based on input data X.

— Continuous means there aren’t gaps (discontinuities) in the value that Y can take on. Eg. A person’s weight and height are continuous values.

— Discrete Variables means it can only take on a finite number of values Eg. Number of kids somebody has is a discrete variable.

Predicting income is a classic regression problem. Your input data X includes all relevant information about individuals in the data set that can be used to predict income, such as years of education, years of work experience, job title, or zip code.

— These attributes are called **Features**, which can be numerical(e.g. years of work experience) or categorical (e.g. job title or field of study).

You’ll want as many training observations as possible relating these features to the target output Y, so that your model can learn the relationship *f* between X and Y.

### Linear regression(ordinary least squares)

We have our data set X*,* and corresponding target values Y. The goal of ordinary least squares (OLS) regression is to learn a linear model that we can use to predict a new *y* given a previously unseen *x* with as little error as possible. We want to guess how much income someone earns based on how many years of education they received.

*β0* is the y-intercept and *β1* is the slope of our line, i.e. how much income increases (or decreases) with one additional year of education.

Our goal is to learn the model parameters (in this case, *β0* and *β1*) that minimize error in the model’s predictions.

To find the best parameters:

*Define a cost function, or loss function, that measures how inaccurate our model’s predictions are.*

2. Find the parameters that minimize loss, i.e. make our model as accurate as possible.