Feature Engineering | Data Science
Feature Engineering in Data Science: In this tutorial, we are going to learn about feature engineering, the steps to perform feature engineering, etc.
Submitted by Kartiki Malik, on March 17, 2020
Feature Engineering
Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. If feature engineering is done properly, it increases the predictive power of machine learning algorithms by creating features from raw data that help the learning process.
The steps involved while solving any problem in machine learning are as follows:
- Gathering data.
- Cleaning data.
- Engineering features.
- Defining the model.
- Training the model, testing it, and predicting the output.
Feature engineering is the most important art in machine learning, and it makes a huge difference between a good model and a bad model. Feature engineering covers a few techniques:
Let's assume that we have a dataset of "flight date-time vs status". Given the date-time data, we have to predict the status of the flight.
The status of the flight depends on the hour of the day, not on the full date-time. So we will create a new feature, "Hour_Of_Day". Using the "Hour_Of_Day" feature, the machine learns better, as this feature is directly related to the status of the flight.
Creating the new feature "Hour_Of_Day" is an example of feature engineering.
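As a minimal sketch of this idea with pandas (the data, column names, and status labels below are made up for illustration):

```python
import pandas as pd

# Hypothetical flight data: a departure timestamp and the status label.
flights = pd.DataFrame({
    "departure_time": pd.to_datetime([
        "2020-03-01 06:15", "2020-03-01 18:40", "2020-03-02 06:05",
    ]),
    "status": ["on_time", "delayed", "on_time"],
})

# Derive the new feature from the raw timestamp: keep only the hour.
flights["Hour_Of_Day"] = flights["departure_time"].dt.hour

print(flights["Hour_Of_Day"].tolist())  # → [6, 18, 6]
```

The model is then trained on "Hour_Of_Day" instead of the raw date-time, which carries the signal in a much more direct form.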
Let's see another example. Suppose we are given the latitude, the longitude, and other data with the label "Price_Of_House", and we want to predict the price of a house in that area. The latitude and longitude are not of much use on their own. Here we use the crossed-column technique: we combine the latitude and the longitude to create one feature. Combining them into one feature helps the model learn better.
Here, combining two features to make one useful feature is feature engineering.
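One common way to cross two continuous coordinates is to bucket each one and concatenate the bucket ids, so the crossed feature identifies a grid cell. A minimal sketch with pandas, using made-up coordinates and prices:

```python
import pandas as pd

# Hypothetical housing data: raw coordinates plus the target price.
houses = pd.DataFrame({
    "latitude":  [37.77, 37.78, 40.71],
    "longitude": [-122.42, -122.41, -74.01],
    "Price_Of_House": [900000, 950000, 750000],
})

# Bucket each coordinate (1-degree cells here), then cross the buckets
# into a single categorical feature identifying the grid cell.
lat_bin = (houses["latitude"] // 1).astype(int).astype(str)
lon_bin = (houses["longitude"] // 1).astype(int).astype(str)
houses["lat_lon_cross"] = lat_bin + "_" + lon_bin

print(houses["lat_lon_cross"].tolist())  # → ['37_-123', '37_-123', '40_-75']
```

The first two houses fall into the same cell, so the model can learn one price level for that neighbourhood; in practice the cell size would be tuned to the data.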
Sometimes, we use the bucketed-column technique. Suppose we are given data in which one column is the age and the output is a classification (X, Y, Z). By inspecting the data, we find that the output (X, Y, Z) depends on the age group: ages 11–20 map to X, 21–40 to Y, and 41–70 to Z. Here, we will create three buckets for the age groups 11–20, 21–40, and 41–70, and create a new bucketed column "Age_Range" with the numerical values 1, 2, and 3, where 1 is mapped to bucket 1, 2 is mapped to bucket 2, and 3 is mapped to bucket 3.
Here, creating the Age_Range bucket is feature engineering.
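The bucketing above can be sketched with pandas' `cut` function (the ages below are invented for illustration):

```python
import pandas as pd

# Hypothetical data with a raw age column.
people = pd.DataFrame({"age": [15, 25, 60, 12, 38]})

# Map each age into the three buckets 11-20, 21-40, 41-70,
# labelled 1, 2 and 3 respectively.
people["Age_Range"] = pd.cut(
    people["age"],
    bins=[10, 20, 40, 70],   # intervals (10, 20], (20, 40], (40, 70]
    labels=[1, 2, 3],
).astype(int)

print(people["Age_Range"].tolist())  # → [1, 2, 3, 1, 2]
```

The classifier then sees the single coarse feature "Age_Range" instead of the raw age, matching the pattern observed in the data.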
Sometimes, removing an unwanted feature is also feature engineering, because a feature that is not related to the target degrades the performance of the model.
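For example, an identifier column carries no predictive signal and is best dropped before training. A small sketch with pandas, reusing the hypothetical flight data from earlier (column names are made up):

```python
import pandas as pd

flights = pd.DataFrame({
    "Hour_Of_Day": [6, 18, 6],
    "ticket_id": ["A1", "B2", "C3"],  # unique per row: no predictive value
    "status": ["on_time", "delayed", "on_time"],
})

# Remove the unrelated feature before training the model.
flights = flights.drop(columns=["ticket_id"])

print(list(flights.columns))  # → ['Hour_Of_Day', 'status']
```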
Now, the steps to perform feature engineering are:
- Analyze the features.
- Create new features.
- Check how the features work with the model.
- Start again from the first step until the features work well.
This is what we do in feature engineering.
Last but not least, automated feature engineering is a current hot topic. However, it requires a lot of resources, and only a few companies have started working on it.