Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.

Last Updated on October 22, 2021 by Admin 3

HOTSPOT

You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).

The remaining 1,000 rows represent class 1 (10 percent).

The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.

You need to configure the module.

Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

DP-100 Designing and Implementing a Data Science Solution on Azure Part 06 Q12 109 Question
DP-100 Designing and Implementing a Data Science Solution on Azure Part 06 Q12 109 Answer

Explanation:

Box 1: 300
If you type 300 (%), the module adds synthetic minority cases equal to 300 percent of the original minority count: 3,000 new rows on top of the original 1,000, for 4,000 class 1 rows in total.
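The arithmetic behind Box 1 can be checked with a short sketch. It assumes the SMOTE percentage is read as the number of synthetic rows to add, expressed as a percentage of the original minority count; the function names here are hypothetical and are not part of Azure Machine Learning Studio.

```python
def minority_rows_after_smote(original_minority, smote_percentage):
    """Total minority rows after oversampling (assumed interpretation).

    Assumption: the module adds original_minority * smote_percentage / 100
    synthetic rows and keeps all original rows.
    """
    synthetic = original_minority * smote_percentage // 100
    return original_minority + synthetic


def smote_percentage_for_target(original_minority, target_minority):
    """Percentage needed to reach a target minority count (same assumption)."""
    return (target_minority - original_minority) * 100 // original_minority


print(minority_rows_after_smote(1000, 300))    # 1,000 original + 3,000 synthetic
print(smote_percentage_for_target(1000, 4000))
```

Under that assumption, reaching 4,000 class 1 rows from 1,000 requires a SMOTE percentage of 300, which matches Box 1.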

Box 2: 5
The requirement is to use 5 data rows, so set Number of nearest neighbors to 5.
Use the Number of nearest neighbors option to determine the size of the feature space that the SMOTE algorithm uses when building new cases. A nearest neighbor is a row of data (a case) that is very similar to some target case. The distance between any two cases is measured by combining the weighted vectors of all features.

By increasing the number of nearest neighbors, you get features from more cases.
By keeping the number of nearest neighbors low, you use features that are more like those in the original sample.
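To illustrate how the number of nearest neighbors affects the new cases, here is a minimal sketch of the general SMOTE idea in plain Python. It is not the Studio module's implementation: it assumes unweighted Euclidean distance, and the names `k_nearest`, `smote_sketch`, and `minority_rows` are hypothetical.

```python
import math
import random


def k_nearest(target_index, points, k):
    """Indices of the k points closest to points[target_index] (Euclidean)."""
    target = points[target_index]
    others = [i for i in range(len(points)) if i != target_index]
    others.sort(key=lambda i: math.dist(target, points[i]))
    return others[:k]


def smote_sketch(minority, smote_percentage, k, seed=0):
    """Create synthetic rows by interpolating toward random nearest neighbors."""
    rng = random.Random(seed)
    per_row = smote_percentage // 100  # synthetic rows generated per original row
    synthetic = []
    for i, row in enumerate(minority):
        neighbors = k_nearest(i, minority, k)
        for _ in range(per_row):
            neighbor = minority[rng.choice(neighbors)]
            gap = rng.random()  # position on the segment from row to neighbor
            synthetic.append(
                tuple(a + gap * (b - a) for a, b in zip(row, neighbor))
            )
    return synthetic


# Toy minority class with two features per row (made-up illustration data).
minority_rows = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2),
                 (1.2, 2.5), (1.8, 1.6), (2.1, 2.4)]
new_rows = smote_sketch(minority_rows, smote_percentage=300, k=5)
print(len(minority_rows) + len(new_rows))  # 6 original + 18 synthetic rows
```

Each synthetic row lies on the line segment between an original row and one of its k nearest neighbors, so a larger k draws on more cases, while a smaller k keeps new rows closer to the original sample.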
