Data Mining Quiz 4. True/False: Supervised learning features both input variables or attributes and an output or predicted variable.

by Obasi Oj - April 6, 2023, 2:55 p.m.

Question 1

True/False: Supervised learning features both input variables or attributes and an output or predicted variable.

Select one:

True
False

The correct answer is 'True'.

Question 2

The sales of a company (in millions of dollars) for each year are shown in the table below. Identify the linear regression model in the form y=mx+b and report the values of m (slope) and b (intercept) as well as the estimated value of y when the value of x is 10.

NOTE: You should consider the value x as the elapsed time. For 2005 this would be 0 years, for 2006 it would be 1 year and for 2012 it would be 7 years. What is the value of b?

The correct answer is: 11.6

Question 3

True/False: Shared nothing architectures distribute the processing of queries to access large volumes of data and provide near linear scalability in both storage volume and query performance.

Select one:

True
False

The correct answer is 'True'.

Question 4

The sales of a company (in millions of dollars) for each year are shown in the table below. Identify the linear regression model in the form y=mx+b and report the values of m (slope) and b (intercept) as well as the estimated value of y when the value of x is 10.

NOTE: You should consider the value x as the elapsed time. For 2005 this would be 0 years, for 2006 it would be 1 year and for 2012 it would be 7 years. What is the value of m?

The correct answer is: 8.4

Question 5

The income of a company that produces disaster equipment has been expressed as a linear regression model based upon the input variable, which is the number of hurricanes projected for the upcoming hurricane season. The model is expressed as Y = mX + b, where Y is the estimated sales in millions of dollars, m = .67 and b = 8.2. Assuming that the weather service is predicting 12 hurricanes during the season, what are the sales in millions of dollars expected to be?

The correct answer is: 16.24
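
The arithmetic behind this answer can be checked with a minimal sketch. The function name `predict` is only an illustrative choice; the slope, intercept, and hurricane count come from the question.

```python
def predict(x, m, b):
    """Evaluate the linear model y = m*x + b."""
    return m * x + b


# Values from the question: m = 0.67, b = 8.2, X = 12 hurricanes.
sales = predict(12, m=0.67, b=8.2)

# Round to two decimals to hide floating-point noise.
print(round(sales, 2))  # 16.24
```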

Question 6

Assume you have a linear model, with m = .05 and b = 10, that explains the relationship between income and credit extended. If income is 50,000, what credit will be extended?

Select one:

a. 500
b. 5010
c. 20508.4
d. 2510

The correct answer is: 2510

Question 7

The sales of a company (in millions of dollars) for each year are shown in the table below. Identify the linear regression model in the form y=mx+b.

NOTE: You should consider the value x as the elapsed time. For 2005 this would be 0 years, for 2006 it would be 1 year and for 2012 it would be 7 years. What is the predicted value of y (in millions of dollars) when the year is 2012?

The correct answer is: 70.4
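
This value is consistent with the slope and intercept reported in Questions 4 and 2 (m = 8.4, b = 11.6). The sketch below assumes those two values and the elapsed-time convention from the NOTE (x = year - 2005); the names `BASE_YEAR` and `predict_sales` are illustrative only.

```python
M = 8.4           # slope, from the answer to Question 4
B = 11.6          # intercept, from the answer to Question 2
BASE_YEAR = 2005  # x = 0 corresponds to 2005, per the NOTE


def predict_sales(year):
    """Predicted sales (millions of dollars) for a calendar year."""
    elapsed = year - BASE_YEAR
    return M * elapsed + B


print(round(predict_sales(2012), 1))  # 70.4  (x = 7)
print(round(M * 10 + B, 1))           # 95.6  (x = 10)
```

The second line evaluates the same model at x = 10, the value asked about in the full question text.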

Question 8

True/False: The snowflake schema differs from the star schema in that the tables holding the dimensional data are normalized.

Select one:

True
False

The correct answer is 'True'.

Question 9

True/False: Data Mining can be said to be a process designed to detect patterns in data sets.

Select one:

True
False

The correct answer is 'True'.

Question 10

True/False: According to our textbook, residual plots are a useful tool for identifying clusters.

Select one:

True
False

The correct answer is 'False'.

Question 11

True/False: A regression model has an R² statistic of .15. This indicates that the regression model is NOT a good fit and does a poor job of predicting the outcome based upon the input variables.

Select one:

True
False

The correct answer is 'True'.

Question 12

Assume that you have a data set which produces the following data plot. You wish to predict if a new case would be a ‘red’ case as opposed to a ‘blue’ case based upon the input attribute data. Which technique should you use?

Select one:

a. Linear Regression
b. Curvilinear Regression
c. Spline Regression
d. Logistic Regression

The correct answer is: Logistic Regression

Question 13

True/False: Reinforcement learning features elements of both supervised learning and unsupervised learning as the outcome variable or predicted values are validated over time and feedback is used to continuously train the learning algorithm.

Select one:

True
False

The correct answer is 'True'.

Question 14

True or False: Qualitative variables are often referred to as categorical.

Select one:

True
False

The correct answer is 'True'.

Question 15

Which of the following is NOT a classification technique?

Select one:

a. Logistic regression
b. Linear discriminant analysis
c. K-nearest neighbors
d. Principal components analysis

The correct answer is: Principal components analysis

Question 16

True or False: Bayes' theorem classifies cases by calculating the probability that the case belongs to each class and then selecting the one with the highest probability.

Select one:

True
False

The correct answer is 'True'.

Question 17

The value of K should typically be an odd number for what reason?

Select one:

a. It ensures that when classifying a solution there will not be a tie
b. It makes the iterative process of the algorithm more efficient
c. It enables the algorithm to be implemented using recursion
d. None of these answers

The correct answer is: It ensures that when classifying a solution there will not be a tie
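
The tie problem can be illustrated with a small sketch of KNN voting. The training points and the query point below are hypothetical (they are not taken from the quiz figures), and `neighbor_votes` is an illustrative helper name. Note that an odd K rules out ties only when there are two classes, as with Red and Blue here.

```python
import math
from collections import Counter


def neighbor_votes(point, labeled_points, k):
    """Count the class labels among the k nearest neighbors of point."""
    nearest = sorted(
        labeled_points,
        key=lambda item: math.dist(point, item[0]),  # Euclidean distance
    )[:k]
    return Counter(label for _, label in nearest)


# Hypothetical labeled points, for illustration only.
training = [
    ((1.0, 1.0), "red"),
    ((2.0, 1.5), "red"),
    ((0.5, 2.0), "red"),
    ((3.0, 3.0), "blue"),
    ((3.5, 2.5), "blue"),
]
x = (2.4, 2.0)

for k in (2, 3):
    print(k, dict(neighbor_votes(x, training, k)))
# K=2 yields one red and one blue vote (a tie);
# K=3 yields two blue votes and one red vote, so X is classified blue.
```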

Question 18

Assuming K=1, how would the point X be classified using KNN?

Select one:

a. Red
b. Blue

The correct answer is: Blue

Question 19

Assuming K=3, how would the point X be classified using KNN?

Select one:

a. Red
b. Blue

The correct answer is: Red

Question 20

Assuming K=3, how would the point X be classified using KNN?

Select one:

a. Red
b. Blue

The correct answer is: Red

Question 21

Assuming K=5, how would the point X be classified using KNN?

Select one:

a. Red
b. Blue

The correct answer is: Red

Question 22

Assuming you have the following data values (4,6,9,20,8,7), what is the min-max normalized value for 6?

X' = (X_v - min(X)) / (max(X) - min(X))

Where X is the set of data values and X_v is the value to score. Provide your response rounded to the thousandths place: ___

The correct answer is: 0.125
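
A minimal sketch of the min-max calculation, assuming the standard formula (value minus minimum, divided by the range). The function name `min_max_normalize` is illustrative.

```python
def min_max_normalize(value, data):
    """Scale value to [0, 1] relative to the min and max of data."""
    lo, hi = min(data), max(data)
    return (value - lo) / (hi - lo)


data = [4, 6, 9, 20, 8, 7]

# (6 - 4) / (20 - 4) = 2 / 16
print(round(min_max_normalize(6, data), 3))  # 0.125
```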

Question 23

Assuming you have the following data values (3,6,9,14,2), what is the Z-Score normalized value for 5?

Z = (X_v - mean(X)) / SD(X)

Where X is the set of data values and X_v is the value to score. Provide your response rounded to the thousandths place: ___

The correct answer is: -0.37
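
The stated answer matches a Z-score computed with the sample standard deviation (dividing by n - 1). The sketch below, using Python's standard `statistics` module, shows that calculation next to the population version for comparison; the function name `z_score` and its `sample` flag are illustrative.

```python
import statistics


def z_score(value, data, sample=True):
    """Z-score of value relative to data.

    sample=True uses the sample standard deviation (n - 1 denominator);
    sample=False uses the population standard deviation (n denominator).
    """
    mean = statistics.mean(data)
    sd = statistics.stdev(data) if sample else statistics.pstdev(data)
    return (value - mean) / sd


data = [3, 6, 9, 14, 2]

print(round(z_score(5, data), 3))                # -0.37, matches the quiz answer
print(round(z_score(5, data, sample=False), 3))  # -0.413, population version
```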

Question 24

Assume that you are the data scientist for the GreatFoods! Supermarket chain. In an effort to increase sales of locally produced food such as eggs, milk, and bread, your manager asks you to develop a data mining solution that can identify the probability that a customer will purchase eggs when they purchase milk and vice versa. Which technique are you most likely to use?

Select one:

a. Linear Regression
b. K-nearest neighbors classification
c. Bayes Classifier
d. Hierarchical clustering

The correct answer is: Bayes Classifier

Tags: College Courses Data Mining Quizzes

by Oj Obasi - April 5, 2023, 3:14 a.m.
