Data Mining Quiz 2
Question 1
True or False: NoSQL databases provide greater performance at the expense of availability.
Select one:
- True
- False
The correct answer is 'False'.
Question 2
Which of the following is an example of a parametric approach?
Select one:
- a. KNN Classifier
- b. Bayes Classifier
- c. Linear Regression
- d. Principal Components Analysis
The correct answer is: Linear Regression
Question 3
True or False: Collinearity refers to a situation in which two or more predictor variables are closely related to each other.
Select one:
- True
- False
The correct answer is 'True'.
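One simple way to spot two closely related predictors is to look at their pairwise correlation. The sketch below is a minimal illustration under stated assumptions: the predictor names (`rating`, `limit`, `age`) and their values are hypothetical, and `pearson` is an illustrative helper, not something defined in the quiz. Pairwise correlation can miss collinearity that involves three or more predictors jointly, so treat it as a first check only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - x_bar) ** 2 for x in xs))
    sy = math.sqrt(sum((y - y_bar) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical predictors: `limit` is almost exactly 10 * `rating`
rating = [1, 2, 3, 4, 5]
limit = [10.1, 19.8, 30.2, 39.9, 50.0]
age = [30, 25, 41, 28, 36]

r_collinear = pearson(rating, limit)  # very close to 1: collinear pair
r_unrelated = pearson(rating, age)    # much weaker relationship
print(r_collinear, r_unrelated)
```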
Question 4
The values of x and their corresponding values of y are shown in the table below. Identify the linear regression model in the form y = mx + b and report the values of m (slope) and b (intercept), as well as the estimated value of y when the value of x is 10. Round to the nearest hundredths place.
The correct answer is: b = 2.2, m = 0.9, y = 11.2
Question 5
Which command will provide descriptive statistics for the Boston data frame?
Select one:
- a. summary(Boston)
- b. eval(Boston)
- c. coef(Boston)
- d. stats(Boston)
The correct answer is: summary(Boston)
Question 6
True or False: The library() function lists all of the libraries that are loaded into memory within R.
Select one:
- True
- False
The correct answer is 'False'.
Question 7
The values of x and their corresponding values of y are shown in the table below. Identify the linear regression model in the form y = mx + b and report the values of m (slope) and b (intercept), as well as the estimated value of y when the value of x is 3.
The correct answer is: b = 4.48, m = 0.66, y = 6.48
Question 8
Residual plots are a useful tool for identifying:
Select one:
- a. Non-linearity
- b. Linearity
- c. Polynomial relationships
- d. Non-parametric relationships
The correct answer is: Non-linearity
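To see why residuals reveal non-linearity, the sketch below fits a straight line to data that actually follows a curve and prints the residuals (observed minus fitted). It is a minimal illustration, assuming a simple least-squares fit; the data and the helper name `least_squares` are hypothetical.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the simple least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Hypothetical data from a curved relationship, y = x^2
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

intercept, slope = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle. A U-shaped pattern like this in a residual plot suggests that a straight line is missing curvature in the data.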
Question 9
What R command could we use to generate a scatterplot diagram of our data to determine if it forms a linear pattern that would be suitable for linear regression or a non-linear pattern that would require some other technique?
Select one:
- a. plot()
- b. hist()
- c. matrix()
- d. summary()
The correct answer is: plot()
Question 10
True or False: Logistic regression can be used to predict a continuous variable.
Select one:
- True
- False
The correct answer is 'False'.
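Logistic regression models the probability of a class, so its raw output always lies between 0 and 1 and is usually thresholded into a category rather than read as a continuous prediction. The sketch below illustrates this with hypothetical coefficients (`beta0`, `beta1`) and an assumed 0.5 threshold; neither comes from the quiz.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only
beta0, beta1 = -3.0, 1.5

def predict_class(x, threshold=0.5):
    """Return class 1 if the estimated probability reaches the threshold, else 0."""
    p = logistic(beta0 + beta1 * x)
    return 1 if p >= threshold else 0

inputs = [0, 1, 4]
probs = [logistic(beta0 + beta1 * x) for x in inputs]
labels = [predict_class(x) for x in inputs]
print(labels)  # [0, 0, 1]
```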
Question 11
True or False: The fix() function identifies inconsistent values within a data frame and automatically corrects them.
Select one:
- True
- False
The correct answer is 'False'.
Question 12
A linear regression model is expressed as y ≈ β0 + β1x, where β0 is the intercept and β1 is the slope of the line. The following equations can be used to compute the values of the coefficients β0 and β1.
Using the following set of data, find the coefficients β0 and β1 rounded to the nearest thousandths place and the predicted value of y when x is 10.
The correct answer is: β0 = 1.9, β1 = 1.7, y = 18.9 when x is 10
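A minimal sketch of the computation, assuming the equations in question are the standard least-squares estimates β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and β0 = ȳ − β1x̄. The data below is hypothetical (it is not the quiz's data set), and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Return (beta0, beta1) from the simple least-squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # beta1 = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta1 = s_xy / s_xx
    # beta0 = y_bar - beta1 * x_bar
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

# Hypothetical data lying exactly on y = 1 + 2x (not the quiz's data set)
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

beta0, beta1 = fit_line(xs, ys)
prediction = beta0 + beta1 * 10  # predicted y when x is 10
print(beta0, beta1, prediction)  # 1.0 2.0 21.0
```

With the coefficients reported in the answer above, the same prediction step gives 1.9 + 1.7 × 10 = 18.9.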
Question 13
True or False: Linear regression is considered a non-parametric approach.
Select one:
- True
- False
The correct answer is 'False'.
Question 14
A farmer’s yield of corn is expressed as a linear regression model whose input variable is the number of days of sunlight during the growing season. The model is expressed as Y = mX + b, where Y is the estimated corn yield in bushels per acre, m = 1.38, and b = 42. Assuming that 67 days of sun are predicted for the growing season, what will the corn yield be in bushels per acre?
______ bushels per acre
The correct answer is: 134.46
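The arithmetic can be checked by substituting the given values into Y = mX + b: 1.38 × 67 + 42 = 92.46 + 42 = 134.46. A short sketch of the same calculation follows; the function name `predict_yield` is an illustrative assumption, not part of the quiz.

```python
def predict_yield(days_of_sun, m=1.38, b=42):
    """Estimated corn yield (bushels per acre) from Y = m*X + b."""
    return m * days_of_sun + b

# 67 days of sun, as in the question; rounded to two decimals
corn_yield = round(predict_yield(67), 2)
print(corn_yield)  # 134.46
```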
Question 15
True or False: In the KNN algorithm, a small value for K provides the most flexible fit (low bias/high variance).
Select one:
- True
- False
The correct answer is 'True'.
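The effect of K can be seen on a tiny example. The sketch below is a minimal one-dimensional KNN classifier with majority voting; the training set is hypothetical and `knn_predict` is an illustrative name. With K = 1 the prediction follows a single stray training point (high variance), while a larger K averages it away.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 1-D training set; the point at x = 3 is a lone "B" among "A"s
train = [(1, "A"), (2, "A"), (3, "B"), (4, "A"), (5, "A"),
         (6, "B"), (7, "B"), (8, "B")]

print(knn_predict(train, 3.1, k=1))  # B
print(knn_predict(train, 3.1, k=5))  # A
```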