Data Mining Quiz 1
Question 1
True or False: Data Mining can be said to be a process designed to detect patterns in data sets.
Select one:
- True
- False
The correct answer is 'True'.
Question 2
True or False: In unsupervised learning, the learning algorithm must be trained using data attributes that have been paired with an outcome variable.
Select one:
- True
- False
The correct answer is 'False'.
Question 3
True or False: Unsupervised learning involves building a statistical model for predicting, or estimating an output based upon one or more inputs.
Select one:
- True
- False
The correct answer is 'False'.
Question 4
Regression analysis involves developing a model where one or more inputs are used to predict an output variable. Regression, in this context, represents what kind of learning?
Select one:
- a. Reinforcement learning
- b. Supervised learning
- c. Unsupervised learning
- d. Hybrid Learning
The correct answer is: Supervised learning
Question 5
Assume that we have a data set that includes sales data for every customer over the course of several years, and we want to use this data to predict future sales. Which technique would be the most appropriate to investigate?
Select one:
- a. Classification
- b. Regression
- c. Clustering
- d. Decision Trees
The correct answer is: Regression
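To illustrate why regression fits a sales forecast, here is a minimal sketch of an ordinary least squares line fit in plain Python. The helper name `fit_line` and the yearly sales figures are assumptions made up for illustration; they do not come from the quiz.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly sales totals (made-up numbers, for illustration only).
years = [1, 2, 3, 4, 5]
sales = [100.0, 120.0, 140.0, 160.0, 180.0]

slope, intercept = fit_line(years, sales)

# Predict a numeric (continuous) outcome for a year not in the data.
forecast_year_6 = slope * 6 + intercept
print(forecast_year_6)
```

The output is a number rather than a category, which is what separates regression from classification.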
Question 6
Assume that you have a variety of data, including medical history, diet, and hereditary factors, on individuals who developed cancer, and you want to use this data to determine whether a person is likely to develop cancer. Which technique would be the most promising to start with?
Select one:
- a. Classification
- b. Regression
- c. Clustering
- d. Estimation
The correct answer is: Classification
Question 7
Which of the following is an example of an unsupervised learning algorithm?
Select one:
- a. Linear Regression
- b. ID3 Decision Tree
- c. K-Means
- d. K-Nearest Neighbors
The correct answer is: K-Means
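As a rough sketch of why K-Means counts as unsupervised, the plain-Python version below groups points using only their coordinates, with no outcome labels. The function name `kmeans`, the sample points, and the choice of the first k points as starting centroids are assumptions for illustration; real implementations usually pick starting centroids more carefully.

```python
def kmeans(points, k, iterations=10):
    """Basic K-Means: alternate between assigning points and moving centroids."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters


# Made-up 2-D points forming two visible groups; note there are no labels.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```

By contrast, linear regression, ID3, and K-Nearest Neighbors all need an outcome value attached to each training example.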
Question 8
True or False: A prediction outcome variable must be categorical.
Select one:
- True
- False
The correct answer is 'False'.
Question 9
Which of the following is NOT a machine learning technique?
Select one:
- a. Regression
- b. Clustering
- c. Linear Components Analytics
- d. Neural Networks
The correct answer is: Linear Components Analytics
Question 10
True or False: In a supervised learning model, Bias refers to the error that is introduced from the assumptions of the data analyst.
Select one:
- True
- False
The correct answer is 'False'.
Question 11
The objective of ______ is to identify valid, novel, potentially useful, and understandable correlations and patterns in existing data.
The correct answer is: data mining
Question 12
Which of the following is an example of a NoSQL analytics database?
Select one:
- a. IBM DB2
- b. Oracle
- c. Cassandra
- d. Greenplum
The correct answer is: Cassandra
Question 13
What does ETL stand for?
The correct answer is: Extract, transform, load
Question 14
True or False: In a data warehouse, unidimensional data is stored in a star schema format.
Select one:
- True
- False
The correct answer is 'False'.
Question 15
What does the term OLAP stand for?
Select one:
- a. Online Applications Processing
- b. Online Analytical Processing
- c. Online Transactional Processing
- d. Online Limited Analytics Processing
The correct answer is: Online Analytical Processing
Question 16
What is a database called when all of the values for a particular column are stored contiguously?
Select one:
- a. Column-oriented storage
- b. In memory database
- c. Partitioning
- d. Data Compression
The correct answer is: Column-oriented storage
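The difference between row-oriented and column-oriented layouts can be sketched with ordinary Python containers. This is only an in-memory analogy with made-up records and field names, not how any particular database stores data on disk.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 20.0},
    {"id": 3, "region": "east", "amount": 5.0},
]

# Column-oriented layout: all values of one column sit next to each other.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one column only has to scan that column's list,
# instead of walking every field of every record.
total_amount = sum(columns["amount"])
print(columns["amount"], total_amount)
```

Analytical queries often touch a few columns across many rows, which is the access pattern this layout favors.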
Question 17
True or False: The snowflake schema differs from the star schema in that the tables holding the dimensional data are normalized.
Select one:
- True
- False
The correct answer is 'True'.
Question 18
True or False: Map/Reduce refers to an optimized approach to process SQL queries.
Select one:
- True
- False
The correct answer is 'False'.
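Map/Reduce is a general programming model for processing large data sets in parallel rather than a SQL optimization. The single-process word count below is a minimal sketch of the map, shuffle, and reduce phases; the function names and sample documents are assumptions for illustration, and a real framework would distribute these phases across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Made-up input documents.
documents = ["data mining finds patterns", "mining data at scale"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Nothing in the model involves SQL: the user supplies only a map function and a reduce function.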
Question 19
True or False: Information Retrieval or text analytics is NOT a form of data mining.
Select one:
- True
- False
The correct answer is 'False'.
Question 20
Which of the following is NOT a statistical processing software package?
Select one:
- a. SAS
- b. Minitab
- c. Vertica
- d. Mahout
The correct answer is: Vertica