Big Data Practice College Quiz 2 Questions
-
Key-Value Storage, Document Storage, Graph Storage are types of:
- a.NoSQL
- b.DB2
- c.Oracle
- d.SQL Server
-
Hadoop was started by Doug Cutting and Mike Cafarella in the year ___, when they both started to work on the ___ project.
- a.2002, Apache Nutch
- b.2001, Google NFS
- c.2000, Hadoop MapReduce
- d.2003, Hadoop YARN
-
Name the default engine used in new deployments of MongoDB for versions 3.2 or higher.
- a.SQL DBMS
- b.Snowflake
- c.WiredTiger
- d.MMAPv1
-
Your boss asks you about Hadoop Distributed File System, wanting to know how you would expand the storage capacity of your current Hadoop environment. You tell him you can:
- a.Add more servers to expand the storage capabilities.
- b.Start a new instance of Hadoop to handle the overflow.
- c.You canʼt add more space given the limitations of the program
- d.Add an additional instance of Hadoop to your cluster configuration
-
You are presented with the following data structure:
{
  _id: <User1>,
  username: "JohnDoe",
  firstname: "John",
  lastname: "Doe",
  age: 20,
  groups: ["politics", "news"]
}
What type of database allows this sort of structure to be stored and retrieved using a non-structured querying language?
- a.MongoDB
- b.RDBMS
- c.OLAP
- d.Oracle
-
A four-step approach which is used to build or design data products:
- a.Big Data Principle Approach
- b.Drivetrain Approach
- c.Predict-and-optimise framework
- d.CRISP-DM
-
______ is a familiar example of a data product based on well-built predictive models that do not achieve an optimal objective.
Select one
- a.Recommendation engine
- b.Predictive Revenue Modeling
- c.Python data products
- d.Big data forecasting
Intro to Artificial Intelligence Practice Questions 1
-
Which of the following is NOT a way that AI learns?
Select One:
- Intuitive learning
- Reinforcement learning
- Unsupervised learning
- Supervised learning
-
Which of the following are applications of Artificial Intelligence in action?
A. IBM Watson utilizing its information retrieval capabilities to provide technical information to oil and gas company workers.
B. Watson analyzing Grammy nominated song lyrics over a 60-year period and categorizing them based on their emotions.
C. Assisting patients with neurological damage by detecting patterns in massive movement related datasets and using robots to trigger specific movements in the human body to create new neural pathways in the brain.
D. Law enforcement authorities using facial recognition algorithms to identify suspects in multiple streams of video footage
Select One
- Only option A is correct
- Only options A, B, and C are correct
- None of the options are correct
- All of the options are correct
-
Which of the following aspects involved in converting the stethoscope into a digital device to support patient diagnoses involves the use of AI?
- An app on the mobile device that applies learnings from previous diagnosis data to assist the physicians in their current diagnoses
- Sending digital signals to a mobile device with a machine learning app via bluetooth
- Graphing heart beat data on the mobile device allowing a physician to spot trends
- Inserting a digitizer into the stethoscope tube to convert the analog sound of the heart beat into a digital signal
-
Which of these is currently NOT an application of Collaborative Robots or Cobots?
Select One:
- Robots assisting or replacing humans in jobs that may be dull, dangerous, ineffective or inefficient when done by humans
- Robots helping humans lift heavy containers
- Personal use in the home such as doing the laundry and cooking for example
- Robots helping move items on shelves for stocking purposes
-
Advances in the field of Computer Vision make which of the following possible?
Select One:
- Detecting fraudulent transactions
- Detecting cancerous moles in skin images
- On-demand online tutors
- Real-time transcription
-
Natural Language AI algorithms that learn by example are the reason we can talk to machines and they can talk back to us.
Select One:
- True
- False
-
Which of these is NOT a current application of AI?
Select one:
- Self-Driving vehicles utilizing Computer Vision to navigate around objects
- Collaborative Robots helping humans lift heavy containers
- Making precise patient diagnosis and prescribing independent treatment
- Classifying rock samples to identify best places to drill for oil
-
AI is the fusion of many fields of study. Which of these fields, along with Computer Science, plays a role in the application of AI?
- Philosophy
- All responses are correct
- Statistics
- Mathematics
-
Which of the following is an attribute of Strong or Generalized AI?
- Cannot teach itself new strategies
- Perform independent tasks
- Can perform specific tasks, but cannot learn new ones
- Operate with human-level consciousness
-
Which of the following is NOT a good way to define AI?
Select One or Multiple:
- AI is the application of computing to solve problems in an intelligent way using algorithms.
- AI is the use of algorithms that enable computers to find patterns without humans having to hard code them manually
- AI is all about machines replacing human intelligence.
- AI is Augmented Intelligence and is not intended to replace human intelligence rather extend human capabilities
Big Data Practice College Quiz Questions
-
With big data, you can analyze and assess production, customer feedback and returns, and other factors to reduce outages and anticipate future demands. This statement is true for which of the following business activities?
- a.Product development
- b.Operational efficiency
- c.Machine learning
- d.Drive innovation
-
Mobile devices not only make it possible to analyze behavioral data (such as clicks and search queries) but also to store and analyze location-based data (GPS data). This is true of which phase of big data?
- a.Phase 1
- b.Phase 2
- c.Phase 3
- d.Phase 4
-
Big Data addresses the following business activities except:
- a.Product development
- b.Customer experience
- c.Production enhancements
- d.Machine learning
-
Which country's data accuracy principle states: "Information shall be sufficiently accurate, complete, and up to date to minimize the possibility that inappropriate information may be used to make a decision about the individual"?
- a.United States of America
- b.United Kingdom
- c.Canada
- d.Mexico
-
Robert develops strategies for analyzing data, preparing data for analysis, exploring, analyzing, and visualizing data. Can you guess what Robertʼs job is?
- a.Data Scientist
- b.Database Engineer
- c.Database Developer
- d.Data Miner
-
The three different varieties of big data include structured, semi-structured, and unstructured data.
- a.True
- b.False
-
Getting started with big data involves these three key actions: Integrate, manage and analyze.
- a.True
- b.False
-
Which items below are the basic three elements of big data?
- a.Variety, Volume, and Variations
- b.Volume, Velocity, and Variety
- c.Variations, Vicinity, and Volume
- d.Vicinity, Variety, and Velocity
-
Big Data starts with large-volume, heterogeneous, independent sources with dispersed and also decentralized control.
- a.True
- b.False
-
A key challenge of big data research is to __ and ____ value reference layers of big data.
- a.justify, develop
- b.clarify, assign
- c.rate, create
- d.justify, assign
JAVA Programming College Quiz 1 Questions
-
What is output by the following Java program?
class Compute {
    static int compute() { return 42; }
    static int compute(int i) { return i+1; }
    public static void main(String[] args) {
        System.out.println(compute(compute(0)));
    }
}
Select one:
- a. 1
- b. 42
- c. 2
- d. 0
- e. 43
-
Consider the following Java declaration and assignment statement.
float x = y;
Which one of the following types is "y" NOT allowed to be?
Select one:
- a. double
- b. long
- c. int
- d. short
- e. float
-
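For the `float x = y;` question above, a minimal Java sketch of which primitive conversions the compiler accepts without a cast. The class and method names are hypothetical and exist only for illustration. Each `return y;` uses the same assignment conversion rules as `float x = y;`.

```java
// Sketch: which primitive types convert to float without an explicit cast.
class FloatAssignment {
    static float fromShort(short y) { return y; }   // widening conversion: allowed
    static float fromInt(int y)     { return y; }   // widening conversion: allowed
    static float fromLong(long y)   { return y; }   // widening conversion: allowed, may lose precision
    static float fromFloat(float y) { return y; }   // same type: allowed

    // static float bad(double y) { return y; }     // does not compile: narrowing needs a cast
    static float fromDouble(double y) { return (float) y; } // explicit cast required

    public static void main(String[] args) {
        System.out.println(fromInt(42));
        // A long that needs 25 significant bits is rounded when stored in a float.
        System.out.println(fromLong(16_777_217L));
        System.out.println(fromDouble(0.1));
    }
}
```

The long-to-float case shows that "allowed without a cast" does not mean "lossless": the conversion compiles, but the value may be rounded.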
Which of the following is NOT an effective strategy when your program does not work?
Select one:
- a. Make random changes to code that you do not understand until it accidentally works.
- b. Add debugging statements to output information about the state of your program while it runs.
- c. Check each error message generated by the compiler or IDE.
- d. Use a debugger to pause your program while it is running so you can check its state.
- e. Read through your code and figure out what it does step by step.
-
Which of the following keywords is useful for loops that should always execute at least once?
Select one:
- a. switch
- b. while
- c. continue
- d. break
- e. do
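A short Java sketch comparing a `while` loop with a `do ... while` loop when the condition is false from the start. The class and method names are hypothetical. The only point is how many times each body runs.

```java
// Sketch: a do-while body runs before the condition is first tested.
class DoWhileDemo {
    // Counts body executions of a while loop from start up to (not including) limit.
    static int whileRuns(int start, int limit) {
        int runs = 0;
        int i = start;
        while (i < limit) {
            runs++;
            i++;
        }
        return runs;
    }

    // Same loop written as do-while: the condition is checked after the body.
    static int doWhileRuns(int start, int limit) {
        int runs = 0;
        int i = start;
        do {
            runs++;
            i++;
        } while (i < limit);
        return runs;
    }

    public static void main(String[] args) {
        System.out.println("while:    " + whileRuns(10, 10));
        System.out.println("do-while: " + doWhileRuns(10, 10));
    }
}
```

When the condition is true at the start, both forms run the body the same number of times. They differ only when the condition is initially false.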
-
Which of the following can a class NOT be used for?
Select one:
- a. a container for static methods (subroutines)
- b. a container for static variables
- c. a primitive type
- d. a type for method parameters
- e. a type for variables
-
Consider the following Java method, which term best describes "static"?
public static void main(String[] args) {
    System.out.println("Hello, World!");
}
Select one:
- a. actual parameter or argument
- b. method call
- c. formal parameter
- d. modifier
- e. return type
-
Consider the following Java program:
1  public class HelloWorld {
2      // My first program!
3      public static void main(String[] args) {
4          System.out.println("Hello, World!");
5      }
6  }
What starts on line 1? Select one:
- a. a comment
- b. a class definition
- c. a variable declaration
- d. a statement
- e. a method (subroutine) definition
-
Consider the following Java program:
1  public class HelloWorld {
2      // My first program!
3      public static void main(String[] args) {
4          System.out.println("Hello, World!");
5      }
6  }
What is on line 3? Select one:
- a. a class definition
- b. a method (subroutine) definition
- c. a statement
- d. a variable declaration
- e. a comment
-
In a for loop, how many times does the update run?
Select one:
- a. At least once, at the end of each iteration.
- b. Zero or more times, at the end of each iteration.
- c. Zero or more times, at the beginning of each iteration.
- d. Exactly once.
- e. At least once, at the beginning of each iteration.
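One way to see how often each part of a `for` loop runs is to instrument it. The sketch below is illustrative only: the helper methods `check` and `step` and the counters are hypothetical names, used in place of the usual `i < n` and `i++`.

```java
// Sketch: count how often the continuation condition and the update of a for loop run.
class ForLoopCounts {
    static int conditionChecks;
    static int updates;

    // Stands in for the continuation condition "i < n".
    static boolean check(int i, int n) {
        conditionChecks++;
        return i < n;
    }

    // Stands in for the update "i++".
    static int step(int i) {
        updates++;
        return i + 1;
    }

    // Runs a loop with n iterations and returns {conditionChecks, updates}.
    static int[] count(int n) {
        conditionChecks = 0;
        updates = 0;
        for (int i = 0; check(i, n); i = step(i)) {
            // empty body: only the loop bookkeeping is of interest
        }
        return new int[] { conditionChecks, updates };
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            int[] c = count(n);
            System.out.println("iterations=" + n + " checks=" + c[0] + " updates=" + c[1]);
        }
    }
}
```

With n iterations, the condition is evaluated n + 1 times (including the final check that fails) and the update runs n times, so a loop with zero iterations still evaluates its condition once.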
-
Assume "test" is a boolean variable. Which of the following expressions is equivalent to "test == false"?
Select one:
- a. test
- b. !test
- c. test.equals(true)
- d. test = true
-
What is the output of the following Java program?
class Sum {
    static int sum = 0;
    static void add(int i) { i++; }
    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) add(sum);
        System.out.println(sum);
    }
}
Select one:
- a. 100
- b. 9
- c. 10
- d. 0
- e. 45
-
Each of the individual tasks that a CPU is working on is called:
Select one:
- a. a message
- b. a program counter
- c. a thread
- d. an object
- e. an address
-
Which one of the following is used in Java programming to handle asynchronous events?
Select one:
- a. protocols
- b. pragmatics
- c. reserved words
- d. short circuits
- e. event handlers
-
In a for loop, how many times does the continuation condition run?
Select one:
- a. Exactly once.
- b. At least once, at the end of each iteration.
- c. Zero or more times, at the beginning of each iteration.
- d. At least once, at the beginning of each iteration.
- e. Zero or more times, at the end of each iteration.
-
Consider the following line of Java code.
System.out.println("Hello, World!");
"Hello, World!" is which of the following?
Select one:
- a. a class
- b. a method (subroutine)
- c. an object
- d. a parameter
- e. a statement
-
Consider the following Java program:
1  public class HelloWorld {
2      // My first program!
3      public static void main(String[] args) {
4          System.out.println("Hello, World!");
5      }
6  }
What starts on line 4? Select one:
- a. a comment
- b. a method (subroutine) definition
- c. a class definition
- d. a statement
- e. a variable declaration
-
Which of the following types is NOT a primitive type?
Select one:
- a. short
- b. boolean
- c. String
- d. char
- e. double
-
Consider the following class definition. Which variables can be used in the missing "println" expression on line 20?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
public class PrintStuff { public static void main() { { int i = -1; System.out.println(_____); } int j = 1; for (j = 0; j < 10; j++) { System.out.println(_____); } { int k; for (k = 0; k < 10; k++) { System.out.println(_____); } } System.out.println(_____); } }
Select one:
- a. Only "j"
- b. Only "i"
- c. "i" and "j"
- d. "j" and "k"
- e. Only "k"
-
Consider the following block of Java code. How many times will it output "Hello"?
for (int i = 1; i < 10; i--) {
    System.out.println("Hello");
}
Select one:
- a. 0
- b. 9
- c. 1
- d. 10
- e. Way too many!
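The loop above does eventually stop, because Java `int` arithmetic wraps around on overflow. The sketch below avoids running the loop and instead shows the wraparound and derives the iteration count arithmetically. Class and method names are hypothetical.

```java
// Sketch: why "for (int i = 1; i < 10; i--)" terminates, and after how many iterations.
class LoopOverflow {
    // int subtraction wraps silently in Java (two's complement arithmetic).
    static int decrement(int i) {
        return i - 1;
    }

    // The body runs once for every value of i from 1 down to Integer.MIN_VALUE.
    // Computed in long so the count itself does not overflow.
    static long iterations() {
        return 1L - Integer.MIN_VALUE + 1L;
    }

    public static void main(String[] args) {
        // Decrementing the smallest int yields the largest int, which is not < 10.
        System.out.println(decrement(Integer.MIN_VALUE) == Integer.MAX_VALUE);
        System.out.println(iterations());
    }
}
```

The count is a little over two billion, which is why the practical answer is that the loop prints far more often than intended.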
-
Which of the following keywords is useful for skipping to the next iteration of a loop?
Select one:
- a. while
- b. break
- c. continue
- d. do
- e. switch
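A small Java sketch contrasting the two loop-control keywords from the options above. Method names are hypothetical. `continue` abandons the current iteration only, while `break` abandons the whole loop.

```java
// Sketch: continue skips to the next iteration, break leaves the loop.
class ContinueDemo {
    // Sums the odd numbers in 1..n by skipping even values with continue.
    static int sumOdd(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            if (i % 2 == 0) {
                continue; // jump to the update (i++), then re-test the condition
            }
            sum += i;
        }
        return sum;
    }

    // Sums 1, 2, 3, ... and stops as soon as i reaches stop.
    static int sumBefore(int n, int stop) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            if (i == stop) {
                break; // leave the loop entirely
            }
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOdd(10));
        System.out.println(sumBefore(10, 4));
    }
}
```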
Data Mining Quiz 4
-
Assume that you are the data scientist for the GreatFoods! Supermarket chain. In an effort to increase sales of locally produced food such as eggs, milk, and bread, your manager asks you to develop a data mining solution that can identify the probability that a customer will purchase eggs when they purchase milk and vice versa. Which technique are you most likely to use?
Select one:
- a. Linear Regression
- b. K-nearest neighbors classification
- c. Bayes Classifier
- d. Hierarchical clustering
-
Assuming you have the following data values (3,6,9,14,2), what is the Z-score normalized value for 5?
Where X is the set of data values and X_v is the value to score. Provide your response rounded to the thousandths place: ___
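The formula this question refers to did not survive in the text. A minimal Java sketch, assuming the usual definition z = (X_v - mean(X)) / sd(X), is shown below. Whether the course uses the population standard deviation (divide by n) or the sample standard deviation (divide by n - 1) is not stated, so the sketch supports both. All names are hypothetical.

```java
// Sketch: Z-score normalization under an assumed standard definition.
class ZScore {
    static double mean(double[] x) {
        double sum = 0.0;
        for (double xi : x) {
            sum += xi;
        }
        return sum / x.length;
    }

    // sample == true divides by n - 1, sample == false divides by n.
    static double stdDev(double[] x, boolean sample) {
        double m = mean(x);
        double squares = 0.0;
        for (double xi : x) {
            squares += (xi - m) * (xi - m);
        }
        int divisor = sample ? x.length - 1 : x.length;
        return Math.sqrt(squares / divisor);
    }

    static double zScore(double[] x, double v, boolean sample) {
        return (v - mean(x)) / stdDev(x, sample);
    }

    public static void main(String[] args) {
        double[] data = { 3, 6, 9, 14, 2 };
        System.out.printf("population: %.3f%n", zScore(data, 5, false));
        System.out.printf("sample:     %.3f%n", zScore(data, 5, true));
    }
}
```

The two conventions give different results at the thousandths place, so the expected answer depends on which one the course material uses.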
-
Assuming you have the following data values (4,6,9,20,8,7), what is the min-max normalized value for 6?
Where X is the set of data values and X_v is the value to score. Provide your response rounded to the thousandths place: ___
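The formula for this question is also missing from the text. The sketch below assumes the common min-max definition that rescales to the range [0, 1]: (X_v - min(X)) / (max(X) - min(X)). It also assumes the data contain at least two distinct values. Names are hypothetical.

```java
// Sketch: min-max normalization to the range [0, 1] (assumed definition).
class MinMax {
    static double normalize(double[] x, double v) {
        double min = x[0];
        double max = x[0];
        for (double xi : x) {
            min = Math.min(min, xi);
            max = Math.max(max, xi);
        }
        // Assumes max > min; otherwise the division is undefined.
        return (v - min) / (max - min);
    }

    public static void main(String[] args) {
        double[] data = { 4, 6, 9, 20, 8, 7 };
        System.out.printf("%.3f%n", normalize(data, 6));
    }
}
```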
-
Assuming K=5, how would the point X be classified using KNN?
Select one:
- a. Red
- b. Blue
-
Assuming K=3, how would the point X be classified using KNN?
Select one:
- a. Red
- b. Blue
-
Assuming K=3 how would the point X be classified using KNN?
Select one:
- a. Red
- b. Blue
-
Assuming K=1 how would the point X be classified using KNN?
Select one:
- a. Red
- b. Blue
-
The value of K should typically be an odd number for what reason?
Select one:
- a. It ensures that when classifying a solution there will not be a tie
- b. It makes iterative process of the algorithm more efficient
- c. It enables the algorithm to be implemented using recursion
- d. None of these answers
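The figures for the KNN questions above are not available, so the following Java sketch uses a small made-up set of labeled points purely for illustration. It classifies a query point by majority vote among its K nearest neighbors and reports a tie explicitly, which shows why an even K can be a problem with two classes. All names and coordinates are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch: two-class KNN by majority vote, with ties made visible.
class KnnVote {
    static final class Point {
        final double x;
        final double y;
        final String label;

        Point(double x, double y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    static double squaredDistance(Point p, double qx, double qy) {
        double dx = p.x - qx;
        double dy = p.y - qy;
        return dx * dx + dy * dy;
    }

    // Assumes 1 <= k <= train.length and labels "Red" or "Blue" only.
    static String classify(Point[] train, double qx, double qy, int k) {
        Point[] sorted = train.clone();
        Arrays.sort(sorted, Comparator.comparingDouble((Point p) -> squaredDistance(p, qx, qy)));
        int red = 0;
        int blue = 0;
        for (int i = 0; i < k; i++) {
            if (sorted[i].label.equals("Red")) {
                red++;
            } else {
                blue++;
            }
        }
        if (red == blue) {
            return "Tie";
        }
        return red > blue ? "Red" : "Blue";
    }

    // Made-up training data, not taken from the quiz figures.
    static Point[] samplePoints() {
        return new Point[] {
            new Point(1, 0, "Red"),
            new Point(0, 2, "Blue"),
            new Point(2, 2, "Blue"),
            new Point(3, 0, "Red"),
            new Point(0, 4, "Red")
        };
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 5; k++) {
            System.out.println("K=" + k + " -> " + classify(samplePoints(), 0, 0, k));
        }
    }
}
```

With these points and a query at the origin, odd values of K always produce a winner, while K = 2 and K = 4 split the vote evenly. The result can also change as K changes, as the repeated quiz questions with different K suggest.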
-
True or False: Bayes theorem classifies cases by calculating the probability that the case belongs to each class and then selecting the one with the highest probability.
Select one:
- True
- False
-
Which of the following is NOT a classification technique?
Select one:
- a. Logistic regression
- b. Linear discriminant analysis
- c. K-nearest neighbors
- d. Principal components analysis
-
True or False: Qualitative variables are often referred to as categorical.
Select one:
- True
- False
-
True/False: Reinforcement learning features elements of both supervised learning and unsupervised learning as the outcome variable or predicted values are validated over time and feedback is used to continuously train the learning algorithm.
Select one:
- True
- False
-
Assume that you have a data set which produces the following data plot. You wish to predict if a new case would be a ‘red’ case as opposed to a ‘blue’ case based upon the input attribute data. Which technique should you use?
Select one:
- a. Linear Regression
- b. Curvilinear Regression
- c. Spline Regression
- d. Logistic Regression
-
True/False: A regression model has an R² statistic of 0.15. This indicates that the regression model is NOT a good fit and does a poor job of predicting the outcome based upon the input variables.
Select one:
- True
- False
-
True/False: According to our textbook, residual plots are a useful tool for identifying clusters.
Select one:
- True
- False
-
True/False: Data Mining can be said to be a process designed to detect patterns in data sets.
Select one:
- True
- False
-
True/False: The snowflake schema differs from the star schema in that the tables holding the dimensional data are normalized.
Select one:
- True
- False
-
The sales of a company (in millions of dollars) for each year are shown in the table below. Identify the linear regression model in the form y=mx+b.
NOTE: You should consider the value x as the elapsed time. For 2005 this would be 0 years, for 2006 it would be 1 year and for 2012 it would be 7 years. What is the predicted value of y (in millions of dollars) when the year is 2012?
-
Assume you have a linear model that explains the relationship between income and credit extended, in which the value of m is 0.05 and the value of b is 10. If income is 50,000, what credit will be extended?
Select one:
- a. 500
- b. 5010
- c. 20508.4
- d. 2510
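Several questions in this quiz ask for a prediction from a fitted line y = mx + b. A minimal Java sketch of that evaluation follows, using the m, b and income values from the question above. The class and method names are hypothetical.

```java
// Sketch: evaluate a simple linear model y = m * x + b.
class LinearModel {
    static double predict(double m, double b, double x) {
        return m * x + b;
    }

    public static void main(String[] args) {
        // Values from the income/credit question: m = 0.05, b = 10, income = 50,000.
        System.out.println(predict(0.05, 10, 50_000));
    }
}
```

The same method applies to the other y = mX + b questions by substituting their m, b and X values.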
-
The income of a company that produces disaster equipment has been expressed as a linear regression model based upon the input variable, which is the number of hurricanes projected for the upcoming hurricane season. The model is expressed as Y = mX + b where Y is the estimated sales in millions of dollars, m = 0.67 and b = 8.2. Assuming that the weather service is predicting 12 hurricanes during the season, what are the sales in millions of dollars expected to be?
-
The sales of a company (in millions of dollars) for each year are shown in the table below. Identify the linear regression model in the form y=mx+b and report the values of m (slope) and b (intercept) as well as the estimated value of y when the value of x is 10.
NOTE: You should consider the value x as the elapsed time. For 2005 this would be 0 years, for 2006 it would be 1 year and for 2012 it would be 7 years. What is the value of m?
-
True/False: Shared nothing architectures distribute the processing of queries to access large volumes of data and provide near linear scalability in both storage volume and query performance.
Select one:
- True
- False
-
The sales of a company (in millions of dollars) for each year are shown in the table below. Identify the linear regression model in the form y=mx+b and report the values of m (slope) and b (intercept) as well as the estimated value of y when the value of x is 10.
NOTE: You should consider the value x as the elapsed time. For 2005 this would be 0 years, for 2006 it would be 1 year and for 2012 it would be 7 years. What is the value of b?
-
True/False: Supervised learning features both input variables or attributes and an output or predicted variable.
Select one:
- True
- False
Data Mining Quiz 3
-
A linear regression model is expressed as y ≈ β0 + β1x, where β0 is the intercept and β1 is the slope of the line. The following equations can be used to compute the values of the coefficients β0 and β1.
Using the following set of data, find the coefficients β0 and β1 rounded to the nearest thousandths place and the predicted value of y when x is 10. {(-1, 0), (0, 2), (1, 4), (2, 5)} What is the value of β0?
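The equations this question refers to are missing from the text. The sketch below assumes the standard least-squares estimates, β1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² and β0 = ȳ - β1·x̄, and applies them to the data set given in the question. Class and method names are hypothetical.

```java
// Sketch: simple linear regression by least squares (assumed standard formulas).
class SimpleLeastSquares {
    static double mean(double[] v) {
        double sum = 0.0;
        for (double d : v) {
            sum += d;
        }
        return sum / v.length;
    }

    // Returns {beta0, beta1} for y ≈ beta0 + beta1 * x.
    static double[] fit(double[] x, double[] y) {
        double xBar = mean(x);
        double yBar = mean(y);
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - xBar) * (y[i] - yBar);
            sxx += (x[i] - xBar) * (x[i] - xBar);
        }
        double beta1 = sxy / sxx;
        double beta0 = yBar - beta1 * xBar;
        return new double[] { beta0, beta1 };
    }

    static double predict(double[] beta, double x) {
        return beta[0] + beta[1] * x;
    }

    public static void main(String[] args) {
        double[] x = { -1, 0, 1, 2 };
        double[] y = { 0, 2, 4, 5 };
        double[] beta = fit(x, y);
        System.out.printf("beta0 = %.3f%n", beta[0]);
        System.out.printf("beta1 = %.3f%n", beta[1]);
        System.out.printf("y(10) = %.3f%n", predict(beta, 10));
    }
}
```

For this data set the sketch yields an intercept of 1.9, a slope of 1.7 and a prediction of 18.9 at x = 10, which agrees with the values quoted in Data Mining Quiz 2.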
-
Assume that you had a variety of data including medical history, diet, heredity factors on individuals who developed cancer and you wanted to use this data to determine whether a person is likely to develop cancer. Which technique would be the most promising to start with?
Select one:
- a. Classification
- b. Regression
- c. Clustering
- d. Estimation
-
A linear regression model is expressed as y ≈ β0 + β1x, where β0 is the intercept and β1 is the slope of the line. The following equations can be used to compute the values of the coefficients β0 and β1.
Using the following set of data, find the coefficients β0 and β1 rounded to the nearest thousandths place and the predicted value of y when x is 10. {(-1, 0), (0, 2), (1, 4), (2, 5)} What is the value of β1?
-
A linear regression model is expressed as y ≈ β0 + β1x, where β0 is the intercept and β1 is the slope of the line. The following equations can be used to compute the values of the coefficients β0 and β1.
Using the following set of data, find the coefficients β0 and β1 rounded to the nearest thousandths place and the predicted value of y when x is 10. {(-1, 0), (0, 2), (1, 4), (2, 5)} What is the value of y?
-
The following diagram represents which technique?
Select one:
- a. Linear Regression
- b. Curvilinear Regression
- c. Spline Regression
- d. Polynomial curve fitting
-
Which of the following statements will generate a multiple linear regression model within R where the output or predicted variable is sales and the predictor variables include temperature and unemploymentrate?
Select one:
- a. lm(sales~temperature+unemploymentrate)
- b. lm(temperature+unemploymentrate=sales)
- c. lm(sales+temperature~unemploymentrate)
- d. None of these commands are valid
-
When using a relational database engine as the backend for analytics processing, the acronym ______ is used to describe it.
Select one:
- a. MOLAP
- b. ROLAP
- c. OLAP
- d. RDBMS
-
True/False: A linear regression model can be used to predict categorical data values.
Select one:
- True
- False
-
When data observations are placed into specific groups according to their observed characteristics this is known as: __
Select one:
- a. Classification
- b. Decision Tree Analysis
- c. Clustering
- d. Regression
-
The names() function within R:
Select one:
- a. Lists all of the column names in the data frame provided as an argument to the function.
- b. Attaches the names to make the variables in the data frame available by name.
- c. Displays the names of the classes identified by the K means clustering algorithm.
- d. None of these answers
-
True or False: Residual plots are a useful tool for identifying non-linearity.
Select one:
- True
- False
-
You have a dataset which produces the following plot and you need to create a predictive model. Which of the following techniques are you most likely to use?
Select one:
- a. Linear Regression
- b. Curvilinear Regression
- c. K-Nearest Neighbors
- d. Logistic Regression
-
Which of the following functions is used to generate a linear regression model within R?
Select one:
- a. lredict()
- b. lm()
- c. lstat()
- d. glm()
-
True or False: The following data plot represents data that is linearly separable
Select one:
- True
- False
-
The income of a company that produces disaster equipment has been expressed as a linear regression model based upon the input variable, which is the number of hurricanes projected for the upcoming hurricane season. The model is expressed as Y = mX + b where Y is the estimated sales in millions of dollars, m = 0.76 and b = 5. Assuming that the weather service is predicting 6 hurricanes during the season, what are the sales in millions of dollars expected to be?
______ million dollars
Data Mining Quiz 2
-
True or False: In the KNN algorithm, a small value for K provides the most flexible fit (low bias/high variance).
Select one:
- True
- False
-
A farmer's yield of corn is expressed as a linear regression model based upon the input variable, which is the number of days of sunlight during the growing season. The model is expressed as Y = mX + b where Y is the estimated corn yield in bushels per acre, m = 1.38 and b = 42. Assuming that during the growing season it is predicted that there will be 67 days of sun, what will the corn yield be in bushels per acre?
______ bushels per acre
-
True or False: Linear regression is considered a non-parametric approach.
Select one:
- True
- False
-
A linear regression model is expressed as y ≈ β0 + β1x, where β0 is the intercept and β1 is the slope of the line. The following equations can be used to compute the values of the coefficients β0 and β1.
Using the following set of data, find the coefficients β0 and β1 rounded to the nearest thousandths place and the predicted value of y when x is 10.
β0 = 1.9, β1 = 1.7, y = 18.9 when x is 10
-
True or False: The fix() function identifies values that contain data within a data frame that are inconsistent and automatically corrects these values.
Select one:
- True
- False
-
True or False: Logistic regression can be used to predict a continuous variable.
Select one:
- True
- False
-
What R command could we use to generate a scatterplot diagram of our data to determine if it forms a linear pattern that would be suitable for linear regression or a non-linear pattern that would require some other technique?
Select one:
- a. plot()
- b. hist()
- c. matrix()
- d. summary()
-
Residual plots are a useful tool for identifying:
Select one:
- a. Non-linearity
- b. Linearity
- c. Polynomial relationships
- d. Non-parametric relationships
-
The values of x and their corresponding values of y are shown in the table below. Identify the linear regression model in the form y=mx+b and report the values of m (slope) and b (intercept) as well as the estimated value of y when the value of x is 10.
b = 4.48 m = 0.66 y = 6.48
-
True or False: The library() function lists all of the libraries that are loaded into memory within R.
Select one:
- True
- False
-
Which command will provide descriptive statistics for the Boston data frame?
Select one:
- a. summary(Boston)
- b. eval(Boston)
- c. coef(Boston)
- d. stats(Boston)
-
The values of x and their corresponding values of y are shown in the table below. Identify the linear regression model in the form y=mx+b and report the values of m (slope) and b (intercept) as well as the estimated value of y when the value of x is 3. Round to the nearest hundredths place.
b = 2.2 m = 0.9 y = 11.2
-
True or False: Collinearity refers to a situation in which two or more predictor variables are closely related to each other.
Select one:
- True
- False
-
Which of the following is an example of a parametric approach?
Select one:
- a. KNN Classifier
- b. Bayes Classifier
- c. Linear Regression
- d. Principal Components Analysis