Free E20-007 Questions for EMC Data Science and Big Data Analytics E20-007 Exam as PDF & Practice Test Engine

  • Exam Code/Number: E20-007
  • Exam Name/Title: Data Science and Big Data Analytics
  • Certification Provider: EMC
  • Corresponding Certification: Data Scientist
  • Exam Questions: 165
  • Updated On: Jun 05, 2026
Which data type value is used for the observed response variable in a logistic regression model?
Correct Answer: D Vote an answer
Refer to the exhibit.

In the exhibit, a correlogram is provided based on an autocorrelation analysis of a sample dataset.
What can you conclude based only on this exhibit?
Correct Answer: D Vote an answer
You have been assigned to do a study of the daily revenue effect of a pricing model of online transactions. When have you completed the analytics lifecycle?
Correct Answer: D Vote an answer
In which phase of the data analytics lifecycle do Data Scientists spend the most time in a project?
Correct Answer: C Vote an answer
Which of the following is an example of quasi-structured data?
Correct Answer: B Vote an answer
In which phase of the analytic lifecycle would you expect to spend most of the project time?
Correct Answer: D Vote an answer
When creating a project sponsor presentation, what is the main objective?
Correct Answer: C Vote an answer
You have been assigned to run a logistic regression model for each of 100 countries, and all the data is currently stored in a PostgreSQL database. Which tool/library would you use to produce these models with the least effort?
Correct Answer: D Vote an answer
You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters. When you create a pair-wise plot of the clusters, you notice that there is significant overlap between the clusters. What should you do?
Correct Answer: B Vote an answer
Which ROC curve represents a perfect model fit?
A)

B)

C)

D)
Correct Answer: A Vote an answer
You are building a logistic regression model to predict whether a tax filer will be audited within the next two years. Your training set population is 1000 filers. The audit rate in your training data is 4.2%. What is the sum of the probabilities that the model assigns to all the filers in your training set that have been audited?
Correct Answer: A Vote an answer
Refer to the exhibit.

Click on the calculator icon in the upper left corner. You are going into a meeting where you know your manager will have a question on your dataset -- specifically relating to customers that are classified as renters with good credit status.
In order to prepare for the meeting, you create a rule: RENTER => GOOD CREDIT. What is the confidence of the rule?
Correct Answer: A Vote an answer
0
0
0
10