Q1. The total number of missing values in the dataframe is:
a. 8
b. 3
c. 0
d. 50
Q2. The total number of duplicated rows in the dataframe is:
a. 8
b. 0
c. 2
d. 5
Q3. What is the shape of the data after dropping the feature “Unnamed: 0”, missing values and duplicated values?
a. (5000,17)
b. (5581,17)
c. (5578,17)
d. (4581,18)
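Questions 1 to 3 can be checked with a few pandas calls. The sketch below is only an illustration: the assignment data is not reproduced here, so it uses a tiny made-up dataframe. Only the “Unnamed: 0” column name comes from the question; the other columns and all values are placeholders, so the numbers it produces are not the quiz answers.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the assignment data (all values are made up).
# Row 3 is an exact copy of row 1, and row 2 has one missing age.
data = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 1, 4],
    "age": [30.0, 45.0, np.nan, 45.0, 52.0],
    "balance": [1200, 300, 150, 300, 9000],
    "deposit": ["yes", "no", "no", "no", "yes"],
})

# Q1: total number of missing values across all columns.
n_missing = int(data.isnull().sum().sum())

# Q2: number of rows that repeat an earlier row exactly.
n_duplicates = int(data.duplicated().sum())

# Q3: drop the index-like column, then missing values, then duplicates.
cleaned = (
    data.drop(columns=["Unnamed: 0"])
        .dropna()
        .drop_duplicates()
)

print(n_missing, n_duplicates, cleaned.shape)
```

On the real data, the same three steps applied to the loaded dataframe give the values asked for in Q1, Q2 and Q3.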
Q4. What is the average age of the clients who have subscribed to deposit?
a. 33
b. 41
c. 18
d. 48
Q5. What is the maximum number of contacts performed during the campaign for the clients who have subscribed to deposit?
a. 63
b. 32
c. 10
d. 21
Q6. How many unique education levels are there in the data, and how many clients have completed secondary education?
a. 4 & 745
b. 6 & 1871
c. 4 & 2717
d. 12 & 245
Q7. What is the percentage split of the categories in the column “deposit”?
a. Yes – 47% & No – 53%
b. Yes – 40% & No – 60%
c. Yes – 50% & No – 50%
d. Yes – 30% & No – 70%
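Questions 4 to 7 are filter-and-aggregate tasks. A minimal sketch follows, again on made-up data. The column names “campaign” (number of contacts) and “education”, and the lowercase “yes”/“no” labels, are assumptions about the dataset and may need adjusting.

```python
import pandas as pd

# Toy stand-in data; "campaign" and "education" are assumed column names.
data = pd.DataFrame({
    "age": [30, 40, 50, 60],
    "campaign": [1, 3, 2, 5],
    "education": ["secondary", "tertiary", "secondary", "primary"],
    "deposit": ["yes", "yes", "no", "no"],
})

# Keep only the clients who subscribed to a deposit.
subscribed = data[data["deposit"] == "yes"]

# Q4: average age of subscribers.
mean_age = float(subscribed["age"].mean())

# Q5: maximum number of contacts among subscribers.
max_contacts = int(subscribed["campaign"].max())

# Q6: number of distinct education levels, and count of "secondary".
n_levels = int(data["education"].nunique())
n_secondary = int((data["education"] == "secondary").sum())

# Q7: percentage split of the "deposit" categories.
deposit_pct = data["deposit"].value_counts(normalize=True) * 100

print(mean_age, max_contacts, n_levels, n_secondary)
print(deposit_pct.round(0))
```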
Q8. Generate a scatter plot of “age” and “balance”. Which of the interpretations given below is correct?
a. Across all ages, most clients’ average yearly balance is less than 20000 euros
b. Across all ages, most clients’ average yearly balance is greater than 20000 euros
c. As age increases, the client’s bank balance increases
d. As age decreases, the client’s bank balance decreases
Q9. How many clients with a personal loan have a housing loan as well?
a. 321
b. 397
c. 2606
d. 2254
Q10. How many unemployed clients have not subscribed to deposit?
a. 100
b. 85
c. 78
d. 92
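Questions 9 and 10 both count rows that satisfy two conditions at once. One way to do this is with boolean masks, sketched below on made-up data. The column names “job”, “loan” and “housing” and the category labels are assumptions about the dataset.

```python
import pandas as pd

# Toy stand-in data; column names and labels are assumed.
data = pd.DataFrame({
    "job": ["unemployed", "admin.", "unemployed", "technician"],
    "loan": ["yes", "yes", "no", "yes"],
    "housing": ["yes", "no", "yes", "yes"],
    "deposit": ["no", "yes", "yes", "no"],
})

# Q9: clients with a personal loan AND a housing loan.
both_loans = int(((data["loan"] == "yes") & (data["housing"] == "yes")).sum())

# Q10: unemployed clients who did NOT subscribe to a deposit.
unemployed_no_deposit = int(
    ((data["job"] == "unemployed") & (data["deposit"] == "no")).sum()
)

# A cross-tabulation shows every loan/housing combination in one table.
loan_by_housing = pd.crosstab(data["loan"], data["housing"])

print(both_loans, unemployed_no_deposit)
print(loan_by_housing)
```

The cross-tabulation is optional, but it makes it easy to double-check the count read off the mask.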
Q11. The command used to convert the categorical variables to indicator variables is
a. pandas.get_indicator(data)
b. pandas.get_dummies(data)
c. pandas.reshape(data)
d. pandas.reshape.module(data)
Q12. The code below builds a list of the unique column names, excluding the ‘deposit’ column. Fill in the two blanks, in order (1st blank, 2nd blank), with the appropriate data types from the options.
features = _(_(data.columns) - set(['deposit']))
a. set and list
b. set and set
c. list and list
d. list and set
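The expression in Q12 relies on set difference: the column names are turned into a set, the target name is subtracted, and the result is turned back into a list. A small sketch with placeholder column names is shown below. Note that a set has no guaranteed order, so an order-preserving alternative is included as well.

```python
import pandas as pd

# Empty toy dataframe; only the column names matter here (placeholders).
data = pd.DataFrame(columns=["age", "balance", "deposit"])

# Set difference removes 'deposit'; list() makes the result indexable.
features = list(set(data.columns) - set(["deposit"]))

# Sets are unordered, so sort if a stable order is needed.
features_sorted = sorted(features)

# Alternative that keeps the original column order.
features_ordered = [col for col in data.columns if col != "deposit"]

print(features_sorted)
print(features_ordered)
```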
Q13. The command to generate predictions from the logistic regression model ‘model’ on the test dataset (test) is
a. model.fit(test)
b. model.prediction(test)
c. model.LogisticRegression(test)
d. model.predict(test)
Q14. What is the value of accuracy of the model on the test dataset? (Choose the appropriate range)
a. 20% to 60%
b. 10% to 70%
c. 71% to 85%
d. 86% to 100%
Q15. What is the value of accuracy of the on the test dataset? (Choose the appropriate range)
a. 20% to 40%
b. 41% to 60%
c. 61% to 80%
d. 81% to 100%
Answers:
Q1. b. 3
Q2. c. 2
Q3. c. (5578,17)
Q4. b. 41
Q5. b. 32
Q6. c. 4 & 2717
Q7. a. Yes – 47% & No – 53%
Q8. a. Across all ages, most clients’ average yearly balance is less than 20000 euros
Q9. b. 397
Q10. c. 78
Q11. b. pandas.get_dummies(data)
Q12. d. list and set
Q13. d. model.predict(test)
Q14. c. 71% to 85%
Q15. c. 61% to 80%
Disclaimer: These answers are provided only as a reference for students. This website does not guarantee that they are 100% correct, and urges you to complete your assignment yourself.