Q1. The total number of missing values in the dataframe is:
a. 8
b. 10
c. 0
d. 50
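A minimal sketch of how a count like this could be obtained, assuming pandas. The frame `toy` and its values are hypothetical stand-ins, not the quiz dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the quiz's actual dataframe is not reproduced here.
toy = pd.DataFrame({
    "age": [30, np.nan, 45],
    "job": ["admin.", "services", None],
})

# isnull() marks each missing cell as True; sum() counts them per column.
missing_per_column = toy.isnull().sum()

# Summing the per-column counts gives the total for the whole dataframe.
total_missing = int(missing_per_column.sum())
print(total_missing)
```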
Q2. The total number of duplicated values in the dataframe is:
a. 0
b. 50
c. 15
d. 21
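A possible way to count duplicates, assuming "duplicated values" means fully repeated rows. The frame `toy` is a hypothetical example, not the quiz dataset.

```python
import pandas as pd

# Hypothetical toy data with one repeated row.
toy = pd.DataFrame({
    "age": [30, 30, 45],
    "job": ["admin.", "admin.", "services"],
})

# duplicated() flags every row that repeats an earlier row;
# the first occurrence itself is not flagged.
duplicate_rows = int(toy.duplicated().sum())
print(duplicate_rows)
```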
Q3. What is the shape of the data after dropping the feature “Unnamed: 0”, missing values and duplicated values?
a. (5000,17)
b. (5581,17)
c. (5580,18)
d. (4581,18)
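The cleaning steps named in this question can be sketched as a short pandas chain. The toy frame below is hypothetical and only borrows the column name “Unnamed: 0” from the question; its resulting shape has nothing to do with the quiz answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a leftover index column, one missing value, one repeated row.
toy = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "age": [30.0, 30.0, np.nan, 45.0],
    "deposit": ["yes", "yes", "no", "no"],
})

cleaned = (
    toy.drop(columns=["Unnamed: 0"])  # remove the leftover index column
       .dropna()                      # remove rows with any missing value
       .drop_duplicates()             # remove repeated rows, keeping the first
)

# shape is a (rows, columns) tuple.
print(cleaned.shape)
```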
Q4. What is the average age of the clients who have subscribed to a deposit?
a. 39
b. 49
c. 32
d. 41
Q5. What is the maximum number of contacts performed during the campaign for the clients who have not subscribed to a deposit?
a. 63
b. 32
c. 10
d. 5
Q6. What is the difference between the maximum balance (in euros) of the clients who have subscribed to a deposit and that of the clients who have not?
a. 1747
b. 1514
c. 24373
d. 75054
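Q4 to Q6 all ask for a statistic computed separately for subscribers and non-subscribers, which is one natural use of `groupby`. The sketch below uses hypothetical toy values, and the column name `campaign` for the number of contacts is an assumption; “age”, “balance” and “deposit” are taken from the questions.

```python
import pandas as pd

# Hypothetical toy data; "campaign" as the contacts column is an assumed name.
toy = pd.DataFrame({
    "deposit":  ["yes", "yes", "no", "no", "no"],
    "age":      [36, 40, 30, 35, 55],
    "campaign": [1, 3, 2, 6, 4],
    "balance":  [1000, 5000, 200, 3000, 800],
})

by_deposit = toy.groupby("deposit")

# Q4-style: mean age of clients whose deposit value is "yes".
mean_age_yes = by_deposit["age"].mean()["yes"]

# Q5-style: largest number of contacts among clients with deposit "no".
max_contacts_no = by_deposit["campaign"].max()["no"]

# Q6-style: gap between the two groups' maximum balances.
max_balance = by_deposit["balance"].max()
balance_gap = max_balance["yes"] - max_balance["no"]

print(mean_age_yes, max_contacts_no, balance_gap)
```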
Q7. How many unique job levels are there in the data, and how many clients are in the management level?
a. 10 & 1318
b. 12 & 3134
c. 13 & 2000
d. 12 & 1318
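Both parts of this question can be read off a category column with `nunique()` and `value_counts()`. This is a sketch on hypothetical data; the column name `job` and the label `management` are assumptions about how the dataset encodes job levels.

```python
import pandas as pd

# Hypothetical toy data; column name and labels are assumptions.
toy = pd.DataFrame({
    "job": ["management", "admin.", "management",
            "technician", "admin.", "management"],
})

# Number of distinct job levels.
unique_jobs = toy["job"].nunique()

# Number of clients per job level, then the management count.
job_counts = toy["job"].value_counts()
management_clients = int(job_counts["management"])

print(unique_jobs, management_clients)
```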
Q8. What is the percentage split of the categories in the column “deposit”?
a. Yes - 47% & No - 53%
b. Yes - 40% & No - 60%
c. Yes - 50% & No - 50%
d. Yes - 30% & No - 70%
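A percentage split of this kind can be computed with `value_counts(normalize=True)`. The values below are hypothetical and chosen so that they match none of the options.

```python
import pandas as pd

# Hypothetical toy data, not the quiz dataset.
toy = pd.DataFrame({"deposit": ["yes", "no", "no", "no"]})

# normalize=True returns fractions that sum to 1; multiply by 100 for percentages.
split = toy["deposit"].value_counts(normalize=True) * 100

print(split)
```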
Q9. Generate a scatter plot of “age” vs “balance”. Which of the following interpretations is correct?
a. Across all ages, most clients’ average yearly balance is greater than 20000 euros
b. As age increases, the client’s bank balance increases
c. Across all ages, most clients’ average yearly balance is less than 20000 euros
d. As age decreases, the client’s bank balance decreases
Q10. How many unemployed clients have subscribed to a deposit?
a. 100
b. 85
c. 78
d. 92
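Counting rows that satisfy two conditions at once is typically done with a combined boolean mask. In this sketch the column name `job` and the label `unemployed` are assumptions, and the data is hypothetical.

```python
import pandas as pd

# Hypothetical toy data; column name and labels are assumptions.
toy = pd.DataFrame({
    "job":     ["unemployed", "unemployed", "management", "unemployed"],
    "deposit": ["yes", "no", "yes", "yes"],
})

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than ==.
mask = (toy["job"] == "unemployed") & (toy["deposit"] == "yes")

# True counts as 1, so the sum is the number of matching rows.
unemployed_subscribers = int(mask.sum())
print(unemployed_subscribers)
```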
Q11. The command used to convert the categorical variables to indicator variables is
a. pandas.get_indicator(data)
b. pandas.get_dummies(data)
c. pandas.reshape(data)
d. pandas.reshape.module(data)
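As an illustration of what `pandas.get_dummies` does, the sketch below encodes one categorical column of a hypothetical frame. The column name `marital` is an assumption used only for the example.

```python
import pandas as pd

# Hypothetical toy data with one numeric and one categorical column.
toy = pd.DataFrame({
    "age": [30, 45],
    "marital": ["single", "married"],
})

# Each category becomes its own indicator column named <column>_<category>;
# numeric columns are passed through unchanged.
encoded = pd.get_dummies(toy)

print(list(encoded.columns))
```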
Q12. The code below builds a list of the unique column names, excluding the ‘deposit’ column. Choose the option that fills the two blanks, in order (1st blank, 2nd blank), with the appropriate data types.
features = ____(____(data.columns) - set(['deposit']))
a. set and list
b. set and set
c. list and list
d. list and set
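The pattern in this question relies on set difference, which is defined for sets but not for lists. A small sketch on a hypothetical empty frame (the column names are illustrative) shows the idea:

```python
import pandas as pd

# Hypothetical frame with only column names; no rows are needed here.
data = pd.DataFrame(columns=["age", "job", "balance", "deposit"])

# Converting to sets allows the "-" (difference) operator;
# the outer call turns the result back into a list.
features = list(set(data.columns) - set(["deposit"]))

# Sets are unordered, so sort if a stable column order matters.
features_sorted = sorted(features)
print(features_sorted)
```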
Q13. The command to generate predictions from the logistic regression model ‘model’ on the test dataset (test) is
a. model.fit(test)
b. model.prediction(test)
c. model.predict(test)
d. model.LogisticRegression(test)
Q14. What is the value of accuracy of the model on the test dataset? (Choose the appropriate range)
a. 20% to 60%
b. 61% to 70%
c. 71% to 90%
d. 91% to 100%
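Q13 and Q14 together describe a fit, predict, score workflow. The sketch below assumes scikit-learn and uses a tiny hypothetical dataset; the names `train`, `train_target` and `test_target` are illustrative, and the resulting accuracy says nothing about the quiz's actual model.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical toy data: one numeric feature and a binary target.
train = pd.DataFrame({"balance": [0, 1, 2, 3, 10, 11, 12, 13]})
train_target = [0, 0, 0, 0, 1, 1, 1, 1]
test = pd.DataFrame({"balance": [1, 12]})
test_target = [0, 1]

# fit() learns the coefficients from the training data.
model = LogisticRegression()
model.fit(train, train_target)

# predict() returns one class label per row of the test features.
predictions = model.predict(test)

# accuracy_score is the fraction of predictions that match the true labels.
accuracy = accuracy_score(test_target, predictions)
print(accuracy)
```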
Answers:
Q1. c. 0
Q2. a. 0
Q3. b. (5581,17)
Q4. d. 41
Q5. a. 63
Q6. c. 24373
Q7. d. 12 & 1318
Q8. a. Yes - 47% & No - 53%
Q9. c. Across all ages, most clients’ average yearly balance is less than 20000 euros
Q10. d. 92
Q11. b. pandas.get_dummies(data)
Q12. d. list and set
Q13. c. model.predict(test)
Q14. c. 71% to 90%
Disclaimer: These answers are provided only as a reference for students. This website does not guarantee that they are 100% correct, and urges you to complete your assignment yourself.