Reinforcement Learning Multiple-Choice Questions (MCQs)
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
Reinforcement Learning MCQs: This section contains multiple-choice questions and answers on the various topics of Reinforcement Learning. Practice these MCQs to test and enhance your skills on Reinforcement Learning.
List of Reinforcement Learning MCQs
1. Reinforcement learning is a ____
- Prediction-based learning technique
- Feedback-based learning technique
- History results-based learning technique
Answer: B) Feedback-based learning technique
Explanation:
Reinforcement learning is a feedback-based learning technique.
2. How many types of feedback does reinforcement learning provide?
- 1
- 2
- 3
- 4
Answer: B) 2
Explanation:
Reinforcement learning gives two types of feedback: positive and negative.
3. Which kind of data does reinforcement learning use?
- Labeled data
- Unlabelled data
- None
- Both
Answer: C) None
Explanation:
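Reinforcement learning does not train on a pre-collected dataset, labeled or unlabeled. The agent generates its own experience by interacting with the environment.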
4. Reinforcement learning methods learn through ____?
- Experience
- Predictions
- Analyzing the data
Answer: A) Experience
Explanation:
Reinforcement learning learns through experience.
5. How many types of machine learning are there?
- 2
- 3
- 4
- 5
Answer: C) 4
Explanation:
There are four types of machine learning: supervised, unsupervised, semi-supervised, and reinforcement.
6. Which of the following is a practical example of reinforcement learning?
- House pricing prediction
- Market basket analysis
- Text classification
- Driverless cars
Answer: D) Driverless cars
Explanation:
Driverless cars are a commonly cited application of reinforcement learning, whereas the other options are typical supervised or unsupervised learning tasks.
7. What is an agent in reinforcement learning?
- Agent is the situation in which rewards are being exchanged
- Agent is the simple value in reinforcement learning.
- An agent is an entity that explores the environment.
Answer: C) An agent is an entity that explores the environment.
Explanation:
An agent is an entity that explores the environment.
8. What is the environment in reinforcement learning?
- Environment is a situation that is based on the current state
- Environment is a situation in which an agent is present.
- Environment is similar to feedback
- Environment is a situation that the agent returns as a result.
Answer: B) Environment is a situation in which an agent is present.
Explanation:
Environment is a situation in which an agent is present.
9. What are actions in reinforcement learning?
- Actions are the moves that the agent takes inside the environment.
- Actions are the function that the environment takes.
- Actions are the feedback that an agent provides.
Answer: A) Actions are the moves that the agent takes inside the environment.
Explanation:
Actions are the moves that the agent takes inside the environment.
10. What is a state in reinforcement learning?
- State is a situation in which an agent is present.
- A state is the simple value of reinforcement learning.
- A state is a result returned by the environment after an agent takes an action.
Answer: C) A state is a result returned by the environment after an agent takes an action.
Explanation:
A state is a result returned by the environment after an agent takes an action.
11. What are rewards in reinforcement learning?
- An agent's action is evaluated based on feedback returned from the environment.
- Environment gives value in return which is known as a reward.
- A reward is a result returned by the environment after an agent takes an action.
Answer: A) An agent's action is evaluated based on feedback returned from the environment.
Explanation:
The feedback returned from the environment, by which an agent's action is evaluated, is known as the reward.
12. What is the Policy in reinforcement learning?
- The agent's policy determines what environment model should be decided
- The agent's policy determines what action to take based on the current state.
- The agent's policy determines what the state reward would be.
Answer: B) The agent's policy determines what action to take based on the current state.
Explanation:
The agent's policy determines what action to take based on the current state.
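The terms from questions 7 to 12 (agent, environment, action, state, reward, policy) can be tied together in a short interaction loop. The following is only a minimal sketch: the `CorridorEnv` class, its reward values, and the `always_right_policy` function are hypothetical names invented for illustration and do not come from any library.

```python
class CorridorEnv:
    """Hypothetical toy environment: positions 0..4, goal at position 4."""

    def __init__(self):
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # initial state

    def step(self, action):
        # action is -1 (left) or +1 (right); position is clipped to [0, 4]
        self.position = max(0, min(4, self.position + action))
        done = self.position == 4
        reward = 1.0 if done else -0.1  # feedback from the environment
        return self.position, reward, done


def always_right_policy(state):
    """A policy maps the current state to an action."""
    return +1


def run_episode(env, policy, max_steps=20):
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent chooses an action
        state, reward, done = env.step(action)  # environment returns state and reward
        total_reward += reward
        if done:
            break
    return total_reward


total = run_episode(CorridorEnv(), always_right_policy)
```

Here the agent reaches the goal in four steps, collecting three small penalties and one final reward.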
13. Does reinforcement learning follow the hit-and-try (trial-and-error) method?
- Yes
- No
Answer: A) Yes
Explanation:
Yes, reinforcement learning follows the hit-and-try (trial-and-error) method: the agent tries actions and learns from the feedback it receives.
14. In how many ways can you implement reinforcement learning?
- 2
- 3
- 4
- 5
Answer: B) 3
Explanation:
Reinforcement learning can be implemented in three ways:
- Value-based
- Policy-based
- Model-based
15. In which of the following approaches to reinforcement learning do we find the optimal value function?
- Value-based
- Policy-based
- Model-based
Answer: A) Value-based
Explanation:
In a Value-based approach to reinforcement learning, we find the optimal value function.
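One classic way to find an optimal value function when the environment's dynamics are known is value iteration. The sketch below is illustrative only: the three-state chain, the action names, and the reward of 1 for reaching the last state are assumptions made up for this example.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Return the optimal state values V*(s) for a small deterministic MDP."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if not actions(s):  # terminal state: value stays 0
                continue
            # Bellman optimality backup: best one-step reward plus discounted value
            best = max(reward(s, a) + gamma * values[transition(s, a)]
                       for a in actions(s))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Hypothetical 3-state chain: 0 -> 1 -> 2 (terminal); entering state 2 pays 1.
states = [0, 1, 2]
actions = lambda s: [] if s == 2 else ["stay", "right"]
transition = lambda s, a: s + 1 if a == "right" else s
reward = lambda s, a: 1.0 if (a == "right" and s == 1) else 0.0

optimal_values = value_iteration(states, actions, transition, reward)
```

State 1 is one step from the reward and state 0 is two steps away, so its value is discounted once by `gamma`.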
16. How many types of policy-based approaches are there in reinforcement learning?
- 1
- 2
- 3
- 4
Answer: B) 2
Explanation:
There are two types of policy-based approaches:
- Deterministic
- Stochastic
17. In which of the following approaches to reinforcement learning is a virtual model created for the environment?
- Value-based
- Policy-based
- Model-based
Answer: C) Model-based
Explanation:
In the model-based approach to reinforcement learning, a virtual model is created for the environment.
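As a rough illustration of what building a model of the environment can mean, the sketch below estimates transition probabilities from observed transitions by counting. The function name `estimate_model` and the sample data are hypothetical, and this is only one simple way a model might be built.

```python
from collections import Counter, defaultdict


def estimate_model(transitions):
    """Estimate P(next_state | state, action) from observed transitions."""
    counts = defaultdict(Counter)
    for state, action, next_state in transitions:
        counts[(state, action)][next_state] += 1
    model = {}
    for key, next_counts in counts.items():
        total = sum(next_counts.values())
        # Relative frequency of each next state for this (state, action) pair
        model[key] = {s2: n / total for s2, n in next_counts.items()}
    return model


# Made-up experience: (state, action, next_state) triples
observed = [("A", "go", "B"), ("A", "go", "B"), ("A", "go", "C"), ("B", "go", "C")]
model = estimate_model(observed)
```

Once such a model exists, the agent can plan with it, for example by simulating transitions instead of acting in the real environment.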
18. ____ is a synonym for random and probabilistic.
- Deterministic
- Stochastic
Answer: B) Stochastic
Explanation:
Stochastic is a synonym for random and probabilistic.
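The contrast between deterministic and stochastic behavior can be sketched with two toy policies. The action names, the state threshold, and the 0.2/0.8 probabilities below are illustrative assumptions only.

```python
import random


def deterministic_policy(state):
    """Always returns the same action for a given state."""
    return "right" if state < 3 else "left"


def stochastic_policy(state, rng):
    """Samples an action from a probability distribution over actions."""
    probabilities = {"left": 0.2, "right": 0.8}  # illustrative values only
    actions = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [stochastic_policy(0, rng) for _ in range(1000)]
right_fraction = samples.count("right") / len(samples)
```

The deterministic policy gives the same answer every time for a given state, while the stochastic policy returns "right" in roughly 80% of the samples.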
19. How many elements does reinforcement learning consist of?
- 2
- 3
- 4
- 5
Answer: C) 4
Explanation:
Reinforcement learning has four main elements:
- Policy
- Reward Signal
- Value Function
- Model of the environment
20. The agent's main objective is to ____ the total number of rewards for good actions.
- Minimize
- Maximize
- Null
Answer: B) Maximize
Explanation:
The agent's main objective is to maximize the total number of rewards for good actions.
21. The goal of a reinforcement learning problem is defined by the ____?
- Policy
- Reward Signal
- Value Function
- Model of the environment
Answer: B) Reward Signal
Explanation:
The goal of a reinforcement learning problem is defined by the reward signal.
22. Which element in reinforcement learning defines the behavior of the agent?
- Policy
- Reward Signal
- Value Function
- Model of the environment
Answer: A) Policy
Explanation:
The policy element in reinforcement learning defines the behavior of the agent.
23. Can reward signals change the policy?
- Yes
- No
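Answer: A) Yes
Explanation:
Yes, the reward signal can change the policy. If an action selected by the policy is followed by a low reward, the policy may be changed to select a different action in that situation.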
24. Which of the following elements of reinforcement learning specifies the reward that an agent can expect?
- Policy
- Reward Signal
- Value Function
- Model of the environment
Answer: C) Value Function
Explanation:
The value function specifies the total reward that the agent can expect to accumulate in the future, starting from a given state.
25. Which of the following elements of reinforcement learning imitates the behavior of the environment?
- Policy
- Reward Signal
- Value Function
- Model of the environment
Answer: D) Model of the environment
Explanation:
The model imitates the behavior of the environment.
26. The approach in which reinforcement learning problems are solved with the help of models is known as ____?
- Model-based approach
- Model-free approach
- Model known approach
Answer: A) Model-based approach
Explanation:
The approach in which reinforcement learning problems are solved with the help of models is known as the model-based approach.
27. Who introduced the Bellman equation?
- Richard Ernest Bellman
- Alfonso Shimbel
- Edsger W. Dijkstra
Answer: A) Richard Ernest Bellman
Explanation:
Richard Ernest Bellman introduced the Bellman equation.
28. Gamma (γ) in the Bellman equation is known as?
- Value factor
- Discount factor
- Environment factor
Answer: B) Discount factor
Explanation:
Gamma (γ) in the Bellman equation is known as the discount factor. It determines how much future rewards count relative to immediate rewards.
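The effect of the discount factor can be seen by computing a discounted return, where the reward received k steps in the future is weighted by γ to the power k. This is a minimal sketch; the function name `discounted_return` and the sample rewards are chosen for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward k steps ahead is weighted by gamma**k."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


# Three rewards of 1 with gamma = 0.5 give 1 + 0.5 + 0.25
g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

With γ close to 0 the agent mostly cares about immediate rewards; with γ close to 1 later rewards count almost as much as immediate ones.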
29. How many types of reinforcement learning are there?
- 3
- 4
- 2
- 5
Answer: C) 2
Explanation:
There are two types of reinforcement learning:
- Positive Reinforcement
- Negative Reinforcement
30. In which of the following types of reinforcement learning do we add something that increases the likelihood of repeating expected behavior?
- Positive Reinforcement
- Negative Reinforcement
Answer: A) Positive Reinforcement
Explanation:
In positive reinforcement, we add something that increases the likelihood of repeating the expected behavior.
31. How do you represent the agent state in reinforcement learning?
- Discount state
- Discount factor
- Markov state
Answer: C) Markov state
Explanation:
The agent state in reinforcement learning is represented as a Markov state.
32. Consider the condition $P[S_{t+1} \mid S_t] = P[S_{t+1} \mid S_1, \ldots, S_t]$.
What is the meaning of $S_t$?
- State factor
- Discount factor
- Markov state
Answer: C) Markov state
Explanation:
In the condition $P[S_{t+1} \mid S_t] = P[S_{t+1} \mid S_1, \ldots, S_t]$, $S_t$ represents the Markov state: the next state depends only on the current state, not on the earlier history.
33. What do you mean by MDP in reinforcement learning?
- Markov discount procedure
- Markov discount process
- Markov deciding procedure
- Markov decision process
Answer: D) Markov decision process
Explanation:
MDP stands for Markov decision process.
34. Why do we use MDP in reinforcement learning?
- We use MDP to formalize the reinforcement learning problems.
- We use MDP to predict reinforcement learning problems.
- We use MDP to analyze the reinforcement learning problems.
Answer: A) We use MDP to formalize the reinforcement learning problems.
Explanation:
We use MDP to formalize the reinforcement learning problems.
35. An MDP is a tuple of how many elements?
- 2
- 3
- 4
- 5
Answer: C) 4
Explanation:
An MDP is a 4-tuple (S, A, P_a, R_a):
- A finite set of states S
- A finite set of actions A
- The reward R_a(s, s') received after transitioning from state s to state s' due to action a
- The probability P_a(s, s') that action a taken in state s leads to state s'
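The four components listed above can be written down as a small data structure. The sketch below is one possible representation, not a standard API: the `MDP` class, its field names, and the two-state "cold"/"hot" example with its numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MDP:
    states: set
    actions: set
    # transition[(s, a)] maps each next state s' to the probability of reaching it
    transition: dict
    # reward[(s, a, s')] is the reward for moving from s to s' under action a
    reward: dict

    def is_valid(self):
        """Check that outgoing probabilities sum to 1 for every (s, a) pair."""
        return all(abs(sum(dist.values()) - 1.0) < 1e-9
                   for dist in self.transition.values())


# Hypothetical two-state example with made-up probabilities and rewards
mdp = MDP(
    states={"cold", "hot"},
    actions={"heat", "wait"},
    transition={
        ("cold", "heat"): {"hot": 0.9, "cold": 0.1},
        ("cold", "wait"): {"cold": 1.0},
        ("hot", "heat"): {"hot": 1.0},
        ("hot", "wait"): {"cold": 0.5, "hot": 0.5},
    },
    reward={("cold", "heat", "hot"): 1.0},
)
```

Calling `mdp.is_valid()` confirms that each row of the transition table is a proper probability distribution.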
36. Which of the following is a model-free, off-policy reinforcement learning algorithm that finds the best course of action based on the agent's current state?
- Q-learning
- Markov property
- State action reward state action
- Deep Q neural network
Answer: A) Q-learning
Explanation:
Q-learning is a model-free, off-policy reinforcement learning algorithm that finds the best course of action based on the agent's current state.
37. What do you mean by SARSA in reinforcement learning?
- State action reward state action
- State achievement rewards state action
- State act reward achievement
- State act reward act
Answer: A) State action reward state action
Explanation:
SARSA stands for State action reward state action.
38. ___ is the policy that an agent is trying to learn?
- behavior policy
- Target policy
- On-policy
- Off-policy
Answer: B) Target policy
Explanation:
A target policy is a type of policy that an agent is trying to learn.
39. ____ is the policy that an agent uses for action selection?
- behavior policy
- Target policy
- On-policy
- Off-policy
Answer: A) behavior policy
Explanation:
Behavior policy is used by an agent for action selection.
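A common choice of behavior policy is epsilon-greedy action selection, which mostly picks the best-known action but sometimes explores at random. The sketch below assumes a dictionary of action values for a single state; the function name `epsilon_greedy` and the sample values are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)          # explore
    return max(actions, key=q_values.get)   # exploit


q_values = {"left": 0.1, "right": 0.7}  # hypothetical action values for one state
rng = random.Random(42)
greedy_action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)
```

With `epsilon=0.0` the function never explores and always returns the action with the highest value, which here is "right".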
40. Which of the following is a type of learning algorithm in which the same policy is both evaluated and improved?
- behavior policy
- Target policy
- On-policy
- Off-policy
Answer: C) On-policy
Explanation:
On-policy learning is a type of learning algorithm in which the same policy is both evaluated and improved.
41. Which of the following is a type of learning algorithm that evaluates and improves a policy different from the policy used for action selection?
- behavior policy
- Target policy
- On-policy
- Off-policy
Answer: D) Off-policy
Explanation:
Off-policy learning is a type of learning algorithm that evaluates and improves a policy different from the policy used for action selection.
42. Between on-policy and off-policy learning, in which is the target policy not equal to the behavior policy?
- On-policy
- Off-policy
Answer: B) Off-policy
Explanation:
In an off-policy learning algorithm, the target policy is not equal to the behavior policy.
43. Between on-policy and off-policy learning, in which is the target policy equal to the behavior policy?
- On-policy
- Off-policy
Answer: A) On-policy
Explanation:
In an on-policy learning algorithm, the target policy is equal to the behavior policy.
44. Is Q-learning an on-policy or an off-policy learning algorithm?
- On-policy
- Off-policy
Answer: B) Off-policy
Explanation:
Q-learning is an off-policy learning algorithm.
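The off-policy nature of Q-learning shows up in its update rule, which bootstraps from the best action in the next state regardless of which action the agent actually takes next. The following is a minimal sketch of one tabular update; the nested-dictionary Q-table, the state and action names, and the numeric values are assumptions for illustration.

```python
def q_learning_update(q, state, action, reward, next_state, alpha, gamma):
    """One Q-learning step: bootstrap from the best action in next_state."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical Q-values: q[state][action]
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
new_value = q_learning_update(q, "s0", "right", reward=1.0, next_state="s1",
                              alpha=0.5, gamma=0.9)
```

Here the target is 1.0 + 0.9 × 2.0 = 2.8, so the updated value moves halfway from 0.0 toward 2.8.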
45. Is SARSA an on-policy or an off-policy learning algorithm?
- On-policy
- Off-policy
Answer: A) On-policy
Explanation:
SARSA is an on-policy learning algorithm.
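SARSA's on-policy nature can be seen in its update rule, which uses the value of the action the agent actually selects in the next state (state, action, reward, state, action). Below is a minimal sketch of one tabular update; the nested-dictionary Q-table and all names and numbers are illustrative assumptions.

```python
def sarsa_update(q, state, action, reward, next_state, next_action, alpha, gamma):
    """One SARSA step: bootstrap from the action actually chosen in next_state."""
    td_target = reward + gamma * q[next_state][next_action]
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical Q-values: q[state][action]
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
new_value = sarsa_update(q, "s0", "right", reward=1.0, next_state="s1",
                         next_action="left", alpha=0.5, gamma=0.9)
```

Because the agent chose "left" in the next state, the target is 1.0 + 0.9 × 1.0 = 1.9 rather than the value of the best next action.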
46. What is DQN in reinforcement learning?
- Dynamic Q-learning network
- Dynamic Q-neural network
- Deep Q-neural network
Answer: C) Deep Q-neural network
Explanation:
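DQN stands for Deep Q-Network (listed here as "Deep Q-neural network"): a Q-learning variant that uses a deep neural network to approximate the Q-function.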
47. Which of the following correctly states the difference between Q-learning and SARSA?
- In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal
- In comparison to QL, SARSA directly learns the optimal policy, whereas QL learns a policy that is "near" the optimal.
Answer: A) In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal
Explanation:
In comparison to SARSA, Q-learning (QL) directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal while it continues to explore.
48. Which of the following gives the better final performance?
- QL
- SARSA
Answer: A) QL
Explanation:
Q-learning (QL) generally gives a better final performance, since it learns the values of the optimal policy directly.
49. Which of the following is faster?
- QL
- SARSA
Answer: B) SARSA
Explanation:
SARSA is faster.
50. Is Q-learning a model-free or a model-based learning algorithm?
- Model-free
- Model-based
Answer: A) Model-free
Explanation:
Q-learning is a model-free learning algorithm.
51. What does Q stand for in Q-learning?
- Quality
- Query
- Quantify
- Quick
Answer: A) Quality
Explanation:
In Q-learning "Q" stands for quality.
52. The matrix created during the Q-learning algorithm is commonly known as ____?
- Query-table
- Q-table
- Quick-matrix
- Table
Answer: B) Q-table
Explanation:
The matrix created during the Q-learning algorithm is commonly known as the Q-table. It has one row per state and one column per action.
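A Q-table can be sketched as a plain list of rows, with one row per state and one column per action. The helper names `make_q_table` and `greedy_policy` and the sample values below are hypothetical and only meant to show the shape of the data.

```python
def make_q_table(n_states, n_actions):
    """Q-table as a list of rows: one row per state, one column per action."""
    return [[0.0] * n_actions for _ in range(n_states)]


def greedy_policy(q_table):
    """For each state, pick the column index with the highest Q-value."""
    return [max(range(len(row)), key=row.__getitem__) for row in q_table]


q_table = make_q_table(n_states=3, n_actions=2)
q_table[0][1] = 0.5   # hypothetical learned values
q_table[2][0] = 0.3
policy = greedy_policy(q_table)
```

Reading off the highest entry in each row gives the greedy action for that state; ties, as in the all-zero row, resolve to the first action.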
53. Does reinforcement learning require any previous training?
- Yes
- No
Answer: B) No
Explanation:
No, reinforcement learning does not require any previous training; the agent learns from its own interaction with the environment.
54. Which equation is Q-learning based on?
- Naïve bayes equation
- KNN-equation
- Bellman-equation
Answer: C) Bellman-equation
Explanation:
Q-learning is based on the Bellman equation.
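For reference, the relationship can be written out as follows. The notation is an assumption layered on top of this quiz: $Q^*$ is the optimal action-value function, $\alpha$ is a learning rate, $\gamma$ is the discount factor from question 28, $r$ is the reward, and $s'$, $a'$ are the next state and a candidate next action.

```latex
% Bellman optimality equation for action values
Q^*(s, a) = \mathbb{E}\left[\, r + \gamma \max_{a'} Q^*(s', a') \;\middle|\; s, a \,\right]

% Q-learning update: move Q(s, a) toward a sampled version of the right-hand side
Q(s, a) \leftarrow Q(s, a) + \alpha \left[\, r + \gamma \max_{a'} Q(s', a') - Q(s, a) \,\right]
```

The update replaces the expectation in the first equation with a single observed transition and takes a step of size $\alpha$ toward it.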