2.1.3. Significance of Hypothesis Generation
What is Hypothesis Generation?
Hypothesis generation is the process of making educated guesses about the various factors that affect the business problem to be solved using machine learning.
When framing a hypothesis, the data scientist should not yet know its outcome: the hypothesis is written down before any evidence has been examined. Hypothesis generation is a crucial step in any data science project.
Why should you care about it?
It provides a framework for scientific inquiry: Hypotheses provide a starting point for scientific inquiry and help to guide the research process. Without a hypothesis, researchers may not know where to begin or what questions to ask.
It helps to focus research efforts: Hypotheses help to focus research efforts on specific questions or problems, which can lead to more efficient and effective research.
It allows for testing of predictions: Hypotheses make specific predictions about the outcome of research, which can be tested through experimentation or observation. This helps to determine whether the hypothesis is supported or refuted by the data.
It can lead to new discoveries: Hypotheses can lead to unexpected discoveries or insights that may not have been uncovered without a clear research question or prediction.
When Should You Perform Hypothesis Generation?
Hypothesis generation should be done before looking at the dataset, and ideally before the data is collected.
If you have generated your hypotheses thoroughly, you will notice that they already cover the variables present in the dataset.
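One way to check how well your hypotheses anticipated the data is to compare the factors you wrote down with the columns that actually appear in the dataset. The following is a minimal sketch, not a prescribed method: the factor names, the column names, and the helper `coverage_report` are all hypothetical and chosen only for illustration.

```python
# Factors written down BEFORE looking at the data (hypothetical names).
hypothesized_factors = {
    "store_size",
    "location_type",
    "item_price",
    "item_visibility",
    "promotion",
}

# Columns found once the dataset is available (hypothetical names).
dataset_columns = {"store_size", "location_type", "item_price", "item_weight"}


def coverage_report(hypothesized, columns):
    """Compare hypothesized factors with the columns present in a dataset."""
    covered = hypothesized & columns           # anticipated and present
    unanticipated = columns - hypothesized     # present but never hypothesized
    unavailable = hypothesized - columns       # hypothesized but not collected
    # Share of dataset columns that the hypotheses anticipated.
    coverage = len(covered) / len(columns) if columns else 0.0
    return {
        "covered": sorted(covered),
        "unanticipated": sorted(unanticipated),
        "unavailable": sorted(unavailable),
        "coverage": coverage,
    }


report = coverage_report(hypothesized_factors, dataset_columns)
for key, value in report.items():
    print(f"{key}: {value}")
```

Columns listed as unanticipated point to factors the hypotheses missed, while factors listed as unavailable suggest data that might be worth collecting if those hypotheses matter to the business problem.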