In many situations where we observe the adoption and diffusion of new technologies, we may run into the problem of endogeneity. This is a serious problem because endogeneity, if present, biases the estimates from any statistical or regression procedure, so conclusions derived from those estimates are suspect. It has many causes, including data sampling problems and variables omitted from data collection and analysis.
Consider the adoption of new agricultural technologies. Even if we conduct our study to a high standard in terms of sampling methods (e.g. stochastic or stratified random sampling), we run the risk of collecting data on first adopters who may have been the best farmers to begin with. In that case, any difference in the outcome variable (yield, profit, net value, income) between the “new” and the “old” technology may be incorrectly estimated, because it mixes the effect of the technology with the effect of farmer ability.
In fact, even if we avoid the problem of surveying the best, first-adopting farmers, farmers themselves select which group they belong to (“new” versus “old” technology users). They are not assigned randomly to their group by the researcher, as one would assign treatments and controls in an agronomic experiment. This violates the random assignment requirement that underlies standard statistical comparisons. This is the “self-selection” problem in statistics.
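To make the self-selection problem concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not taken from any real survey: the function name `simulate_adoption`, the unobserved `skill` variable, the true technology effect of 0.5, and the rule that more skilled farmers are more likely to adopt.

```python
import random
import statistics


def simulate_adoption(n_farmers=20000, true_effect=0.5, seed=0):
    """Return the naive adopter vs. non-adopter yield gap (hypothetical data)."""
    rng = random.Random(seed)
    adopters, non_adopters = [], []
    for _ in range(n_farmers):
        # Unobserved farmer ability; the researcher never sees this.
        skill = rng.gauss(0.0, 1.0)
        # Self-selection: more skilled farmers are more likely to adopt.
        adopts = skill + rng.gauss(0.0, 1.0) > 0.0
        # Yield depends on skill, on the technology, and on random noise.
        crop_yield = 2.0 + skill + true_effect * adopts + rng.gauss(0.0, 1.0)
        (adopters if adopts else non_adopters).append(crop_yield)
    # Naive comparison of group means, ignoring how groups were formed.
    return statistics.mean(adopters) - statistics.mean(non_adopters)


naive_gap = simulate_adoption()
print(f"true effect: 0.5, naive gap between groups: {naive_gap:.2f}")
```

Under these assumptions the naive gap comes out well above the true effect, because the comparison attributes the adopters' higher skill to the technology.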
Endogeneity is a problem especially when adoption and diffusion have already occurred, that is, in ex post socio-economic assessments. It may also be a problem in an ex ante assessment if one conducts a baseline survey and is not careful to model the variables determining outcomes or any of the other causes of endogeneity. Ex ante estimates that use a biased survey as the baseline data source may carry that bias into the results and thus confound the impact of the technology with the impact of other variables on the outcome.
Researchers have used instrumental variables, Heckman two-stage regressions, and control functions to deal with endogeneity. More details on these methods later. In the meantime, for a somewhat “tongue in cheek” description of endogeneity, you can watch Endogeneity: An inconvenient truth (full version), by John Antonakis from the University of Lausanne. If you are interested in the instrumental variables approach to dealing with endogeneity, you can see this video on Instrumental variables.
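As a rough illustration of the instrumental variables idea mentioned above, the sketch below compares a naive regression slope with a simple IV estimate on simulated data. All of it is hypothetical: the instrument `offered` stands for a randomly assigned offer (for example, a seed voucher) that is assumed to influence yield only through adoption, and the true effect of 0.5 is chosen for the example. With one instrument and one endogenous regressor, the IV estimate reduces to a ratio of covariances.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n_farmers=50000, true_effect=0.5, seed=1):
    """Generate hypothetical instrument, adoption, and yield data."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n_farmers):
        skill = rng.gauss(0.0, 1.0)  # unobserved farmer ability
        # Instrument: random offer, independent of skill by construction.
        offered = 1.0 if rng.random() < 0.5 else 0.0
        # Adoption depends on skill (self-selection) and on the offer.
        adopts = 1.0 if skill + offered + rng.gauss(0.0, 1.0) > 0.5 else 0.0
        # Yield depends on skill and adoption, but not directly on the offer.
        outcome = 2.0 + skill + true_effect * adopts + rng.gauss(0.0, 1.0)
        z.append(offered)
        d.append(adopts)
        y.append(outcome)
    return z, d, y


z, d, y = simulate()
# Naive slope of yield on adoption: contaminated by unobserved skill.
ols_estimate = covariance(d, y) / covariance(d, d)
# IV estimate: uses only the variation in adoption driven by the offer.
iv_estimate = covariance(z, y) / covariance(z, d)
print(f"naive: {ols_estimate:.2f}, IV: {iv_estimate:.2f}, true effect: 0.5")
```

The IV estimate is only as good as the instrument. In real data, the assumptions that the instrument is unrelated to farmer ability and affects the outcome only through adoption have to be argued for; here they hold because the simulation was built that way.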