This work with a relational database allowed me to begin working with statistical analysis. One afternoon in about 1987, sitting in a tree at Bent Creek Lodge (www.BentCreekLodge.com), I had a bang of an idea. I would write a computer database program that would allow the lodge to manage the details of its hunting operation, and we would tie this database into the NOAA (National Oceanic & Atmospheric Administration) database of weather elements. Herein lay the opportunity to take advantage of thousands of hunter days of experience without my having to do all the work. That night after supper I talked over the idea with the lodge owners, Johnny Lanier and Leo Allen, and they agreed with the concept. It took me the better part of the following year to pull it off, but by the fall season of 1988, I had it ready.

Every morning and every afternoon, the lodge would send about twenty-five hunters into the woods to hunt on the massive 45,000-acre expanse of the lodge’s land holdings. Each day at lunch, we would pull down the NOAA weather variables (temperature, barometric pressure, wind direction, wind velocity, moonrise time, moonset time, moon phase, cloud cover conditions, and precipitation) and integrate these data into our hunting database. Bear in mind, this was before the World Wide Web was in existence. When the hunters returned to the lodge for lunch, the lodge guides would accumulate data on what they had seen as well as what they had killed, and these data were then linked to the weather information from the NOAA database. The same was the case each afternoon. Most commercial hunting operations in the South tend to place their hunters in the woods in the morning and on fields in the afternoon.

Armed with the data from this very analytical approach, I could begin to unravel not only which variables really had an impact on whether we saw deer, but even why we saw them. Because we had captured both sighting and kill data, we could even determine statistical data on kills by age class. For example, what are the statistical odds of taking a buck in the three-and-a-half-year age class under a given set of conditions?
Computerized Deer Hunting

Boy, did I get an education. Obviously, with years of hunting success and experience, I had a lot of preconceived ideas about when deer would get up and walk around in the daytime and even thought I knew why (under what conditions) they might do so, at least some of the time. After compiling the first two years of data, I began to analyze it. The moon has fascinated hunters for centuries, and there is much published about the lunar effect on the earth’s animals. I was eager to see what the data confirmed with respect to the whitetail.

For example, based upon my hunting experience, I would have thought that hunting success during the day following a night of bright moon phase would not be as good. In other words, if the deer had good light through the night, they would be up feeding all night, and hence hunting the following day would not be as good. Indeed, when I looked at the numbers under varying conditions, the days following dark moon phases were clearly better with nearly a fourfold increase in the likelihood of success on days following dark moon phases. The data confirmed my experience. In years to follow I started to look more specifically at moonrise times and other variables such as cloud cover, precipitation, wind velocity, barometric pressure, and temperature.

Then suddenly, a fly appeared in the ointment. During the fourth and fifth years of data collection, the effect of nighttime moon phase on the statistical odds of seeing and/or taking a deer reversed! In other words, the best hunting days during those seasons were the days following the nights of bright moon phase. I was stumped. By this time we had accumulated more than five thousand hunter days (the equivalent of you or me going hunting five thousand times, writing down what we saw and/or killed each time we went, and then matching this to the daily weather patterns). This allowed us to move from univariate to multivariate statistical analytical methods.
If you have a data set with only a few hundred pieces of data, you really only have enough statistical power to analyze a single variable like, say, wind direction. However, you have no way of knowing for certain whether the variable you are focused upon is having a cause and effect relationship upon the variable you are studying or just happens to be a marker for that effect...
...As you can see, having a data set large enough to utilize a multivariate approach to data analysis can keep you from drawing some very wrong conclusions. Now, to go back to our example of the moon phase, I had drawn the conclusion, based upon my hunting experience, that when the moon phase was bright at night, the effect on hunting the following day was bad. In other words, I drew a cause and effect conclusion based upon my observations of a data set that was both too small (only a few hundred hunting trips) and univariate (looking at only one variable without knowing what effect other variables might be having at the same time). By the beginning of the fifth year of data collection, we had more than five thousand hunting trips to analyze. In addition, we had collected data not just on the one variable (the moon phase), but also on the moonrise time, moonset time, wind direction, wind velocity, barometric pressure, temperature, temperature change, cloud cover, precipitation, etc. We had enough data to hold certain variables within a set range while simultaneously checking for the true cause and effect another variable might have on our outcome (the odds of seeing and/or taking a whitetail in the daytime). It turned out that purely by chance, the days following the bright moon phases of those first two seasons were relatively warm. The days following the bright moon phases of the third and fourth seasons were relatively cold. Hence, it turned out to be the temperature that was driving deer sightings rather than the moon phase. In subsequent seasons, I’ve seen years when the data were mixed. In other words, some days following bright moon phases were warm while others were cold. Every time we look at the variables across large subsets of data using the multivariate approach, it is the temperature that seems to have a cause and effect relationship upon deer sightings/kills, not the moon phase! Let me say this another way to make it clearer.
If I look at the data from hunting trips on days following nights with a bright moon phase but warm weather, the sighting/kill results are dismal. However, if I look at hunting trips on days following nights with a bright moon phase but cold weather, the results are stunningly good. That’s what I mean by a true cause and effect relationship. Hunters, by our very nature, are observers. We go to the woods, see deer activity, observe what time of day it occurred, notice how cold it was, how windy it was, etc., and based upon those observations involving multiple factors, draw certain conclusions. The problem is that, just as with our computer-based data observation, we sometimes are looking at the right variables but still drawing the wrong conclusions.
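The moon phase versus temperature comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not the lodge's actual program or data: the counts in TOY_COUNTS are invented so that bright moon nights happen to fall mostly before warm days, and the names TOY_COUNTS and sighting_rates are hypothetical. The sketch computes the sighting rate by moon phase alone (the univariate view), then again with temperature held fixed (a simple stratified view).

```python
from collections import defaultdict

# Hypothetical toy counts, invented purely for illustration (NOT the lodge's data).
# Each row: (moon phase the night before, daytime temperature,
#            number of hunting trips, number of trips with a deer sighting)
TOY_COUNTS = [
    ("bright", "warm", 40, 4),
    ("bright", "cold", 10, 6),
    ("dark", "warm", 10, 1),
    ("dark", "cold", 40, 24),
]


def sighting_rates(counts, key):
    """Return {group: sightings / trips}, grouping rows by key(moon, temp)."""
    trips = defaultdict(int)
    sightings = defaultdict(int)
    for moon, temp, n_trips, n_sightings in counts:
        group = key(moon, temp)
        trips[group] += n_trips
        sightings[group] += n_sightings
    return {group: sightings[group] / trips[group] for group in trips}


# Univariate view: moon phase only, ignoring temperature.
by_moon = sighting_rates(TOY_COUNTS, lambda moon, temp: moon)

# Stratified view: compare moon phases within each temperature group.
by_moon_and_temp = sighting_rates(TOY_COUNTS, lambda moon, temp: (moon, temp))

print("By moon phase alone:")
for group, rate in sorted(by_moon.items()):
    print(f"  {group}: {rate:.2f}")

print("By moon phase within each temperature group:")
for group, rate in sorted(by_moon_and_temp.items()):
    print(f"  {group}: {rate:.2f}")
```

With these made-up counts, the moon-only view makes dark moon phases look far better than bright ones, yet within the warm days and within the cold days the two moon phases have the same sighting rate. Temperature accounts for the whole difference, which is the kind of trap a univariate look at a small data set can set.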