First, let's consider your
data set. Of the ~903,000 calls in your initial data set, almost half
were excluded from the analysis for a variety of reasons. Whenever data
is dropped, there is a strong possibility that
what remains is a non-random (and thus biased) set of data. Furthermore,
the remaining data points "do not measure crime" (as belatedly stated
in the 30th inch of the story) but instead capture a wide variety of
incidents (including "enforcement of traffic
laws" and "attend at collisions" that are not necessarily linked to the
residents of that region). It should go without saying that if your
data does not contain variables that are relevant to the question, then the
conclusions drawn from it will be suspect.
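
To make the point about dropped data concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the two regions, the call counts, and the exclusion rule are assumptions, not figures from the story. It only shows that when records are excluded at different rates across groups, what remains can look very different from what you started with.

```python
# Hypothetical data: two regions, each with the same number of calls.
calls = [("A", i) for i in range(1000)] + [("B", i) for i in range(1000)]


def share_of_region(records, region):
    """Fraction of the records that belong to the given region."""
    return sum(1 for r, _ in records if r == region) / len(records)


def keep(record):
    """Hypothetical exclusion rule: drops 80% of region A's calls
    but only 20% of region B's calls."""
    region, i = record
    cutoff = 800 if region == "A" else 200
    return i >= cutoff


# Half of all calls are excluded, but not evenly across regions.
remaining = [c for c in calls if keep(c)]

full_share = share_of_region(calls, "A")           # share of A before exclusion
remaining_share = share_of_region(remaining, "A")  # share of A after exclusion

print(f"Region A share of all calls:       {full_share:.2f}")
print(f"Region A share of remaining calls: {remaining_share:.2f}")
```

In this toy case both regions generate exactly the same number of calls, yet after the exclusions region B appears to generate four times as many as region A. Whether anything like this happened in your data depends entirely on why the excluded calls were excluded, which is why those reasons matter.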