On the importance of range checks

Range checks are important. They help detect and prevent gross errors caused by typos, omitted decimal separators, and other common real-world data entry problems.

The check is usually a simple expression like:
hhsize.InRange(1,20)
which requires the value of the household size to be between 1 and 20 persons.

It makes sense to always include such a check for numeric questions in a Survey Solutions questionnaire by attaching validations. Almost any number in socio-economics is bounded, often to quite a reasonable range, so it is well worth incorporating such a check into the data entry procedure.

This helps avoid a situation like the one shown in the screenshot below from the AfDB website:

The check could easily have used the approximate known population of the largest countries (China, India), which is under 1.5 billion, or the population of the whole world (under 8 billion). Either bound would have caught the error, as there is no reason to believe that the population of Zambia grew by a factor of about one million between 2016 and 2017.
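Such an upper-bound check can be sketched in a few lines of plain Python (this is not Survey Solutions syntax). The function name, the constants, and the sample figures are hypothetical and chosen only for illustration; the two bounds are the ones mentioned in the text.

```python
# Generous upper bounds, as discussed in the text.
LARGEST_COUNTRY_POPULATION = 1_500_000_000  # China, India: under 1.5 billion
WORLD_POPULATION = 8_000_000_000            # whole world: under 8 billion


def population_is_plausible(value, upper=LARGEST_COUNTRY_POPULATION):
    """Return True if a reported population is positive and below the bound."""
    return 0 < value < upper


# Hypothetical figures for illustration only (not the values in the screenshot):
# the second value is the first one inflated by a factor of one million.
reported = {2016: 16_000_000, 2017: 16_000_000 * 1_000_000}

# Collect the years whose reported value fails the check.
flagged = [year for year, value in reported.items()
           if not population_is_plausible(value)]
print(flagged)
```

Even the looser world-population bound would reject the inflated value, so the exact choice of bound matters little when the error is this large.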

The data above does not come from Survey Solutions, but it shows the importance of at least some basic checks on all data entering your systems and on the reports and outputs they produce. Here the error is obvious because of the sheer magnitude of the difference. In many cases, however, errors may go unnoticed, for example when a single digit is mistyped. Some errors are caused by an abrupt change in an interface or standard through which several systems exchange data, such as API functions.

When working with time series, such as price surveys, one can usually formulate range checks by allowing a certain percentage deviation from previously collected price values, or by using other items from the same group as a benchmark (for example, using the prices of Coca-Cola and Sprite to cross-check the price of Fanta, all for equivalent volumes).
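Both ideas can be sketched in plain Python (again, not Survey Solutions syntax). The function names, the 20% and 30% tolerances, and the prices are assumptions made up for this illustration; suitable tolerances depend on the product and the time between survey rounds.

```python
def within_percent(value, reference, max_pct):
    """True if value deviates from reference by at most max_pct percent."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return abs(value - reference) <= reference * max_pct / 100.0


def within_peer_range(value, peer_values, max_pct):
    """True if value lies inside the band spanned by the peer prices,
    widened by max_pct percent on each side."""
    low = min(peer_values) * (1 - max_pct / 100.0)
    high = max(peer_values) * (1 + max_pct / 100.0)
    return low <= value <= high


# Check against the previously collected price (hypothetical values).
previous_price = 10.0
print(within_percent(10.5, previous_price, 20))   # modest change
print(within_percent(105.0, previous_price, 20))  # decimal separator omitted

# Cross-check Fanta against Coca-Cola and Sprite, same volume (hypothetical values).
peers = [10.0, 11.0]
print(within_peer_range(10.5, peers, 30))
print(within_peer_range(1.05, peers, 30))         # decimal separator misplaced
```

The first check needs an earlier observation of the same item, while the second works even in the first survey round, provided comparable items are collected in the same interview.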