Export of missing values in filtered multiple select questions

Issue

User Cyriane has reported an issue with exporting a Y/N multiple select question that uses filtering. Specifically, the items of the question that are filtered out (not available for selection in the interface) are correctly exported as missing, but with an incorrect type of missing value (see the Missing values article on the documentation site).

This issue has been confirmed.

Explanation

This happens because the exporting subsystem of Survey Solutions only has access to the answers that have been set (whether positively or negatively), and does not know the whole universe of choices that was available to the user at that time. Determining it is possible in principle, but would require recalculating all the conditions involved in all filters, which would stall the export process. That is why this is done only in the data entry interface, question by question, when needed to display the relevant options on the screen for interviewers/supervisors. The exporting procedure instead falls back on a specific value, disregarding the effect of the filtering.

It does sound complicated, but luckily most users will never need to bother with this issue, since the value is missing in either case and cannot be used in substantive analysis. In the rare cases where there is a specific need to determine the reason for a value being missing, it may need to be recovered after the data has been exported.

Example

Consider the following example: the multiple select question IT_1_1_2_1 has the items 101, 102, 201, 202, 301, 302, which are filtered by the value of the geo_location variable using the following filter:

(int)@optioncode/100 == (int)geo_location/100
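To see which items this filter keeps, the expression can be replayed outside Survey Solutions. The sketch below is in Python rather than the C# syntax used in the Designer. It assumes that option codes and geo_location are positive integers, so that the integer division in the filter corresponds to Python's `//`. The function name `available_options` and the geo_location values in the loop are made up for illustration.

```python
# Item codes of the multiple select question IT_1_1_2_1 (from the example above).
OPTION_CODES = [101, 102, 201, 202, 301, 302]


def available_options(geo_location, option_codes=OPTION_CODES):
    """Return the option codes that pass the filter for a given geo_location.

    Mirrors (int)@optioncode/100 == (int)geo_location/100, assuming positive
    integer values, where integer division drops the last two digits.
    """
    return [code for code in option_codes if code // 100 == geo_location // 100]


# Hypothetical geo_location values, one per group of items.
for geo in (101, 202, 301):
    print(geo, available_options(geo))
```

With geo_location equal to 202, for instance, only items 201 and 202 remain available; the other four items are filtered out.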

The filter results in the following (hypothetical) export data (as seen in Stata; in Excel one will see the equivalent value -999,999,999 instead of the extended missing value .a used in Stata):

and Cyriane’s target is this:

Solution

An example that fixes the exported data in this case can be run in Stata as:

do "http://www.radyakin.org/suso/forum/multi_patch.do"

The bulk of the code actually enters the data; the recovery itself happens in lines 36-45.
Importantly, the question’s variable name (as specified in the Designer) is mentioned in line 36, and the filter (in a negated form) is mentioned in line 44.

The code will automatically determine the variables that comprise the export of this multiple select question, and re-play the filter based on the variable (in this case geo_location).
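The idea behind the recovery can be illustrated with a short sketch. It is written in Python and is not the code of multi_patch.do. It assumes that the items of a multiple select question are exported as columns named `<question>__<code>` (for example `IT_1_1_2_1__101`) and that missing items arrive as -999999999. `FILTERED_OUT` is a placeholder for whichever missing value the filtered-out items should receive, and the record at the end is made up.

```python
EXPORTED_MISSING = -999999999  # missing value as it appears in the exported data
FILTERED_OUT = None            # placeholder for the value wanted for filtered-out items


def item_columns(row, question):
    """Find the exported columns of a multiple select question and their item codes."""
    prefix = question + "__"
    return {name: int(name[len(prefix):]) for name in row if name.startswith(prefix)}


def patch_row(row, question="IT_1_1_2_1"):
    """Re-play the filter and recode missing items that were filtered out."""
    patched = dict(row)
    for name, code in item_columns(row, question).items():
        # Negated form of the filter: true when the item was NOT available.
        filtered_out = code // 100 != row["geo_location"] // 100
        if row[name] == EXPORTED_MISSING and filtered_out:
            patched[name] = FILTERED_OUT
    return patched


# Made-up record: geo_location 201 makes only items 201 and 202 available.
row = {
    "geo_location": 201,
    "IT_1_1_2_1__101": EXPORTED_MISSING,
    "IT_1_1_2_1__102": EXPORTED_MISSING,
    "IT_1_1_2_1__201": 1,
    "IT_1_1_2_1__202": EXPORTED_MISSING,  # available, but left unanswered
    "IT_1_1_2_1__301": EXPORTED_MISSING,
    "IT_1_1_2_1__302": EXPORTED_MISSING,
}
patched = patch_row(row)
print(patched)
```

In this sketch, item 202 keeps the exported missing value because it was available and simply not answered, while items 101, 102, 301 and 302 are recoded because the filter excluded them.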

The same block may be repeated for other multiple select questions where the same problem is experienced. Note that if the filter uses variables from other data levels, they need to be merged into the current file, as they are exported to different files.
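As a rough illustration of such a merge, the Python sketch below copies a variable from an interview-level file into the rows of a lower-level file before the filter is replayed. The key name `interview__id`, the column `roster__id`, the helper `merge_parent_variable`, and all the data values are assumptions made for this example; the actual key and file names depend on the exported questionnaire.

```python
def merge_parent_variable(child_rows, parent_rows, variable, key="interview__id"):
    """Copy `variable` from the parent-level rows into each child-level row.

    Rows are matched on `key`; child rows without a matching parent get None.
    """
    lookup = {parent[key]: parent[variable] for parent in parent_rows}
    return [dict(child, **{variable: lookup.get(child[key])}) for child in child_rows]


# Made-up interview-level data holding the variable used by the filter.
interviews = [
    {"interview__id": "a1", "geo_location": 201},
    {"interview__id": "b2", "geo_location": 302},
]

# Made-up lower-level data where the multiple select question is exported.
roster = [
    {"interview__id": "a1", "roster__id": 1, "IT_1_1_2_1__201": 1},
    {"interview__id": "a1", "roster__id": 2, "IT_1_1_2_1__201": 0},
    {"interview__id": "b2", "roster__id": 1, "IT_1_1_2_1__201": -999999999},
]

merged = merge_parent_variable(roster, interviews, "geo_location")
for record in merged:
    print(record)
```

After the merge, every row carries its own geo_location value, so the filter can be evaluated row by row in the same file.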

In many cases the fix is trivial, but sometimes it may get somewhat more elaborate, depending on the complexity of the original filtering expression, and the capabilities of the system where the filtering expression gets replayed.

There are no immediate plans to change the behavior of Survey Solutions in this regard, but this limitation will be more prominently reflected in the documentation.