I have a question that I wasn’t able to resolve with the documentation or past forum discussions.
We are fielding a survey and having trouble reconciling the data on survey duration.
Here’s an example: this survey shows a duration of 2 minutes, but the paradata shows that it lasted 1hr and 40min (see picture below).
Having noticed this issue before, we had this survey conducted by an enumerator under the supervision of a monitor who kept a record of start/end times. The paradata match our observations, but the duration displayed on the website is confusing.
Am I missing something? Is there a way to reconcile these different durations?
Thanks!
Andrea
According to the official documentation, the interview duration (which you can track more accurately in the interview__duration variable of the interview__diagnostics file) is the “active time it took to complete the interview”. See the documentation below and look for the interview__diagnostics section.
In light of this:
I believe this same interview__duration is the one shown in the interviewer webview, as in your screenshot.
The “paradata” in your screenshot, such as for “Created”, “Completed”, “Approved” etc., are simply timestamps that correspond to specific actions.
Created tells you the point in time at which an assignment was created
Completed tells you the point in time at which the corresponding interview was completed and synchronized. Etc…
This information can also be found in the system-generated file interview__actions.
Which one you should use depends on what you want to measure:
Use the interview__duration from the diagnostics file when you want to know exactly how long an interview lasted.
Use the timestamp paradata in the interview__actions when you want to measure time intervals between different actions.
The former gives you a better idea of the actual interview duration. The latter simply captures the timing of actions. If I receive an assignment today but complete and submit it 3 days later, it does not mean that the interview duration was 3 days.
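To illustrate the difference, here is a minimal pandas sketch of the second approach: measuring the interval between two timestamps in an interview__actions-style file. The column names and sample values are assumptions for illustration, not the exact export layout.

```python
import pandas as pd

# Hypothetical extract of an interview__actions export; the real file has
# more columns (interview__key, responsible, role, ...). Column names and
# values here are made up for illustration.
actions = pd.DataFrame({
    "action": ["Created", "InterviewerAssigned", "Completed", "ApprovedBySupervisor"],
    "timestamp": pd.to_datetime([
        "2023-05-01 09:00:00",
        "2023-05-01 09:05:00",
        "2023-05-04 11:30:00",
        "2023-05-05 08:00:00",
    ]),
})

# Interval between two actions: this is elapsed *calendar* time,
# not active interviewing time.
created = actions.loc[actions["action"] == "Created", "timestamp"].iloc[0]
completed = actions.loc[actions["action"] == "Completed", "timestamp"].iloc[0]
elapsed = completed - created
print(elapsed)  # 3 days 02:30:00 -- nothing like the active duration
```

As in the assignment example above, the 3-day gap says nothing about how long the interviewer actually spent interviewing; for that, use interview__duration from the diagnostics file.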
Hello @giansib - thanks, this is helpful.
I guess I’m more puzzled by the fact that we observed the survey being open on the tablet for a time that matches the paradata, but interview__diagnostics does not reflect this. Surely there’s an issue somewhere - not sure whether it lies with the hardware or the software.
Based on what I can infer about the size of your questionnaire from the screenshot, 2 minutes does seem too short! Perhaps a member of the Survey Solutions support staff has more input here.
Clicking on Download Interview Transcript will give you an easy way of checking.
Realistically, I don’t see a way somebody could answer 309 questions in 2 minutes.
But then again, you are comparing apples and oranges. The interview status history shows you when certain events happened; it doesn’t mean that the space in between is to be ascribed to a particular activity - the interviewer may very well have switched to a different interview.
Regarding paradata: based on the screenshots, you are probably not using it yet. What matters is the paradata export file itself, which you can download to see the recording of actions in a particular interview. Once you have it, it will be easier to detect any discrepancies.
Realistically, I don’t see a way somebody could answer 309 questions in 2 minutes.
I agree, that’s why I wrote
I should stress that one of our team members was with the enumerator and confirmed they did not take 2 minutes to run the survey. The paradata also confirm this.
Going through the paradata, I did however notice something unexpected: the “order” and “timestamp_utc” variables are NOT perfectly aligned. That is, looking at the timestamps, events 446-464 happened in between 440 and 441. Any thoughts as to why this might be the case?
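For what it’s worth, this kind of misalignment can be flagged mechanically: sort the paradata by “order” and look for timestamps that run backwards. A minimal sketch, with made-up values (the column names follow the paradata file discussed above):

```python
import pandas as pd

# Flag paradata events whose timestamp runs backwards relative to the
# event order. The sample values are invented for illustration.
para = pd.DataFrame({
    "order": [440, 441, 442, 443],
    "timestamp_utc": pd.to_datetime([
        "2023-05-01 13:40:00",
        "2023-05-01 13:20:00",  # earlier than the previous event
        "2023-05-01 13:21:00",
        "2023-05-01 13:41:00",
    ]),
})

para = para.sort_values("order")
# A negative difference means the clock jumped backwards between events.
backwards = para["timestamp_utc"].diff() < pd.Timedelta(0)
print(para.loc[backwards, "order"].tolist())  # [441]
```

Any event flagged this way is a candidate symptom of the tablet clock being adjusted mid-interview.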
It does explain it. There was no interviewing between 13:40 and 14:52, as evidenced by the Paused/Resumed pair of events shown below in the green block.
I’d be more puzzled why the total is 2 minutes and not 3 minutes and 45 seconds.
You may want to find out from the staff what was happening in the red block, where the clock got adjusted.
From a practical standpoint, I’d exclude such interviews from processing, as manipulating the clock undermines the indicators you are trying to derive from this data, like the duration of the interview.
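One way to operationalize that exclusion rule is to pair Resumed/Paused events per interview and flag any session with a negative length, which can only happen if the clock moved backwards. A sketch under assumed column and event names, with made-up data:

```python
import pandas as pd

# Sum active sessions (Resumed -> Paused) for one interview and flag it
# if any session is negative, i.e. the tablet clock was moved backwards.
# Column names and values are invented for illustration.
events = pd.DataFrame({
    "interview__id": ["A"] * 4,
    "event": ["Resumed", "Paused", "Resumed", "Paused"],
    "timestamp_utc": pd.to_datetime([
        "2023-05-01 13:00:00",
        "2023-05-01 13:30:00",
        "2023-05-01 14:52:00",
        "2023-05-01 14:40:00",  # Paused *before* Resumed: clock adjusted
    ]),
})

resumed = events.loc[events["event"] == "Resumed", "timestamp_utc"].reset_index(drop=True)
paused = events.loc[events["event"] == "Paused", "timestamp_utc"].reset_index(drop=True)
sessions = paused - resumed  # one timedelta per editing session

suspect = (sessions < pd.Timedelta(0)).any()
print(suspect)  # True -> exclude this interview from duration statistics
```

Interviews flagged this way could then be dropped from, or handled separately in, any duration analysis.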
Hello Sergiy,
indeed, clock adjustments were also my prior - even from the actions data, we sometimes see surveys completed earlier than they started. That’s why we sent monitors into the field.
I got confirmation that there was no funny business (during the monitoring). I guess it must be an issue with the tablets.
Going back to the example, though, what I showed is an extract of the paradata - there were ~400 events before the ones shown, the first one at 12:53. Even accounting for pauses, the total running time was more than an hour. Is there any particular reason why the duration shown on the portal would be only 2 min? Even allowing for clock adjustments etc., I cannot quite grasp why the recorded duration would be so low.
We are discussing the issue of low survey durations with the survey firm we contracted, and I cannot quite square this data.
Here’s the full example, extracted from the dataset.
It includes interview__actions, interview__diagnostics and paradata. I have removed the “parameter” field from the latter file as it includes PII.
Events [6,89] are pre-filled questions.
Survey Solutions added these up to obtain 149 seconds, which is 2*60+29 = 2m29s, or about 2 minutes, as you are seeing in the Survey Solutions interface.
Any adjustments of the tablet clock invalidate the duration as calculated using this method.
Any period between Resumed and Paused events is counted as an editing session. If the Paused event is missing (in rare cases, for example when the tablet completely ran out of power), Survey Solutions will generate a synthetic Paused event based on an assumption: it adds 15 minutes to that session, as per this file (see around line 980). This is rare.
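The rule described above can be sketched in a few lines. This is my reconstruction of the logic from this thread, not Survey Solutions source code; the 15-minute figure comes from the post above.

```python
from datetime import datetime, timedelta

# Assumed fallback for a session with no closing Paused event,
# per the 15-minute rule described in the post above.
SYNTHETIC_SESSION = timedelta(minutes=15)

def active_duration(events):
    """events: list of (event_name, datetime) pairs, sorted by order.
    Sums Resumed -> Paused sessions; an unclosed session gets the
    synthetic 15-minute allowance."""
    total = timedelta(0)
    session_start = None
    for name, ts in events:
        if name == "Resumed":
            session_start = ts
        elif name == "Paused" and session_start is not None:
            total += ts - session_start
            session_start = None
    if session_start is not None:    # no closing Paused event
        total += SYNTHETIC_SESSION   # synthetic Paused assumption
    return total

# Made-up example: one 2-minute session, then the tablet dies mid-session.
events = [
    ("Resumed", datetime(2023, 5, 1, 13, 0)),
    ("Paused",  datetime(2023, 5, 1, 13, 2)),
    ("Resumed", datetime(2023, 5, 1, 14, 52)),  # no Paused follows
]
print(active_duration(events))  # 0:17:00 (2 min session + 15 min synthetic)
```

Note that a session whose Paused timestamp precedes its Resumed timestamp (a clock adjustment) would subtract from the total under this rule, which is one way a long interview could end up with an implausibly short computed duration.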