Dear Sergiy, in my case I have approximated the total application duration by summing the response durations of the individual questions, excluding extreme values that could correspond to interruptions in the interview or to exceptions in the response sequence. To obtain the response time for each question, I ordered the events according to the response sequence established in the protocol and calculated the time difference between consecutive "AnswerSet" events, that is, between each question and the adjacent one. Here is the R code I used:
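# `paradata` is a placeholder name for the exported paradata table; requires dplyr and lubridate
paradata %>%
  filter(event == "AnswerSet", role == 1) %>%
  mutate(date = fast_strptime(as.character(timestamp), format = "%Y-%m-%dT%H:%M:%S", tz = "UTC", lt = FALSE)) %>%
  arrange(interview__id, date) %>%
  group_by(interview__id) %>%  # keep differences within one interview
  mutate(no = row_number(),
         # time until the next AnswerSet event; 0 for the last event of an interview
         duration = difftime(lead(date, default = last(date)), date, units = "secs")) %>%
  ungroup()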
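The exclusion of extreme values and the per-interview total described above could be sketched as follows. This is only an illustration under stated assumptions: the toy `durations` table stands in for the output of the code above, and the 10-minute cutoff `max_gap_secs` is a hypothetical threshold, not a value taken from Survey Solutions.

```r
library(dplyr)

# Toy data standing in for the per-answer durations (hypothetical values).
durations <- tibble(
  interview__id = c("a", "a", "a", "b", "b", "b"),
  duration = as.difftime(c(12, 900, 8, 20, 15, NA), units = "secs")
)

# Assumed cutoff: gaps longer than 10 minutes are treated as interruptions.
max_gap_secs <- 600

total_duration <- durations %>%
  # work with plain seconds so filtering and summing are unambiguous
  mutate(secs = as.numeric(duration, units = "secs")) %>%
  # drop missing, negative, and extreme gaps before summing
  filter(!is.na(secs), secs >= 0, secs <= max_gap_secs) %>%
  group_by(interview__id) %>%
  summarise(total_secs = sum(secs), n_answers = n(), .groups = "drop")

print(total_duration)
```

A fixed cutoff is the simplest choice. A per-question or percentile-based threshold may suit questionnaires where some questions legitimately take much longer than others.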
I think an algorithm could be used to make a better approximation of the duration. I would like to know the algorithm used by Survey Solutions so I can comment and contribute.