Error while trying to synchronize interviews

Our interviewers started getting the following error today:
Synchronization was interrupted, please try again.
No connection to the Survey Solutions Supervisor. Please make sure that the website is available.

The SSL certificates are up to date.
In the logs, the last line was “Hosting stopped”, and the application did not restart afterwards. I then tried restarting the web app in IIS and also ran iisreset. It restarted, and we managed to receive only a couple of questionnaires from two tablets (out of many more).
The logs also show that each job is duplicated across two workspaces, although we only have one (only the default one is visible from the HQ and admin accounts). We created the second workspace when we experienced some issues last time (ticket name Assignments are not visible in Survey Setup). We resolved those issues thanks to your feedback, and the second workspace was then deleted.

The app is deployed on an on-prem server and it was working properly until recently. Maybe it’s due to increased usage? We expect 500-600 interviews to be uploaded, around 20 of them would have a few pictures. The issue was first noticed in the morning today. I can send you the log files from the last three days via email once you reply. Please let me know in case you need any additional information.

After some investigation here are the results:

The initial cause of the problem was the expiry of the DST Root CA X3 root certificate, used by Let’s Encrypt, on 30 September 2021. That certificate was what allowed Android devices up to v. 7.1.1, which do not have the newer root certificates, to trust the server.

As most of our current tablets use Android 7.0 and cannot be upgraded further, I decided to apply a workaround described on the official Let’s Encrypt website. In an unprecedented move made together with the certificate issuers, Let’s Encrypt extended the applicability of the expired DST Root CA X3 certificate by introducing a longer default certificate chain. However, the shorter path #1 was still the one served by default whenever our tablets tried to connect. This is more of a problem with Windows IIS (see this). As can be checked on SSL Server Test (Powered by Qualys SSL Labs), the shorter path is the default one unless some action is taken. There also seems to be no option for setting a preferred chain in WACS for Windows, which was used for our Survey Solutions server.

Therefore, as recommended here, I moved the self-signed ISRG Root X1 certificate (expires on 04/06/2035) to the Untrusted Certificates store on the server and added the cross-signed ISRG Root X1 certificate (expires on 30/09/2024) to the Intermediate Certification Authorities store. This should allow our Android 7 devices to be used for three more years.
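As a rough check of the "three more years" figure, here is a small Python sketch that computes how long the workaround stays valid. The dates are the ones quoted above; the helper name days_until and the example "today" date are my own assumptions for illustration.

```python
from datetime import date

# Expiry dates quoted in this post (day/month/year in the text above).
DST_ROOT_X3_EXPIRY = date(2021, 9, 30)       # expired root certificate
CROSS_SIGNED_X1_EXPIRY = date(2024, 9, 30)   # cross-signed ISRG Root X1
SELF_SIGNED_X1_EXPIRY = date(2035, 6, 4)     # self-signed ISRG Root X1


def days_until(expiry: date, today: date) -> int:
    """Days left before `expiry`, counted from `today` (negative if already past)."""
    return (expiry - today).days


# Extra lifetime gained by serving the longer chain, measured from the
# day DST Root CA X3 expired.
extra_days = days_until(CROSS_SIGNED_X1_EXPIRY, DST_ROOT_X3_EXPIRY)

# Hypothetical "today" used only to show the helper with a past expiry.
example_today = date(2021, 10, 5)
overdue = days_until(DST_ROOT_X3_EXPIRY, example_today)
```

So the workaround only postpones the problem until 30/09/2024, after which the Android 7 tablets would again need another solution.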

The objective is therefore counterintuitive: to remove the more current and shorter path #1 based on the self-signed ISRG Root X1 certificate so that the older Android devices can only see and use the longer certificate chain that they can recognize as secure.
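To make the difference between the two paths concrete, here is a minimal Python sketch that classifies a chain as the long or the short one. It works on hand-written (subject, issuer) pairs rather than on a live connection; the host name survey.example.org and the function name classify_chain are hypothetical, and R3 is assumed to be the Let’s Encrypt intermediate in use.

```python
from typing import List, Tuple

# Each entry is (subject common name, issuer common name), leaf first.
# These lists are written by hand for illustration, not captured from a server.
Chain = List[Tuple[str, str]]


def classify_chain(chain: Chain) -> str:
    """Return 'long', 'short' or 'unknown' for a Let's Encrypt style chain."""
    if not chain:
        return "unknown"
    # Long path: the chain includes ISRG Root X1 cross-signed by DST Root CA X3,
    # which older Android devices can still validate.
    if ("ISRG Root X1", "DST Root CA X3") in chain:
        return "long"
    # Short path: the last certificate sent is issued directly by the
    # self-signed ISRG Root X1, which Android 7.0 does not trust.
    if chain[-1][1] == "ISRG Root X1":
        return "short"
    return "unknown"


long_chain = [
    ("survey.example.org", "R3"),
    ("R3", "ISRG Root X1"),
    ("ISRG Root X1", "DST Root CA X3"),
]
short_chain = [
    ("survey.example.org", "R3"),
    ("R3", "ISRG Root X1"),
]

print(classify_chain(long_chain))
print(classify_chain(short_chain))
```

After the workaround, the server should present something like long_chain; before it, IIS was presenting something like short_chain. The actual chain served can be confirmed with the SSL Server Test mentioned above.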

Below are more links in case you want to know more about the issue.
