Interviewer sync error returning unexpected password error

Dear all,

I am looking for help interpreting a synchronization error in the interviewer app trace log.

We are currently in the field with a survey and our interviewers are having problems synchronizing completed assignments to HQ. The issue is affecting most interviewers, but the severity is idiosyncratic: a few interviewers have 25/50 assignments that won’t sync, while for others it is only 1 or 2 completed assignments. For context, the server application is hosted on our AWS servers.

We followed the troubleshooting guidance here, which did not resolve the issue. I am now reviewing trace logs from one of the interviewers' tablets and would appreciate any insights on interpreting the error. I extracted a portion which I think shows a failed synchronization, based on my review of this previous discussion in the forum.

17:17:18[EnumeratorAuditLogService][Info][CompleteInterview {"InterviewId":"9bc8c1b7-4f34-4036-98f1-1779ced6ecdd","InterviewKey":"38-55-49-41","Type":"CompleteInterview"}]
17:17:18[EnumeratorAuditLogService][Info][CloseInterview {"InterviewId":"9bc8c1b7-4f34-4036-98f1-1779ced6ecdd","InterviewKey":"38-55-49-41","Type":"CloseInterview"}]
17:17:23[EnumeratorAuditLogService][Info][SynchronizationStarted {"SynchronizationType":"Online","Type":"SynchronizationStarted"}]
17:17:47[InterviewerUploadInterviews][Error][Failed to synchronize interview]WB.Core.SharedKernels.Enumerator.Implementation.Services.SynchronizationException: Votre login ou mot de passe n'est pas correct ---> WB.Core.Infrastructure.HttpServices.HttpClient.RestException: Forbidden ---> WB.Core.Infrastructure.HttpServices.HttpClient.ExtendedMessageHandlerException: Request POST https://XXXXXXXXXXXXXXXX/api/interviewer/v3/interviews/c492b99d-1cd4-4399-b9b1-004032bdb3e0/image failed with status code 403 (Forbidden).
  at WB.Core.Infrastructure.HttpServices.HttpClient.ExtendedMessageHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) [0x0011e] in <00839cd14742427b8ae84a7d9082ef48>:0 
  at System.Net.Http.HttpClient.FinishSendAsyncUnbuffered (System.Threading.Tasks.Task`1[TResult] sendTask, System.Net.Http.HttpRequestMessage request, System.Threading.CancellationTokenSource cts, System.Boolean disposeCts) [0x000b3] in <3ab97adc469048029b91f4dbf8384ccb>:0 
  at WB.Core.Infrastructure.HttpServices.Services.RestService.ExecuteRequestAsync (System.String url, System.Net.Http.HttpMethod method, System.Object queryString, System.Net.Http.HttpContent httpContent, WB.Core.Infrastructure.HttpServices.HttpClient.RestCredentials credentials, System.Boolean forceNoCache, System.Collections.Generic.Dictionary`2[TKey,TValue] customHeaders, System.Nullable`1[T] userCancellationToken) [0x003f7] in <00839cd14742427b8ae84a7d9082ef48>:0 
   --- End of inner exception stack trace ---
  at WB.Core.Infrastructure.HttpServices.Services.RestService.ExecuteRequestAsync (System.String url, System.Net.Http.HttpMethod method, System.Object queryString, System.Net.Http.HttpContent httpContent, WB.Core.Infrastructure.HttpServices.HttpClient.RestCredentials credentials, System.Boolean forceNoCache, System.Collections.Generic.Dictionary`2[TKey,TValue] customHeaders, System.Nullable`1[T] userCancellationToken) [0x005cc] in <00839cd14742427b8ae84a7d9082ef48>:0 
  at WB.Core.SharedKernels.Enumerator.Implementation.Services.EnumeratorSynchronizationService.TryGetRestResponseOrThrowAsync (System.Func`1[TResult] restRequestTask) [0x00080] in <58aad36136cc47d0892eecec2cd4dcfa>:0 
   --- End of inner exception stack trace ---
  at WB.Core.SharedKernels.Enumerator.Implementation.Services.EnumeratorSynchronizationService.TryGetRestResponseOrThrowAsync (System.Func`1[TResult] restRequestTask) [0x0008e] in <58aad36136cc47d0892eecec2cd4dcfa>:0 
  at WB.Core.SharedKernels.Enumerator.Implementation.Services.Synchronization.Steps.UploadInterviews.UploadImagesByInterviewAsync (WB.Core.SharedKernels.Enumerator.Views.InterviewView interview, WB.Core.SharedKernels.DataCollection.WebApi.InterviewUploadState uploadState, System.IProgress`1[T] progress, System.Threading.CancellationToken cancellationToken) [0x00266] in <58aad36136cc47d0892eecec2cd4dcfa>:0 
  at WB.Core.SharedKernels.Enumerator.Implementation.Services.Synchronization.Steps.UploadInterviews.ExecuteAsync () [0x003bb] in <58aad36136cc47d0892eecec2cd4dcfa>:0 |MoveNextRunner.InvokeMoveNext => <ExecuteAsync>d__8.MoveNext => NLogLogger.Error

The first [Error][Failed to synchronize interview] line states Votre login ou mot de passe n'est pas correct, which translates to “your login or password is not correct.” There is also another portion of the line which states image failed with status code 403 (Forbidden).
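For anyone scanning a longer trace log for the same pattern, here is a minimal sketch that pulls the failed request, its URL, and its status code out of each error line. The line layout is inferred only from the excerpt above, not from a documented log format, and the function name `failed_requests` is my own.

```python
import re

# Layout inferred from the excerpt above: HH:MM:SS[source][Error][message]...
ERROR_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\[(?P<source>[^\]]+)\]\[Error\]\[(?P<message>[^\]]+)\]"
)
# The part of the exception text that names the failed HTTP request.
REQUEST_RE = re.compile(
    r"Request (?P<method>[A-Z]+) (?P<url>\S+) failed with status code (?P<status>\d{3})"
)

def failed_requests(lines):
    """Yield one dict per error line that names a failed HTTP request."""
    for line in lines:
        head = ERROR_RE.match(line)
        if not head:
            continue  # not an [Error] entry (or a stack-trace continuation line)
        req = REQUEST_RE.search(line)
        if not req:
            continue  # an error that does not mention an HTTP request
        yield {
            "time": head["time"],
            "source": head["source"],
            "method": req["method"],
            "url": req["url"],
            "status": int(req["status"]),
        }
```

Run over the whole log, this shows whether the 403 responses cluster on one endpoint (here, the `/image` upload) or are spread across every request.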

My naïve interpretation is that this error is due either to a password problem or to a problem with a photograph, which is collected as part of the interview. I am confused by these errors, since the same interviewer had successfully synced other completed interviews and we have received pictures for other completed assignments.

Could you confirm that I have correctly identified the synchronization error in the trace log and advise me on any next steps we can take to resolve this synchronization problem? If my interpretation of the error is incorrect, what should I look for, and would it help to see more of the trace log?

Many thanks,
Anthony

Or some other program/device has responded, not Survey Solutions. See procedure C1 here:

Hi Sergiy,

I implemented procedure C1 from my Admin account for our survey and it is showing “Kestrel” as the server. (I accessed this in Chrome by using right-click, inspect).
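In case it helps others repeat this check outside the browser, below is a rough sketch that reads the `Server` response header with the Python standard library. The function names are my own, and the result only reflects the route from the machine running the script, which may differ from the route a tablet takes.

```python
import urllib.error
import urllib.request

def fetch_server_header(url, timeout=10):
    """Return the Server response header for url, or None if it is absent."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.headers.get("Server")
    except urllib.error.HTTPError as error:
        # Error responses (including a 403) still carry headers worth inspecting.
        return error.headers.get("Server")

def looks_like_kestrel(server_header):
    """True if the header value names Kestrel, the ASP.NET Core web server."""
    return server_header is not None and "kestrel" in server_header.lower()
```

Calling `looks_like_kestrel(fetch_server_header(url))` with the server address gives a quick yes/no; a different header value on a failing request would suggest that something other than the application answered it.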

Should I instead be asking our data collector to get this information directly from the interviewer's tablet?

Thank you @aharris_mpr .

The response looks proper. But I still suspect the 403 was issued by something else. Survey Solutions would not let synchronization get this far if the password were incorrect, so it is more likely that a particular request was intercepted, and that the interceptor applies more complicated logic than simply blocking everything.

I’ve seen the same 403 in the context of rate-control, where someone has overprotected their server by imposing limits on the number of queries that can be submitted from a particular IP address per each 5-minute period. That can very well derail the synchronization with the symptoms you describe.
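To illustrate how such a limit can trip in the middle of a synchronization, here is a minimal sketch of a per-IP sliding-window counter. The limit value and the function name are hypothetical; a real appliance may count and block differently.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)

def first_rejected(timestamps, limit, window=WINDOW):
    """Return the first timestamp that would exceed `limit` requests per window, or None.

    `timestamps` must be sorted ascending and come from a single IP address.
    """
    recent = deque()
    for ts in timestamps:
        # Drop requests that have aged out of the window ending at ts.
        while recent and ts - recent[0] >= window:
            recent.popleft()
        if len(recent) >= limit:
            return ts  # this request would be the first one refused
        recent.append(ts)
    return None
```

A tablet uploading many interviews with images sends a burst of requests in a short time, so the first requests pass and later ones are refused, which matches a sync that fails partway through.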

Please check whether there is any load balancer or similar appliance (physical or virtual) that can intercept requests. Once you have identified it, note the time at which you attempt to synchronize and inspect its log for that time to see what is being blocked and why.

Best, Sergiy

Hello @sergiy,

Thank you for this suggestion, which led us to a solution.

This was the source of our problem: We are hosting the Survey Solutions application on AWS and the IP addresses for the interviewer tablets were being blocked by the AWS WAF (firewall, I think). Our IT department’s solution was to adjust the Web ACL settings in AWS WAF to make them less restrictive.

After making these adjustments the tablets sync again, except for one case. In this case, the user had 2 complete assignments on their tablet, but only one completed assignment synced.

In reviewing the AWS WAF log, it seems the request was blocked due to the size of the request body being larger than 8KB, combined with another rule that “Inspects for the presence of Local File Inclusion (LFI) exploits in the request body. Examples include path traversal attempts using techniques like ../../” Is there anything in the structure of a completed assignment that would generate an error like this?

Best,
Anthony

Thank you for providing this additional information. Is there anything notably different between these 2 assignments and the interviews that resulted from them?

I believe the GenericLFI_BODY rule works on its own: Baseline rule groups - AWS WAF, AWS Firewall Manager, and AWS Shield Advanced

To avoid guessing, follow up on “…the request was blocked…” and find out which specific request was blocked. The log files of the Survey Solutions app will show our side of the story, and the WAF logs will reflect what was actually blocked.
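As a starting point for the WAF side, here is a small sketch that filters exported WAF log records down to the blocked requests around a known sync time. It assumes one JSON record per line with the fields `timestamp` (epoch milliseconds), `action`, `terminatingRuleId`, and `httpRequest`; please verify these names against your actual export, and treat the function name as illustrative.

```python
import json
from datetime import datetime, timedelta, timezone

def blocked_requests(log_lines, around, tolerance=timedelta(minutes=5)):
    """Yield blocked requests logged within `tolerance` of `around` (timezone-aware UTC)."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("action") != "BLOCK":
            continue
        # Assumption: timestamps are epoch milliseconds.
        when = datetime.fromtimestamp(record["timestamp"] / 1000, tz=timezone.utc)
        if abs(when - around) > tolerance:
            continue
        request = record.get("httpRequest", {})
        yield {
            "time": when,
            "rule": record.get("terminatingRuleId"),
            "client_ip": request.get("clientIp"),
            "method": request.get("httpMethod"),
            "uri": request.get("uri"),
        }
```

Matching the `uri` of each blocked record against the failed request URL in the tablet's trace log then tells you which rule stopped which upload.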

Best, Sergiy

I just received a message from the data collection team to say that the last interview synced. We had asked them to try syncing only once per day, which may have helped.

I will update this post with any information they provide us that suggests a difference between the two cases. We collect photographs under certain conditions, which may have been the difference here. However, this is just speculation.

In case it’s useful to others, we found it helpful to ask the interviewer to try synchronizing at a pre-agreed time so that our IT team could immediately go into the AWS WAF logs and find the blocked request.

Best,
Anthony

To clarify: restricting when interviewers synchronize is advised here only for troubleshooting the issue described above. Survey Solutions itself allows interviewers to synchronize at any time they find convenient.

To avoid having to schedule synchronization at particular times for particular interviewers while troubleshooting this issue, the tablet clocks should be matched to the server clock. Interviewers should then note the date and time of each synchronization attempt and report it to the IT team.
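Where a tablet clock cannot be matched exactly, the known difference can be applied to the reported time instead. The sketch below is only an illustration of that arithmetic; the function names and the two-minute margin are assumptions, not part of Survey Solutions.

```python
from datetime import datetime, timedelta

def to_server_time(tablet_time, tablet_now, server_now):
    """Convert a time noted on the tablet to server time.

    `tablet_now` and `server_now` are readings of the two clocks taken at the same moment.
    """
    offset = server_now - tablet_now
    return tablet_time + offset

def search_window(server_time, margin=timedelta(minutes=2)):
    """Return the (start, end) range of server log times worth inspecting."""
    return server_time - margin, server_time + margin
```

The IT team can then search the server and WAF logs inside that window rather than waiting for a pre-agreed sync.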