Question type to upload file

Proposal

New question type to upload external files. Ideally, arbitrary files of arbitrary type and size. At a minimum, upload files with gpx extension. In effect, a question type that mirrors ODK’s file upload widget.

If loading arbitrary file types is possible, allow the Designer user to optionally specify the file extensions or names allowed (e.g., .gpx extension, files ending in "[0-9]{3,6}\\.gpx", etc.). This could be implemented either as a filter on the local file system or as a validation.
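To illustrate the validation variant, here is a minimal sketch in Python. It is not how Survey Solutions implements anything: the function name `is_allowed`, the idea of passing a list of regular expressions, and the case-insensitive matching are all assumptions made for the example.

```python
import re


def is_allowed(file_name, allowed_patterns=None):
    """Return True if file_name ends with at least one allowed pattern.

    allowed_patterns is a hypothetical Designer setting: a list of regular
    expressions. None or an empty list means "no restriction".
    """
    if not allowed_patterns:
        return True
    # Anchor each pattern at the end of the name ("files ending in ...").
    # Ignoring case is an assumption, so that TRACK.GPX also passes.
    return any(
        re.search(f"(?:{pattern})$", file_name, flags=re.IGNORECASE)
        for pattern in allowed_patterns
    )
```

With `allowed_patterns=[r"[0-9]{3,6}\.gpx"]`, a name like `plot_00123.gpx` would be accepted, while `plot_12.gpx` or `notes.txt` would be rejected.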

Affected subsystems

This will likely affect all subsystems, since the proposal involves creation of a new question type.

Users

Survey Solutions offers a comprehensive set of question types for general use. But increasingly, users need to capture data that Survey Solutions cannot capture natively, most often data collected via direct measurement with external instruments or specialized sensors (e.g., measuring area with handheld GPS, weight with scales, etc.).

In an ideal world, Survey Solutions would offer question types for all of these niche needs. In a more realistic world, Survey Solutions would provide a mechanism to capture and store the outputs of external measurement instruments. The most universal means would be to upload arbitrary files produced by those external instruments.

While this proposal likely has general utility, I see one set of users that would benefit greatly: agricultural surveys that measure plot area with handheld GPS devices and produce shapefiles of plot outlines from this exercise (e.g., LSMS, AGRIS, 50x2030, annual agricultural surveys, etc.). Currently, such surveys maintain separate systems for syncing plot shapefiles (e.g., OneDrive, Google Drive, DropBox, FTP, etc.) and for confirming that files are collected where needed (e.g., scripts to extract plot IDs from file names and check that each plot has a shapefile).

Surveys

A few surveys currently underway:

  • AGRIS Senegal
  • Feed the Future Baseline Survey in Senegal

A few surveys on the immediate horizon:

  • Nigeria General Household Survey
  • National Living Standards Survey
  • Malawi Integrated Household Survey

Alternatives

The current workaround is:

  • Create a system for syncing files (e.g., OneDrive, Google Drive, DropBox, FTP, etc.)
  • Have supervisors collect files from GPS devices, post them to cloud folders, and sync

To ensure that a plot outline is captured and properly named for each plot:

  • Interviewers follow a file name format that contains a key for linking the file to a household/plot
  • Headquarters uses a script/manual checking to:
    • Extract the key from the file name
    • Confirm that each plot can be matched to a shapefile and that each shapefile can be matched to a plot
  • Headquarters manually resolves any linkage problems by going back to field teams and/or making changes in file names
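For readers who have not seen such a headquarters script, the checking steps above can be sketched roughly as follows in Python. The file name format (`<household>_<plot>.gpx`, e.g. `HH0421_P02.gpx`), the function names, and the report keys are all hypothetical; each survey defines its own convention.

```python
import re

# Hypothetical naming convention: <household id>_<plot id>.gpx
FILE_NAME_FORMAT = re.compile(
    r"^(?P<household>HH\d+)_(?P<plot>P\d+)\.gpx$", re.IGNORECASE
)


def extract_key(file_name):
    """Return (household, plot) parsed from the file name, or None."""
    match = FILE_NAME_FORMAT.match(file_name)
    if match is None:
        return None
    return (match.group("household").upper(), match.group("plot").upper())


def check_linkage(expected_plots, file_names):
    """Compare plots listed in interviews with the files actually received."""
    keys_by_file = {name: extract_key(name) for name in file_names}
    found = {key for key in keys_by_file.values() if key is not None}
    expected = set(expected_plots)
    return {
        # Files whose names do not follow the convention at all
        "unparseable_files": sorted(
            name for name, key in keys_by_file.items() if key is None
        ),
        # Plots reported in interviews with no matching file
        "plots_without_file": sorted(expected - found),
        # Well-named files that match no plot in the interviews
        "files_without_plot": sorted(found - expected),
    }
```

Every non-empty list in the report is a case that headquarters would have to resolve by hand with the field teams.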

Needless to say, this is a messy and burdensome process.

Market examples

Compatibility

This will only impact users who elect to use this question type going forward.

Limits

The development team may want to introduce limits on the size or type of file to be uploaded.

Good idea Arthur! This would have been very useful (and time saving) in a couple of surveys.

Yes, good idea indeed.

A couple of years ago a survey was run in Bulgaria where this would have been useful for digitizing archive materials. In practice, this was handled in a way similar to what your workaround describes: interviewers uploaded files to a shared folder and included the link in a text question of the interview.

Also, there was recently a web survey where PDF documents needed to be collected, which should be a common use for attachments in web surveys. Such a feature would have been useful, since it turned out that not all the documents were already published on the web and could be referred to by links. As a workaround, it was suggested that respondents mention in the interview that a document attachment was available; they were then asked to send the documents as attachments to a specially designated mailbox, from which the team conducting the survey later retrieved them. This works reasonably well when the number of respondents is low and even fewer of them actually need to send attachments.

On the other hand, the above-mentioned scenario of uploading GPS measurements is puzzling and sounds like a complication. Why not make the GPS measurements with the tablet itself? Why involve an external GPS?

The reasons for using handheld GPS devices instead of Survey Solutions’ geography question are severalfold:

  1. Survey protocol calls for walking the perimeter of the plot. For the moment, the use of handheld GPS devices is the only measurement method that follows this protocol.
  2. Concerns that interviewers may trace plot outlines without actually visiting the plots or practically identifying the boundaries. For the overwhelming majority of plots that can be reached, this is a point of concern. For plots that are too far/too difficult to be reached, this feature of the geography question is actually a potential advantage, allowing an estimate of remote plots where non-response would otherwise prevail.
  3. Concerns about ability of interviewer/respondent to identify boundaries from satellite imagery. Without practical experience–and particularly experience in a given context–it is unclear whether interviewers and respondents, working together, can actually successfully identify plot boundaries. To the extent that interviewers and respondents are familiar with a map view of their surroundings, this might work. To the extent that either party is not familiar/comfortable with this view of the world, this might not work reliably. But this is ultimately an empirical question that requires testing, if not experimentation.
  4. Concerns about recency of imagery. If imagery is outdated, plots may not be visible (if former forest recently cleared for agriculture) or their boundaries may be wrong (if plot boundaries recently expanded or boundaries extend into river basin during dry season but imagery during wet season).

Wouldn’t that be the feature that is really required?? (start and accumulate GPS locations until stopped)

If Survey Solutions could fully replicate the behavior and precision of a survey-grade GPS device, yes, for the agricultural survey use case. But the tablet/Survey Solutions fall short of this need. Consumer-grade tablets do not have onboard GPS that is as accurate as a handheld GPS (and using a Bluetooth-enabled GPS device as the location service is possible but untested). Survey Solutions does not allow the accumulation of GPS locations at a fixed interval until the user indicates to stop.

But I still think this question type is useful beyond the agricultural survey use case.

A general comment about a built-in feature vs. ‘blind’ file upload: if you need to make sense of, validate, or use data from the answer (for example, calculate plot size, or check that it is indeed in the location that you expect), then the interview process somehow needs to ‘know’ how to interpret the uploaded contents, so you may still need to implement some kind of custom reader for each type of input. But if it’s simply for collecting random files (scanned documents, PDF files, etc.), then a general file upload may suffice.
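To make the "custom reader" point concrete, here is a rough Python sketch of what a reader for one input type might look like: it pulls track points out of a GPX 1.1 file and estimates the enclosed area. The function names are made up for this example, and the area calculation is a simplification (a local flat-earth projection plus the shoelace formula) that is only reasonable for small plots; it is not a description of any existing Survey Solutions feature.

```python
import math
import xml.etree.ElementTree as ET

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}  # GPX 1.1 namespace
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def read_track_points(gpx_text):
    """Return [(lat, lon), ...] in degrees for every <trkpt> in the GPX text."""
    root = ET.fromstring(gpx_text)
    return [
        (float(point.get("lat")), float(point.get("lon")))
        for point in root.iterfind(".//gpx:trkpt", GPX_NS)
    ]


def polygon_area_m2(points):
    """Approximate area enclosed by the points, in square meters."""
    if len(points) < 3:
        return 0.0
    # Project to a local plane around the mean latitude (small-plot assumption)
    lat0 = math.radians(sum(lat for lat, _ in points) / len(points))
    xy = [
        (
            EARTH_RADIUS_M * math.radians(lon) * math.cos(lat0),
            EARTH_RADIUS_M * math.radians(lat),
        )
        for lat, lon in points
    ]
    # Shoelace formula over the closed ring (last point joins the first)
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(xy, xy[1:] + xy[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0
```

A validation could then compare `polygon_area_m2(read_track_points(text))` against a plausible range, which is exactly the kind of format-specific logic a ‘blind’ upload would not provide.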

Using the tablet GPS for area measurements, e.g. in Agricultural Censuses, is not an option. The accuracy is far below that of Garmin handhelds with WAAS.
There is also an increasing number of surveys collecting data from other sensors, like temperature and humidity sensors installed in dwellings in African countries to measure the impact of global climate change.
I definitely second @arthurshaw2002’s request for collecting files as interview data.
The ability to acquire simple ASCII files would go a long way towards collecting sensor data like .GPX files.

Hi,
Actually, I need this type of question to upload files (e.g., resume, cover letter, certificate, …).

Do you have an idea of how to do it?

To my knowledge, there’s no way to do this without a new feature being added to Survey Solutions’ codebase.

That said, here’s a kludgy workaround that springs to mind. Your mileage may vary.

  • Take a picture of the file’s contents (e.g., open file on PC, take a picture of opened file with tablet)
  • Use some tool for extracting text from the image (e.g., Tesseract or bindings to it like in R

Thank you, I’ll try it