Is there a way to download a dataset in wide format instead of multiple rosters?

Hi team,
Apparently SurveyCTO lets you download a cleaning do-file that transforms your downloaded data into a single wide dataset.

This is very useful for agricultural data, which have many layers of information (plot level, season level, crop level, etc.). Does SS also have such a feature, producing a wide-format dataset ready to download?

Best,
Claire

Dear Claire, this is a very interesting feature. Could you please show an example of the do-file generated by SurveyCTO for two or three real-world questionnaires that you have in Survey Solutions? Thank you, Sergiy

Hi Sergiy,
Since my questionnaires are programmed in SS and not SurveyCTO, I only have an example of a cleaning do-file generated by SurveyCTO from a questionnaire I don’t have. If you are still interested, how can I share it with you?

You can certainly post it here, but it is not very helpful without knowing what the questionnaire was and which actions were required to produce the file.

  1. Take the following example:

  2. Create a corresponding questionnaire template in SurveyCTO.

  3. Export the data.

  4. What is the structure of the output file? What is the additional script that SurveyCTO has generated? What can be/has to be configured by the user in this process?

Let’s also put some bounds on our data structure:

  • N persons in HH: 25
  • N job contracts per person: 10
  • N offices per contract: 3
  • N plots per HH: 25
  • N crops per plot: 5

Crops are identified by 9-digit IDs, for example POTATO=100122700, while plots, persons, contracts and offices are numbered sequentially/automatically.

Expect 5…500 attributes at every level of data (persons, plots, etc).
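
To get a feel for what these bounds would mean for a single wide file, here is a rough back-of-the-envelope count in Python. It is only a sketch: it assumes that a household-level wide row needs one column per attribute for every possible roster slot, that contracts nest within persons, offices within contracts, and crops within plots, and it leaves out household-level attributes. All names are illustrative.

```python
# Upper bounds taken from the list above.
MAX_PERSONS = 25
MAX_CONTRACTS_PER_PERSON = 10
MAX_OFFICES_PER_CONTRACT = 3
MAX_PLOTS = 25
MAX_CROPS_PER_PLOT = 5

# Maximum number of roster "slots" per household at each level
# (assumed nesting: person > contract > office, plot > crop).
slots = {
    "person": MAX_PERSONS,
    "contract": MAX_PERSONS * MAX_CONTRACTS_PER_PERSON,
    "office": MAX_PERSONS * MAX_CONTRACTS_PER_PERSON * MAX_OFFICES_PER_CONTRACT,
    "plot": MAX_PLOTS,
    "crop": MAX_PLOTS * MAX_CROPS_PER_PLOT,
}

def wide_columns(attributes_per_level):
    """Columns needed if every slot carries the same number of attributes (assumption)."""
    return sum(slots.values()) * attributes_per_level

total_slots = sum(slots.values())
print(total_slots)        # 1175
print(wide_columns(5))    # 5875
print(wide_columns(500))  # 587500
```

Under these assumptions a single wide row would need somewhere between about 5,900 and 590,000 columns, many of which would be empty for households below the maximum sizes.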

Hi Sergiy,

I’d like to contribute to this, but unfortunately I have never used SurveyCTO.
I just heard of this possibility from a user and was curious to know whether SS offers the same.

Maybe other SurveyCTO users could explain this better and give you concrete examples. The do-file I have is 4,000 lines of code, and since I don’t have the corresponding questionnaire, it might not help much, except perhaps to give an idea.

Best

Just curious: why is wide format desirable? How does it make data checking easier?

Do you want to create summary information for nested levels?

Example: compare plot planting pattern with number of crops planted on the plot

  • Open crop file
  • Create a plot-level file with the count of crops per plot
  • Merge plot file
  • Check whether reported cropping pattern (e.g., purestand) matches with number of crops (e.g., 2)
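
The steps above can be sketched in a few lines of plain Python, working directly on the long-format rosters. This is a minimal illustration, not a recommended workflow: the variable names (`hh_id`, `plot_id`, `crop_id`, `pattern`), the toy values, and the rule that a pure stand means exactly one crop are all assumptions made for the example.

```python
from collections import Counter

# Toy crop-level roster: one row per (household, plot, crop). Values are made up.
crops = [
    {"hh_id": 1, "plot_id": 1, "crop_id": 100122700},
    {"hh_id": 1, "plot_id": 2, "crop_id": 100122700},
    {"hh_id": 1, "plot_id": 2, "crop_id": 100999999},  # hypothetical second crop ID
]

# Toy plot-level roster with the reported cropping pattern.
plots = [
    {"hh_id": 1, "plot_id": 1, "pattern": "purestand"},
    {"hh_id": 1, "plot_id": 2, "pattern": "purestand"},
]

# Open the crop file and collapse it to a count of crops per plot.
crops_per_plot = Counter((row["hh_id"], row["plot_id"]) for row in crops)

# Merge the counts onto the plot file and flag inconsistent plots.
flagged = []
for plot in plots:
    key = (plot["hh_id"], plot["plot_id"])
    n_crops = crops_per_plot.get(key, 0)
    # Assumed rule: a plot reported as pure stand should have exactly one crop.
    if plot["pattern"] == "purestand" and n_crops != 1:
        flagged.append((key, n_crops))

print(flagged)  # [((1, 2), 2)]
```

Here plot 2 of household 1 is flagged because it is reported as a pure stand but has two crops listed.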

A wide format would let me keep all my data at a single level of observation (household), so I could run data quality check code on one dataset. I have in mind the IPA check do-files, which take a single dataset as input. That way I would not have to reshape every roster dataset coming out of SS to the HH level.

There is no configuration for the export format, but you can easily transform the data in your statistical package. In Stata, for example, the relevant command is reshape.
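
For readers without Stata, the same long-to-wide idea can be sketched in plain Python. This is an illustration under assumptions: the variable names (`hh_id`, `person_id`, `age`, `sex`), the toy values, and the helper `reshape_wide` are made up for the example. In Stata, the equivalent for such a file would be something along the lines of `reshape wide age sex, i(hh_id) j(person_id)`.

```python
# Toy long-format person roster: one row per household member (made-up values).
long_rows = [
    {"hh_id": 1, "person_id": 1, "age": 40, "sex": "F"},
    {"hh_id": 1, "person_id": 2, "age": 12, "sex": "M"},
    {"hh_id": 2, "person_id": 1, "age": 67, "sex": "M"},
]

def reshape_wide(rows, i, j):
    """Return one dict per value of `i`, with columns named <variable>_<j value>."""
    wide = {}
    for row in rows:
        # Start a new wide row the first time this household is seen.
        out = wide.setdefault(row[i], {i: row[i]})
        for name, value in row.items():
            if name not in (i, j):
                out[f"{name}_{row[j]}"] = value
    return list(wide.values())

wide_rows = reshape_wide(long_rows, i="hh_id", j="person_id")
for row in wide_rows:
    print(row)
# {'hh_id': 1, 'age_1': 40, 'sex_1': 'F', 'age_2': 12, 'sex_2': 'M'}
# {'hh_id': 2, 'age_1': 67, 'sex_1': 'M'}
```

Note that household 2 simply has no `age_2` or `sex_2` columns here; a statistical package would instead fill them with missing values.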

For the moment, Survey Solutions exports data in long format only. To have data in wide format, the end user would need to do some data manipulation.

That said, it would be good to hear:

  • Whether other users have similar needs/desires
  • What frequently used tools require wide format data as an input. To that end, could you please point us to the IPA check do files that you mention?