Error using API interface

Hello,

I have been having trouble downloading data from the server through the API since the release of the new version. The following code worked fine before the release:

query <- sprintf("%s/api/v1/export/%s/%s/%s", hq_address,
ex_type, qqn_id, action)

#attempt 5 times to generate questionnaire
for (i in 1:5) {
data <- POST(query, authenticate(user, password)) #successful
Sys.sleep(1)
}

#wait 20 seconds before starting the download to let the data generate
Sys.sleep(20)

#check status of the download
message(paste("status of data generation:", data$status_code))
message(qqn_id)
#str(data$date) #date of last data creation

#export the data

query <- sprintf("%s/api/v1/export/%s/%s",
hq_address, ex_type, qqn_id)

data <- GET(query, authenticate(user, password))

Andres,

  1. what language is this?
  2. what is the code supposed to be doing?
  3. which error are you getting? after which command?
    Sergiy

Hello Sergiy,

  1. The language is R

  2. The code is supposed to download version 1 of the following questionnaire: “https://muva2.mysurvey.solutions/api/v1/export/tabular/ad3675f0-6f4e-4068-9ce9-1d1fa1de42ff$1”

2.1 I am first trying to POST a request to the server to generate the data:

POST(query, authenticate(user, password)), where query = “https://muva2.mysurvey.solutions/api/v1/export/tabular/ad3675f0-6f4e-4068-9ce9-1d1fa1de42ff$1/start”

  3. After POST, I am getting error 400. If POST were successful, I would then download the data using GET. However, I cannot reach this step at the moment.

Andres,
as per API documentation here
https://demo.mysurvey.solutions/apidocs/index#!/Export/Export_StartProcess
error 400 means the questionnaire ID is malformed.

As per documentation here:
https://support.mysurvey.solutions/headquarters/api/api-for-data-export/
the questionnaire ID (GUID) should be specified without the dashes.

Once you remove the dashes everything works fine. Nothing in the above was affected by the recent release, so while your code may have been working before, the inputs that you supplied to your code were probably correct before and wrong now.
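
As a minimal sketch of this fix, the dashes can be stripped in R before the URL is built. The helper name `strip_guid_dashes` is made up for illustration; the identifier is the one quoted earlier in this thread.

```r
# Hypothetical helper (not part of httr or of the Survey Solutions API):
# remove the dashes from a questionnaire identity of the form "<GUID>$<version>".
strip_guid_dashes <- function(qn_identity) {
  # fixed = TRUE treats "-" as a literal character, not a regular expression
  gsub("-", "", qn_identity, fixed = TRUE)
}

qqn_id <- strip_guid_dashes("ad3675f0-6f4e-4068-9ce9-1d1fa1de42ff$1")
print(qqn_id)
```

The cleaned value can then be passed to sprintf() in place of the original qqn_id.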

Best, Sergiy

Sergiy,

Thanks for your quick reply. I have removed the dashes from the questionnaire ID. As a result, we now get a successful 200 status when requesting re-creation of the export file. However, a 400 status is received when the download is attempted. We are wondering whether this has anything to do with the default encoding and whether we should supply an encoding other than UTF-8.

These are the steps and outputs when attempting the download protocol in R:

#1. requesting the re-creation of export file

query<- "https://muva2.mysurvey.solutions/api/v1/export/tabular/ad3675f06f4e40689ce91d1fa1de42ff$1/start" 
  data <- POST(query, authenticate(user, password)) 
>  data 
Response [https://muva2.mysurvey.solutions/api/v1/export/stata/ad3675f06f4e40689ce91d1fa1de42ff$1/start]
  Date: 2018-12-17 12:37
  Status: 200
  Content-Type: <unknown>
<EMPTY BODY>

> content(data)
raw(0)  

#2. downloading

  querydownload <- "https://muva2.mysurvey.solutions/api/v1/export/tabular/ad3675f06f4e40689ce91d1fa1de42ff$1"
  
 datadownload <- GET(querydownload, authenticate(user, password)) #successful
 > datadownload
Response [https://decdatabackup.s3.amazonaws.com/export/export/a77b7dca-b393-4caa-bf03-7df32cb2410c/ad3675f06f4e40689ce91d1fa1de42ff$1_Tabular_All.zip?AWSAccessKeyId=ASIA4KSZDRBKBCUN72YC&Expires=1545086496&response-content-disposition=attachment%3B%20filename%20%3D%22Aprender_1_Tabular_All.zip%22&x-amz-security-token=FQoGZXIvYXdzEA0aDDWnEfMNwG4djKAjZyK3A7k2fYKemx2yx5l61yokxSHaP9cbN4RN3FScBF28rT00Ezl5nVB%2FDkCWX8XXJlZTCMqeBZS3J7tTjMIch7r36RlXEuXw6SQWt%2BVDxJxFRN5rFgRz2ME4x5SzO9JC87pyTRYkn9gPpQJvyq0lii0uTeeWgNdeJIA1t2Rs7kcHL7muA0Q543u3ARdQXVBnzsAjaNJKUZ1TJeNuqGO7j3QHOIk3Hq%2FhBfbVIoZVvkF%2Fv1NFFhPEkaHIqaUQOFEltRFSnKVcXvXhfkGc6%2B5%2FObVi9lrwuThUu54AAUMFYdsGUgEqWbKFUnwrji%2FckjOk6Y0CN3Jz84oU77Iy7dHzavZEZS9oG6Gq%2FxRkOY4qbSvTtnQNe7ABZJVdiytN5imMCGlog8aMCnTeGabFP4RRt2%2B8QrkyFn4DvqXs4qj657wgM3CxNJv9%2B6Ogo5qeKlcksLOo86y1sSL2%2BpCrIrNrADy1ChB2xzzpzydGLgTb2Slwz1dmh6PmrbZ5Fdhd6d2mjbkJXWlUoQYCWmx%2Fydx%2FYNrp3cLf80aacpxHKXK6p1rQnzq1PWiMec96JYX2vrezFzNBuZw%2F1nst8Dwoi53e4AU%3D&Signature=KL%2BlCAFWUEeC%2B6eRaYhV%2FKMa4BY%3D]
  Date: 2018-12-17 12:41
  Status: 400
  Content-Type: application/xml
  Size: 499 B
<BINARY BODY>

#3. Inspect details of downloading status

  querydetails <- "https://muva2.mysurvey.solutions/api/v1/export/tabular/ad3675f06f4e40689ce91d1fa1de42ff$1/details"
  details <- GET(querydetails, authenticate(user, password)) #successful
  content(details)

> content(details)
$`HasExportedFile`
[1] TRUE

$LastUpdateDate
[1] "2018-12-17T12:19:27Z"

$ExportStatus
[1] "NotStarted"

$RunningProcess
NULL
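
Instead of a fixed Sys.sleep(), the /details endpoint used above could be polled until the export looks finished. The sketch below is only an assumption about how to read the response: the thread shows a single sample, so treating "HasExportedFile is TRUE and RunningProcess is NULL" as "ready" is a guess, and the names `export_ready` and `wait_for_export` are hypothetical.

```r
# Hypothetical helper: decide from the parsed /details response whether an
# export file exists and no export process is currently running.
# ASSUMPTION: the two fields can be interpreted this way.
export_ready <- function(details) {
  isTRUE(details$HasExportedFile) && is.null(details$RunningProcess)
}

# Hypothetical polling loop around the /details endpoint.
# httr is only needed when this function is actually called.
wait_for_export <- function(details_url, user, password,
                            tries = 10, pause = 5) {
  for (i in seq_len(tries)) {
    resp <- httr::GET(details_url, httr::authenticate(user, password))
    if (httr::status_code(resp) == 200 && export_ready(httr::content(resp))) {
      return(TRUE)
    }
    Sys.sleep(pause)  # wait before asking again
  }
  FALSE
}

# The sample response shown above, re-typed as an R list
sample_details <- list(
  HasExportedFile = TRUE,
  LastUpdateDate  = "2018-12-17T12:19:27Z",
  ExportStatus    = "NotStarted",
  RunningProcess  = NULL
)
print(export_ready(sample_details))
```

The loop returns FALSE after the last try rather than raising an error, so the caller decides how to handle an export that never finishes.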

@aarau,

The issue is that the URL you are hitting with get() doesn’t return the file but redirects to another location instead (notice the decdatabackup.s3.amazonaws.com … URL). This technical change was introduced some time ago, so I guess we should have documented it better.

URL redirects are a ‘regular’ part of the HTTP standard, and ‘most’ web clients (web browsers, curl.exe, etc.) can follow them seamlessly, but the get() function is a simple -copy- object procedure that doesn’t.

You can try to use getURL() instead and specify .opts=curlOptions(followlocation=TRUE) parameter for example.
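
To check whether a response came back from a redirected location, one rough approach is to compare the host of the requested URL with the host of the URL reported on the response. The sketch below uses base R only; `url_host` and `was_redirected` are hypothetical names, and the second URL is shortened to the host and first path segment seen in the output above.

```r
# Hypothetical helper: extract the host part of an http(s) URL with base R
url_host <- function(url) {
  tolower(sub("^https?://([^/:?#]+).*$", "\\1", url))
}

# Hypothetical helper: TRUE when the response was served by a different host
was_redirected <- function(request_url, response_url) {
  url_host(request_url) != url_host(response_url)
}

requested <- "https://muva2.mysurvey.solutions/api/v1/export/tabular/ad3675f06f4e40689ce91d1fa1de42ff$1"
answered  <- "https://decdatabackup.s3.amazonaws.com/export/"  # rest of the signed URL omitted
print(was_redirected(requested, answered))
```

This only detects redirects that change the host, which is the case described in this thread.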

Hi Zurab,

I am also having this issue. Would you be able to provide more details about how to use getURL instead of GET, potentially using Andres’ code as an example?

Thanks!

Thank you @Zurab! The solution is working fine. I am just wondering if there’s a more straightforward way to get the redirected URL. I am sharing my code so @peterplan can take a look, and also to see if someone can help me make it more efficient, i.e. get the redirected URL more elegantly.

This is my code:

#DOWNLOAD DATA FROM THE SURVEY SOLUTIONS SERVER USING THE REDIRECT URL

library(httr)
library(plyr); library(dplyr)
library(haven)
library (stringr)
library(lubridate)
library(leaflet)
library(readstata13)
library(foreign)
library(gtools)
library(tidyverse)
library(ggvis)
library(qwraps2)
library(jsonlite)


## SET PARAMETERS ------

hq_address <- "https://myserver.mysurvey.solutions"

ex_type <- "tabular" #format of your data

user <- "mysuser"  #API user in server
password <- "mypassword" #API password in server 

#path and folder where the .zip file will be stored
download_folder <- file.path("C:\\Users\\aarau\\Dropbox (OPML)\\01. Andres\\02. Tutorials\\R")
tounzip <- "mydata.zip" 


#I am going to check the connection to the server and the questionnaires we have there: ----
#1 check list of questionnaires in server and their IDs  (SUCCESSFUL!) 

queryInfo <- sprintf("%s/api/v1/questionnaires",hq_address) #query to get the info


InfoServer <- GET(queryInfo, authenticate(user, password)) #successful
http_type(InfoServer) #check format of content (it is in Json)
fromJSON(content(InfoServer, as = "text")) #read content in text format



#extract elements from InfoServer (Title, Version, ID, etc)
ListOfQn.df <- content(InfoServer)$Questionnaires %>% bind_rows %>% select(Title, Version, QuestionnaireIdentity)


#Fetch data (ID) of the questionnaire you want to download (Version 2 of Aprender, in this case)
myquestionnare <- ListOfQn.df %>% filter(Title == "Aprender", Version == 2)
qn_Id <- myquestionnare$QuestionnaireIdentity

## Now I am going to request the server to generate the data -----
#2 requesting the generation of export file (successful!)

queryGenerate <- sprintf("%s/api/v1/export/%s/%s/start", hq_address,
                         ex_type,qn_Id)
Datagenerated <- POST(queryGenerate, authenticate(user, password))  #(Success)
Sys.sleep(30)

##Now I am going to actually download the data
#3 Download data


queryDownload <- sprintf("%s/api/v1/export/tabular/%s", hq_address, qn_Id)

#Attempt 1
dataDownload <- GET(queryDownload, authenticate(user, password), user_agent("andres.arau@opml.co.uk")) #fail 
Sys.sleep(30) # this download fails because URL is re-directed
          
#Attempt 2, Fetch using  redirected URL 

redirectURL <- dataDownload$url 

#HERE IS WHERE I WOULD LIKE TO GET THE REDIRECTED URL USING FEWER STEPS. IS IT POSSIBLE? (THIS CODE WORKS BUT I AM NOT SURE IF IT IS THE BEST WAY TO DO IT)

#Download the data using the redirected url
RawData <- GET(redirectURL) #Success!!


##Now I am going to save the .zip data in my computer (local file)-----
#open connection to write data in download folder

filecon <- file(file.path(download_folder, tounzip), "wb") 
#write data contents to download file!! change unzip folder to temporary file when in shiny
writeBin(RawData$content, filecon) 
#close the connection
close(filecon)

#Success!!!
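
Since a failed download earlier in this thread returned a small XML error body rather than a file, it may be worth checking the bytes before writing them to disk. A rough sketch, assuming the export is a regular .zip archive (ZIP files start with the two bytes "PK"); `looks_like_zip` is a hypothetical name.

```r
# Hypothetical check: does a raw vector start with the ZIP signature "PK"
# (bytes 0x50 0x4B)?
looks_like_zip <- function(bytes) {
  is.raw(bytes) && length(bytes) >= 2 &&
    identical(bytes[1:2], as.raw(c(0x50, 0x4b)))
}

zip_start <- as.raw(c(0x50, 0x4b, 0x03, 0x04))     # start of a ZIP local file header
xml_start <- charToRaw("<?xml version=\"1.0\"?>")  # start of an XML error body

print(looks_like_zip(zip_start))
print(looks_like_zip(xml_start))
```

In the script above, the check could be applied to RawData$content before the writeBin() call.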

Thanks @aarau! Appreciate you sharing your code.

Also, here are some SuSo download API tools for R that are a work in progress. I wrote some inelegant code (like above). @l2nguyen broke it down into a set of functions. Some functions cover discrete tasks, like fetching the list of questionnaire IDs. Others handle a chain of actions, like downloading data for all questionnaires that contain a sub-string or match a regex pattern.

No unit test coverage yet in the repo (e.g., check that API still serves what the functions expect). But a version of this code was working beautifully as recently as end of December 2018.

@peterplan @aarau I updated the code for those functions to handle the redirect now so it should work like previously. However, it does use the somewhat inelegant solution that you guys were using because a quick google did not turn up anything better. The solution that @arthurshaw2002 presents does look better.

@peterplan @aarau @l2nguyen

First, the TL;DR version: Andres’ and Lena’s two-step trick for following the redirect seems the best R solution for now, but there are some leads for a one-step solution.

Now the “too interested; kept reading” version.

Following @zurab's lead, I started researching whether any R interfaces for curl handle redirects well, including downloading the file on the other end of the redirect.

While I don’t (yet) have a final answer, here are a few leads:

  • httr allows the user to set curl options via config. These options include followlocation for following a redirect. But, as Zurab pointed out, this won’t work for GET(). And I couldn’t quickly find any httr interface for getURL methods.
  • RCurl offers a curl interface that is less user-friendly than httr but allows the user to get “closer to the metal” (see here). The getURL function seems promising. It is the R equivalent of the non-R tool of the same name, and it offers the followlocation option (among many other curl options). getBinaryURL is a convenience function that helps handle file downloads; it relies on getURL and thus gives access to the curl followlocation option. But after tinkering with it a bit, I couldn’t figure out how to make it work. Still, it seemed the most promising lead for an elegant solution, and one that may handle more general cases than SuSo’s single-redirect world.

@peterplan @aarau @l2nguyen

Over the weekend, I did some digging and experimenting. I found a few one-step solutions.

With httr

# -----------------------------------------------------------------------------
# solution with httr
# -----------------------------------------------------------------------------

library(httr)

downloadData <- GET(
	downloadEndpoint, 
	accept_json(), 
	authenticate(login, password),
	write_disk(paste0(downloadDir, "test_getHTTR.zip"), overwrite = TRUE),
	progress(),
	config( 					# curl configuration options
		followlocation = 1L, 		# follow redirects 
		unrestricted_auth = 0L 		# but do not pass authentication to redirects
	)
)

It turns out that httr has tons of curl configuration options available to users (read more here). Among those are two that, combined together, help us out:

  1. followlocation, when set to true, ensures that httr follows redirects
  2. unrestricted_auth, when set to false, makes sure that redirects do not get the initial authentication

The first option may not be needed. It seems like this is the httr default behavior. The second option, however, seems to be necessary. It seems like httr, when following redirects by default, tries to take the redirect address and add authentication to the url. But the redirect page for SuSo’s export endpoint does not expect authentication. Adding authentication is tantamount to getting the address wrong.

With RCurl

# -----------------------------------------------------------------------------
# solution with RCurl
# -----------------------------------------------------------------------------

library(RCurl)

# fetch the file, reading it into memory
downloadData <- getBinaryURL(
	downloadEndpoint, 
	userpw = paste0(login, ":", password), 
	.opts = list(followlocation = TRUE))

# write the file to disk
writeBin(downloadData, con = paste0(downloadDir, "test_getBinaryURL.zip")) 

Following Zurab’s suggestion, one can also use getBinaryURL, which is a convenience function for getURL. Two things are needed. First, one needs to specify the followlocation option as true, as with httr. Second, one needs to use writeBin to write the binary file to disk.

Which solution should you use? I prefer the httr solution for a few reasons. First, I think that httr handles writing the file to disk better than getBinaryURL. From what I understand, the getBinaryURL stores the full binary file in memory as a raw vector before writing it to disk in writeBin, while httr's write_disk option writes progressively (see more here). For small files, the difference is negligible. For larger files–for example, SuSo images or paradata–the difference may matter. Second, I find httr syntax more readable–perhaps because the former is tidyverse and the latter base R-esque. Third, in Hadley–and his more used and tested packages–I trust.

Just tacking on to this thread to say that I was having all of the same issues with utilizing the API in R as @aarau until stumbling upon @l2nguyen’s SuSoAPI Github repo.

First, thanks, Lena!

Second, I had particular trouble with two non-obvious details:

  1. the need to remove dashes in the questionnaire identity
  2. the utilization of curl options to follow the redirects in the GET() request

It would be great for the Survey Solutions API documentation to emphasize the dash-less questionnaire identifier, and for an R-specific page to exist in the official documentation to prevent folks from having to scroll through the forums to find detail #2.
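
For readers landing here, the two details can be combined in one short sketch. It assumes the endpoint shape and the httr options posted earlier in this thread; the function names `export_url` and `download_export` are hypothetical and not part of any package.

```r
# Hypothetical helper: build an export endpoint URL in the shape used in
# this thread, removing dashes from the questionnaire identity (detail 1).
export_url <- function(hq_address, ex_type, qn_identity, action = NULL) {
  qn_clean <- gsub("-", "", qn_identity, fixed = TRUE)
  url <- sprintf("%s/api/v1/export/%s/%s", hq_address, ex_type, qn_clean)
  if (!is.null(action)) url <- paste0(url, "/", action)  # e.g. "start", "details"
  url
}

# Hypothetical wrapper around the httr call posted earlier (detail 2).
# httr is only needed when this function is actually called.
download_export <- function(hq_address, ex_type, qn_identity,
                            user, password, dest_file) {
  resp <- httr::GET(
    export_url(hq_address, ex_type, qn_identity),
    httr::authenticate(user, password),
    httr::write_disk(dest_file, overwrite = TRUE),
    httr::config(
      followlocation = 1L,     # follow the redirect to the file location
      unrestricted_auth = 0L   # do not send credentials to the redirect target
    )
  )
  httr::stop_for_status(resp)  # raise an R error on a 4xx/5xx response
  invisible(dest_file)
}

start_url <- export_url("https://myserver.mysurvey.solutions", "tabular",
                        "ad3675f0-6f4e-4068-9ce9-1d1fa1de42ff$1", action = "start")
print(start_url)
```

Keeping the URL builder separate from the download call makes it possible to check the generated address without contacting the server.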

Thanks to all of you for your contributions to this excellent survey platform and community.


Good evening. I want to export the data, but it does not work; here is the error message. Please help me. I am not using the API, just doing a simple export. Thank you for your replies. "Destination: Export file will be available for download

An unexpected error occurred during export. We are sorry for inconvenience. Please contact support team support@mysurvey.solutions with the log files *.log that are stored in the application folder. Additional information to this issue is available at the support page: https://support.mysurvey.solutions/headquarters/config/standalone-server-errors/."
And when I check the status, this is the message I receive: