Export is uploading to OneDrive instead of default folder

Hi, I am running into an issue.

While trying to start an export job from the Python API, the destination shows as OneDrive. I have not even enabled cloud storage.

This is the syntax I tried:

job = ExportJob(questionnaire_id="fnewnoiwfnwiof$1", export_type="SPSS", interview_status="All", from_date=None, to_date=None, translation_id=None)
r = ExportApi(client).start(job, wait=True, show_progress=True)

Hello @yubraj ,

See here a change in the SSAW Python client that may be responsible for this behavior (change made about 4 months ago):

It looks to me like any export job you create with SSAW will default to a OneDrive destination, unless you change that explicitly.

You may want to confirm whether the problem vanishes after specifying storage_type=None in your job creation, and (if the export then succeeds as a download) raise an issue here to have that rectified.
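To illustrate why a non-None default on a job model can silently redirect every export, here is a minimal self-contained sketch. FakeExportJob is a hypothetical stand-in, not the actual ssaw class; it only mimics the suspected behavior:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for ssaw's ExportJob, illustrating the pitfall:
# a non-None default for storage_type redirects every export unless the
# caller overrides it explicitly.
@dataclass
class FakeExportJob:
    questionnaire_id: str
    export_type: str = "SPSS"
    storage_type: Optional[str] = "OneDrive"  # problematic default

job = FakeExportJob(questionnaire_id="abc$1")
print(job.storage_type)    # OneDrive -- even though the caller never asked for it

fixed = FakeExportJob(questionnaire_id="abc$1", storage_type=None)
print(fixed.storage_type)  # None -- server falls back to a direct download
```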

Best, Sergiy Radyakin

Thanks Sergiy,

storage_type=None worked. I tried export_path=None earlier, but storage_type never crossed my mind.

Just pushed out the new 0.9 version. It should fix this error, together with the other optional parameters that previously needed to be explicitly set to None. Hopefully not any more.


Thank you very much, @zurab !!

Could you please clarify (with regard to the underlined part in the screenshot below) how the from=... and to=... parameters will be handled (or ignored)?

For example, suppose an export data file with from=Monday and to=Tuesday is present (possibly generated manually or with other tools), and a file matching the underlined properties is requested on Friday through this ssaw code without any from= or to= and with the default (None) value for the limit_age parameter.

And in the case that several such matching exports are available, how will it decide which one to take (oldest, newest, in order of some GUID, random, etc.)? Note that multiple such files differing by their from-to specification may be stored simultaneously on the server (see second screenshot below).

Thank you!


Good question. As currently implemented in the code, the first available export that satisfies all filters (qid, status, and type) and then the cut-off date (age is converted to a date as now - age) will be selected. This logic expects the export list to be returned in new-to-old order. So if the order is anything different (which I haven't tested recently :() then the output may not be the most recent artifact.
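The selection logic described above can be sketched as follows. This is a hypothetical reconstruction, not the actual ssaw source; pick_export and the dictionary fields are made up for illustration:

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the selection logic described above: walk the export
# list in the order the server returned it and take the FIRST entry that
# matches all filters and is newer than now - age.
def pick_export(exports, qid, status, export_type, limit_age_minutes=None):
    cutoff = None
    if limit_age_minutes is not None:
        cutoff = datetime.now() - timedelta(minutes=limit_age_minutes)
    for item in exports:  # assumed new-to-old order
        if (item["questionnaire_id"] == qid
                and item["interview_status"] == status
                and item["export_type"] == export_type
                and (cutoff is None or item["created"] >= cutoff)):
            return item
    return None

exports = [
    {"questionnaire_id": "q$1", "interview_status": "All", "export_type": "SPSS",
     "created": datetime.now() - timedelta(days=1)},   # newest listed first
    {"questionnaire_id": "q$1", "interview_status": "All", "export_type": "SPSS",
     "created": datetime.now() - timedelta(days=4)},   # older file
]
print(pick_export(exports, "q$1", "All", "SPSS"))  # first matching (newest) entry
```

Note that if the server returned the same list in old-to-new order, the very first (oldest) match would win instead, which matches the caveat above.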

Hi @zurab ,

I updated ssaw to version 0.9. The previously required optional parameter issue has been resolved — thank you for that.

However, the wait=True issue during export generation still persists, as I have mentioned here

In version 0.8, it waited for approximately 20 seconds before proceeding. In version 0.9, it now waits around 300 seconds, but this is still not sufficient for larger projects like mine, where generating the export file takes approximately 15 minutes.

As a result, the script proceeds before the export file is fully generated, leading to the previous export being downloaded instead of the newly requested one.

ssaw 0.7, which I am currently using, doesn't have this problem.

There is a new timeout parameter for the start() method now. Indeed, the default is 300, but you can override it with as high a number as you need. There is a risk of some other error (in the client code, the server returning 500, etc.), and you don't want your code to hang indefinitely, so explicitly setting the timeout parameter would be a much cleaner approach.
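If you want to see what such a deadline-based wait looks like in general, here is a self-contained sketch. wait_until_done and poll_fn are hypothetical helpers, not part of the ssaw API; in real code, poll_fn would wrap a get_info() call:

```python
import time

def wait_until_done(poll_fn, timeout=300, interval=5):
    """Poll poll_fn() until a terminal status arrives or the deadline passes.

    poll_fn is any zero-argument callable returning the current export status
    string; this is a generic sketch, not part of the ssaw API.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = poll_fn()
        if status in ("Completed", "Fail", "Canceled"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"export did not finish within {timeout} seconds")

# Toy demonstration: the status flips to Completed on the third poll.
states = iter(["Running", "Running", "Completed"])
print(wait_until_done(lambda: next(states), timeout=10, interval=0))  # Completed
```

The explicit deadline means a stuck job raises TimeoutError instead of blocking your script forever, which is the point made above.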

As currently implemented, get(..., generate=True) doesn't take a timeout parameter to pass down to the start() call, so you'd explicitly call start() yourself and then export.get() to download the created archive.

I just thought about the last part of your comment ("As a result, the script proceeds before the export file is fully generated, leading to the previous export being downloaded instead of the newly requested one."). How are you getting the export file? Are you retaining the export job (the r variable in your code above) created/returned by the start() call, or are you just calling export.get() afresh? You should use the export job that got started; otherwise, a separate get() will just return any available (so an older) file. Can you share the part of your code that follows the start() call?

This is what I am doing now

job = ExportJob(questionnaire_id="5a99cf9de0e24e4f9da98af998129868$1", export_type="SPSS", interview_status="All")
r = ExportApi(client).start(job, wait=True, show_progress=True)

filename = ExportApi(client).get(export_type="SPSS", questionnaire_identity="5a99cf9de0e24e4f9da98af998129868$1", interview_status="All")

Basically, if the export takes longer than 300 seconds, then filename will download the previously generated file.

Ah, your filename = … line has no knowledge of the job you just started a few lines up; the export.get() call just requests an existing archive that satisfies the filters. If the above job is still running, then that code will return another, completed result from earlier… which you don't want, of course.
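A minimal self-contained illustration of this race (fake objects, not ssaw): a fresh get() only sees already-completed archives, so while the new job is still running it happily returns the stale one.

```python
# Fake server state: one older, finished archive and a job just started.
completed_archives = ["export_2024-01-01.zip"]   # hypothetical stale file
new_job = {"status": "Running"}                  # new export still generating

def fake_get():
    # Mimics a standalone export.get(): returns the first completed archive
    # and knows nothing about the job started above.
    return completed_archives[0] if completed_archives else None

print(fake_get())  # export_2024-01-01.zip -- the OLD file, not the new job's
```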

Quick solution options: since you know your export length by now, just add a large enough timeout parameter to the start() call so that it waits for completion.

Optionally, add a check/loop to confirm that the job you started has finished:

import time

job = ExportApi(client).get_info(r.job_id)
while job.export_status == "Running":
    time.sleep(5)
    job = ExportApi(client).get_info(job.job_id)

# To be extra sure, check the final status: the loop above may also exit
# if the job failed or got canceled.
if job.export_status != "Completed":
    # interrupt your script or otherwise process the error
    # instead of requesting a download
    raise RuntimeError(f"export ended with status {job.export_status}")

Depending on where you use this code, one approach or the other may be better. If you are writing some kind of separate app/dashboard, it would not be best to execute such a long-running process in the main thread and thus have your application frozen and waiting; instead, it would be better to run the export download in the background. But if you're OK with waiting, the approach above would work.
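A background run can be sketched with the standard threading module. run_export here is a hypothetical placeholder for the real start()/get() sequence:

```python
import threading

# Hypothetical sketch of running a long export in a background thread so the
# main thread stays responsive; run_export stands in for the start()+get()
# calls from ssaw and writes its result into a shared dict.
def run_export(result):
    # ... long-running start()/get() calls would go here ...
    result["filename"] = "export_q1_spss.zip"  # placeholder result

result = {}
worker = threading.Thread(target=run_export, args=(result,))
worker.start()
# ... the main thread keeps serving the UI / dashboard here ...
worker.join()          # or check worker.is_alive() periodically instead
print(result["filename"])
```

For a real app you would likely poll worker.is_alive() from your event loop rather than blocking on join(), so the UI never freezes.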

In the next update I will expose the timeout parameter in the exportapi.get() call as well, so instead of running a separate start() and then get(), you can use filename = get(…, generate=True, timeout=large_number).


Thank you Zurab, that was helpful.

In the next update, could you please also fix this issue:

It seems like AssignmentsApi(client).get_list() now gives accurate assignment info in version 0.9, but UsersApi(client).get_list() is still returning duplicated usernames.

Thank you.