Getting export data using Python

Is there any way in Python to get all the collected data directly into a pandas DataFrame?
What I am doing now is generating the dataset and downloading it by running:

import ssaw
from ssaw import ExportApi

clientA = ssaw.Client(url='server website', api_user='expapi', api_password='password', workspace='asp')
filename = ExportApi(clientA).get(export_type="SPSS", questionnaire_identity="31e56faf-4d2b-4eb8-ab2f-698948720f98$1")

This code downloads the most recently generated dataset. After this I run:

import pandas as pd
df = pd.read_spss('filename.sav')  # pandas reader; requires the pyreadstat package

This way the latest generated file is loaded into a pandas DataFrame and I can do the analysis from there.
It would be much easier if I could get the collected data directly into a pandas DataFrame without having to generate and download the export file.

There is no ‘native’ file storage format for pandas, so when getting or storing data you still have to use either file-based storage (CSV, Stata, SPSS, SQLite, etc.) or a SQL database server (PostgreSQL, MySQL, etc.).

The df = pd.read_spss(...) command reads data from a file or remote storage and loads it into an in-memory dataframe structure. So, unfortunately, there is no ‘magic’ that loads the data into pandas directly.
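
To illustrate the point, here is a minimal sketch of the file-based step, assuming the downloaded export is a zip archive with one tab-delimited data file per questionnaire level. The helper names extract_data_files and load_frames are hypothetical and are not part of ssaw or pandas.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_data_files(zip_path, suffix=".tab", dest=None):
    """Unpack an export archive and return the paths of its data files."""
    # Assumption: the export is a zip archive; a temp folder is used by default
    dest = Path(dest) if dest else Path(tempfile.mkdtemp())
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Match the extension case-insensitively and keep a stable order
    return sorted(p for p in dest.rglob("*") if p.suffix.lower() == suffix.lower())


def load_frames(paths):
    """Read each tab-delimited file into a DataFrame keyed by file name."""
    import pandas as pd  # imported here so extraction works without pandas

    return {p.stem: pd.read_csv(p, sep="\t") for p in paths}
```

For an SPSS export the same idea should apply with suffix=".sav" and pd.read_spss in place of pd.read_csv. Either way the data passes through a file before it reaches the dataframe.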

Thanks. Also, I didn't find a way in the Python package to generate the dataset in the desired format, only to download the latest generated file, but I did find it in PowerShell:

$job = Initialize-SSAWExportJob -QuestionnaireId '31e56faf-4d2b-4eb8-ab2f-698948720f98$1' -ExportType "SPSS" -InterviewStatus "All"
Start-SSAWExportJob $job -Wait | Get-SSAWExportFile

Is there any code to generate the file from Python?

Take a look at vavalomi/surveysolutions_utils on GitHub for an example of how to do this.

Basically, you can start an export process if it doesn’t exist, wait for it to be completed and then download the file:

import time

from ssaw import ExportApi
from ssaw.models import ExportJob

args = {
    "questionnaire_identity": qidentity,
    "export_type": exporttype,
    "interview_status": interviewstatus,
}
response = ExportApi(client).get(**args, export_path=exportpath)

if not response:
    # there is no ready export, start a new job
    job = ExportJob(**args)
    job = ExportApi(client).start(job)

    # check the status every second until it is 'Completed'
    job = ExportApi(client).get_info(job.job_id)
    while job.export_status != "Completed":
        time.sleep(1)
        job = ExportApi(client).get_info(job.job_id)

    # the job was completed, try to download if file was created
    if job.has_export_file:
        response = ExportApi(client).get(**args, export_path=exportpath)
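
One caveat with the loop above: it keeps polling forever if the job never reaches "Completed". A minimal sketch of a bounded wait could look like the following. The helper wait_for_status is hypothetical, and the failure status names "Fail" and "Canceled" are assumptions that should be checked against your server's API.

```python
import time


def wait_for_status(get_status, done=("Completed",), failed=("Fail", "Canceled"),
                    interval=1.0, timeout=600.0):
    """Poll get_status() until it returns a done state; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            # Assumed failure states; stop instead of polling forever
            raise RuntimeError(f"export job ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"export job still {status!r} after {timeout} s")
        time.sleep(interval)
```

With the variables from the block above it could be called as wait_for_status(lambda: ExportApi(client).get_info(job.job_id).export_status).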

I am getting a "name 'qidentity' is not defined" error.

Of course, I gave you only a small block of the code. You should define the rest of the variables, like:

client = ...
qidentity = "31e56faf-4d2b-4eb8-ab2f-698948720f98$1"
exporttype = "SPSS"
interviewstatus = "All"

or put the strings directly into the args = {} part.

Yes, I did that:

args = {
    "3adfaf-gdfg5-fgfdfdhd": qidentity,
    "All": exporttype,
    "Complete": interviewstatus,
}

Python dictionaries have the following syntax:

name = {
 keyname1: value,
 keyname2: value,
 keyname3: value
}

so you should write:

args = {
    "questionnaire_identity": "31e56faf-4d2b-4eb8-ab2f-698948720f98$1",
    "export_type": "SPSS",
    "interview_status": "All"
}
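
The keys matter because **args turns each key into a keyword argument name. Here is a small self-contained sketch of that behaviour; describe_export is a hypothetical stand-in function, not part of ssaw.

```python
def describe_export(questionnaire_identity, export_type, interview_status="All"):
    """Hypothetical stand-in for a function that takes keyword arguments."""
    return f"{export_type} export of {questionnaire_identity} ({interview_status})"


args = {
    "questionnaire_identity": "31e56faf-4d2b-4eb8-ab2f-698948720f98$1",
    "export_type": "SPSS",
    "interview_status": "All",
}

# **args passes each key as a parameter name and each value as its argument
print(describe_export(**args))

# Swapping keys and values produces parameter names the function does not know
swapped = {value: key for key, value in args.items()}
try:
    describe_export(**swapped)
except TypeError as error:
    print("TypeError:", error)
```

So the parameter names go on the left of the colon and your own values on the right.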