PSA on importing Stata data into R

Public service announcement for other SuSo users who work between R and Stata, use SuSo’s Stata files, and ingest them with haven.

Up to and including January 13, 2020, I used the following to ingest SuSo’s Stata data into R:

someFolder <- "C:/my/folder/"
someFile <- "some_file.dta"
myData <- haven::read_stata(paste0(someFolder, someFile), encoding = "UTF-8")

On January 14, I tried the same with newly generated data, and got an error message of this form:

Error in df_parse_dta_file(spec, encoding, cols_skip, n_max, skip, name_repair = .name_repair) : 
  Failed to parse C:/my/folder/some_file.dta: Unable to convert string to the requested encoding (invalid byte sequence).

Looking at the haven documentation, I realized that I didn’t need to specify encoding. See discussion of encoding here. When I removed the encoding specification, everything worked fine.

myData <- haven::read_stata(paste0(someFolder, someFile))
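For anyone who has to read a mix of older and newer export files, here is a minimal sketch of a wrapper that tries haven's default encoding handling first and retries with an explicit encoding only if parsing fails. The function name `read_dta_with_fallback` and its `fallback_encoding` argument are hypothetical names of my own, not part of haven or SuSo, and I have not tested this against the files that produced the error above.

```r
library(haven)

# Hypothetical helper (not part of haven or SuSo).
# First attempt: let haven use the encoding recorded in the file.
# Second attempt, only on error: retry with an explicit encoding.
read_dta_with_fallback <- function(path, fallback_encoding = "UTF-8") {
  tryCatch(
    read_stata(path),
    error = function(e) {
      message("Default read failed: ", conditionMessage(e))
      read_stata(path, encoding = fallback_encoding)
    }
  )
}

# Usage with the variables defined earlier in this post:
# myData <- read_dta_with_fallback(paste0(someFolder, someFile))
```

If both attempts fail, the error from the second attempt is raised, so a genuinely broken file is still reported rather than silently skipped.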

I’m not sure what changed. The haven package hasn’t been updated since November 2019. The export files I worked with on Jan 13 were generated by SuSo. The export files I worked with on Jan 14 were from the same server, but I’ll have to follow up later with the SuSo version that generated them.

Please recheck with the newest version. Version 20.01.0 had a Stata export-related bug, which was fixed immediately in v20.01.1.

Dear Sergiy,

Great, I’ve just downloaded the Stata export file. Now all files contain variable names and their labels, so I no longer have to convert the .tab file to a .dta file to get variable labels. Thanks so much!!! ^^