Solved

Cannot read file in Python from Export API due to decoding issue

  • 10 February 2023
  • 1 reply
  • 325 views

 

Greetings, 

The block of code above gets the data from the Export API via Python. The response's status code is 200.

I want to open the file in Python so I can load it with Pandas and manipulate it. However, I cannot read it. I used the code below, but it only created a file on my Desktop:

z = zipfile.ZipFile(io.BytesIO(res.content))

z.extractall()

 

I then simply called pd.read_json() as below:

pd.read_json(z.open(f'{z.namelist()[-1]}'))

 

But it raised the error below:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

 

How can I solve this problem? 

Best answer by Saish Redkar 13 February 2023, 19:17

This looks like a decoding error from trying to read the zipped file into pandas.

Keep in mind that the extract generates a .json.gz file in a directory, which needs to be unzipped further to a <name>.json file. Is the extractall() step giving you the final .json file or the .json.gz file?


I was able to read the final .json file generated by your code into a dataframe using:

data_df = pd.read_json('your_filepath.json', lines=True)
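For anyone who would rather skip the extract-to-disk step, here is a rough sketch of doing both decompression steps in memory. The byte 0x8b at position 1 in the error is the second byte of the gzip magic number (0x1f 0x8b), which fits the point above that the zip member is itself a .json.gz file. The function name read_export_records is made up for illustration, and the sketch assumes each member holds one JSON object per line (as lines=True implies) and that member names end in .json.gz.

```python
import gzip
import io
import json
import zipfile


def read_export_records(zip_bytes):
    """Return one dict per JSON line found in the .json.gz members of a zip archive.

    Hypothetical helper: assumes line-delimited JSON inside gzip inside zip.
    """
    records = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".json.gz"):
                continue  # skip directory entries and any other members
            with archive.open(name) as member:
                # The member is still gzip-compressed, so decompress before decoding as UTF-8.
                with gzip.open(member, mode="rt", encoding="utf-8") as lines:
                    for line in lines:
                        if line.strip():
                            records.append(json.loads(line))
    return records
```

With the response from the question, this would be called as records = read_export_records(res.content), and pd.DataFrame(records) should then give a dataframe. If the .json.gz file is already on disk, passing its path to pd.read_json(path, lines=True) may also work without unzipping first, since pandas infers gzip compression from the .gz extension by default.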
