What causes an unpickling stack underflow when trying to deserialize a successfully generated SageMaker model?

  amazon-sagemaker, pickle, python-3.x

I am currently setting up a pipeline in Amazon SageMaker. For that I set up an XGBoost estimator and trained it on my dataset. The training job runs as expected and the freshly trained model is saved to the specified output bucket. Later I want to reimport the model, which is done by getting the model.tar.gz from the output bucket, extracting the model and loading the binary via pickle.

import pickle as pkl
import tarfile

# download the model artifact from AWS S3 (notebook shell command)
!aws s3 cp s3://my-bucket/output/sagemaker-xgboost-2021-09-06-12-19-41-306/output/model.tar.gz .

# open the downloaded model artifact and extract its contents
model_path = "model.tar.gz"
with tarfile.open(model_path) as tar:
    tar.extractall(path=".")

# load the extracted file as the 'model' variable
model = pkl.load(open("xgboost-model", "rb"))

Whenever I try to run this I receive an unpickling stack underflow:

---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-9-b88a7424f790> in <module>
     10     tar.extractall(path=".")
     11 
---> 12 model = pkl.load(open("xgboost-model", "rb"))
     13 

UnpicklingError: unpickling stack underflow

So far I have retrained the model to see whether the error occurs with a different model file, and it does. I also downloaded the model.tar.gz and validated it via gunzip. The binary file xgboost-model is extracted correctly; I just can't unpickle it. Every occurrence of the error I found on Stack Overflow points at a damaged file, but this one is generated directly by SageMaker and I do not perform any transformation on it other than extracting it from the model.tar.gz. Reloading a model like this seems to be quite a common use case, judging by the documentation and different tutorials.
Locally I receive the same error with the downloaded file. I tried to step directly into pickle for debugging it but couldn’t make much sense of it. The complete error stack looks like this:

Exception has occurred: UnpicklingError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
unpickling stack underflow
  File "/sagemaker_model.py", line 10, in <module>
    model = pkl.load(open('xgboost-model', 'rb'))
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

What could cause this issue, and at which step of the process could I make changes to fix or work around the problem?
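One way to narrow this down without stepping through pickle itself is to look at the first bytes of the extracted file. The sketch below is only a heuristic and uses nothing beyond the standard library; the helper names `looks_like_pickle` and `describe_model_file` are made up for illustration. It relies on the fact that pickles written with protocol 2 or higher start with the byte 0x80 followed by the protocol number, so a file that does not start that way was probably not written by a recent `pickle.dump`.

```python
import pickle


def looks_like_pickle(path):
    """Heuristic: True if the file starts with a pickle protocol 2+ header.

    Protocol 2+ pickles begin with the PROTO opcode (0x80) followed by the
    protocol number. Protocol 0/1 pickles have no header, so a False result
    is a strong hint, not a proof.
    """
    with open(path, "rb") as f:
        head = f.read(2)
    return (
        len(head) == 2
        and head[0] == 0x80
        and 2 <= head[1] <= pickle.HIGHEST_PROTOCOL
    )


def describe_model_file(path):
    """Collect basic facts about the file without trying to unpickle it."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "size_bytes": len(data),
        "first_bytes": data[:16],
        "looks_like_pickle": looks_like_pickle(path),
    }
```

Calling `describe_model_file("xgboost-model")` on the extracted artifact shows whether the file even resembles a pickle. If it does not, the file may have been written in XGBoost's own model format rather than with pickle (an assumption that depends on the container version used for training); in that case loading it with `xgboost.Booster().load_model("xgboost-model")` would be the thing to try instead of `pkl.load`.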

Source: Python-3x Questions
