Compute

The targon.Compute helper centralizes the available compute tiers for your functions. Pass one of its constants to @app.function(resource=...) to select the hardware class used for remote execution.

from targon import Compute

# `app` is assumed to be an existing Targon app instance.
@app.function(resource=Compute.CPU_SMALL)
def hello():
    ...

CPU Tiers

| Constant | Description | Typical Workloads |
| --- | --- | --- |
| Compute.CPU_SMALL | Cost-efficient shared vCPU. | Background jobs, webhooks, request routing. |
| Compute.CPU_MEDIUM | More vCPU and memory. | Data processing, lightweight AI workloads. |
| Compute.CPU_LARGE | High-concurrency CPU tier. | Batch processing, API backends with steady load. |
| Compute.CPU_XL | Maximum CPU tier. | Heavy CPU-bound workloads or large concurrency. |

GPU Tiers (H200)

| Constant | Description | Typical Workloads |
| --- | --- | --- |
| Compute.H200_SMALL | Single H200 slice. | Prompt serving, small LLMs, diffusion warmups. |
| Compute.H200_MEDIUM | More GPU memory and CUDA cores. | Mid-sized model inference, fine-tuning jobs. |
| Compute.H200_LARGE | Large GPU allocation. | High-throughput inference, multi-modal workloads. |
| Compute.H200_XL | Maximum GPU capacity. | Training loops, multi-billion parameter models. |
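
Requesting a GPU tier looks the same as requesting a CPU tier. A minimal sketch (assuming PyTorch is installed in the function's image) that just reports which device the function landed on:

from targon import Compute

# `app` is assumed to be an existing Targon app instance.
@app.function(resource=Compute.H200_SMALL)
def gpu_info() -> str:
    import torch  # assumed to be available in the function's image

    # Returns e.g. "NVIDIA H200" when running on a GPU-backed worker.
    return torch.cuda.get_device_name(0)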

All constants are defined in the SDK:

# targon-sdk/src/targon/core/resources.py
class Compute:
    CPU_SMALL = "cpu-small"
    CPU_MEDIUM = "cpu-medium"
    CPU_LARGE = "cpu-large"
    CPU_XL = "cpu-xl"
    H200_SMALL = "h200-small"
    H200_MEDIUM = "h200-medium"
    H200_LARGE = "h200-large"
    H200_XL = "h200-xl"
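
Because each constant is a plain string, passing the raw tier name is presumably equivalent (an inference from the definitions above, not documented behavior):

# Assumption: the raw tier string is accepted in place of the constant.
@app.function(resource="h200-large")
def embed(batch: list[str]):
    ...

Prefer the Compute constants regardless: a misspelled attribute raises AttributeError immediately, while a typo in a raw string would only surface later.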

Selecting Resources

  • Choose a CPU tier when your function is CPU-bound or primarily executes synchronous Python code.
  • Choose a GPU tier when your function runs accelerated workloads (PyTorch, TensorRT, vLLM, etc.).
  • Combine resource selection with auto-scaling settings (min_replicas / max_replicas) to balance performance and cost, as shown in the sketch below.
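
A minimal sketch combining the two, assuming min_replicas and max_replicas are accepted as keyword arguments to @app.function alongside resource (check the Compute Resources guide for the exact parameter names):

from targon import Compute

# Assumption: min_replicas / max_replicas are @app.function kwargs.
@app.function(
    resource=Compute.CPU_LARGE,
    min_replicas=1,   # keep one warm replica to avoid cold starts
    max_replicas=8,   # cap scale-out to bound cost
)
def handle_request(payload: dict) -> dict:
    ...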

See the Compute Resources guide for in-depth recommendations.