Compute

The targon.Compute helper centralizes the available compute tiers for your functions. Pass one of its constants to @app.function(resource=...) to select the hardware class used for remote execution.

from targon import Compute

# `app` is assumed to be an existing Targon app instance.
@app.function(resource=Compute.CPU_SMALL)
def hello():
    ...

CPU Tiers

| Constant | Description | Typical Workloads |
| --- | --- | --- |
| Compute.CPU_SMALL | Cost-efficient shared vCPU. | Background jobs, webhooks, request routing. |
| Compute.CPU_MEDIUM | More vCPU and memory. | Data processing, lightweight AI workloads. |
| Compute.CPU_LARGE | High-concurrency CPU tier. | Batch processing, API backends with steady load. |
| Compute.CPU_XL | Maximum CPU tier. | Heavy CPU-bound workloads or large concurrency. |

GPU Tiers (H200)

| Constant | Description | Typical Workloads |
| --- | --- | --- |
| Compute.H200_SMALL | Single H200 slice. | Prompt serving, small LLMs, diffusion warmups. |
| Compute.H200_MEDIUM | More GPU memory and CUDA cores. | Mid-sized model inference, fine-tuning jobs. |
| Compute.H200_LARGE | Large GPU allocation. | High-throughput inference, multi-modal workloads. |
| Compute.H200_XL | Maximum GPU capacity. | Training loops, multi-billion parameter models. |
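
Requesting a GPU tier looks the same as requesting a CPU tier. A minimal sketch (assuming PyTorch is installed in the function's image) that just reports which device the function landed on:

from targon import Compute

# `app` is assumed to be an existing Targon app instance.
@app.function(resource=Compute.H200_SMALL)
def gpu_info() -> str:
    import torch  # assumed to be available in the function's image

    # Returns e.g. "NVIDIA H200" when running on a GPU-backed worker.
    return torch.cuda.get_device_name(0)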

All constants are defined in the SDK:

# targon-sdk/src/targon/core/resources.py
class Compute:
    CPU_SMALL = "cpu-small"
    CPU_MEDIUM = "cpu-medium"
    CPU_LARGE = "cpu-large"
    CPU_XL = "cpu-xl"
    H200_SMALL = "h200-small"
    H200_MEDIUM = "h200-medium"
    H200_LARGE = "h200-large"
    H200_XL = "h200-xl"
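
Because each constant is a plain string, passing the raw tier name is presumably equivalent (an inference from the definitions above, not documented behavior):

# Assumption: the raw tier string is accepted in place of the constant.
@app.function(resource="h200-large")
def embed(batch: list[str]):
    ...

Prefer the Compute constants regardless: a misspelled attribute raises AttributeError immediately, while a typo in a raw string would only surface later.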

Selecting Resources

  • Choose a CPU tier when your function is CPU-bound or primarily executes synchronous Python code.
  • Choose a GPU tier when your function runs accelerated workloads (PyTorch, TensorRT, vLLM, etc.).
  • Combine resource selection with auto-scaling settings (min_replicas / max_replicas) to balance performance and cost, as shown in the sketch below.
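
A minimal sketch combining the two, assuming min_replicas and max_replicas are accepted as keyword arguments to @app.function alongside resource (check the Compute Resources guide for the exact parameter names):

from targon import Compute

# Assumption: min_replicas / max_replicas are @app.function kwargs.
@app.function(
    resource=Compute.CPU_LARGE,
    min_replicas=1,   # keep one warm replica to avoid cold starts
    max_replicas=8,   # cap scale-out to bound cost
)
def handle_request(payload: dict) -> dict:
    ...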

See the Compute Resources guide for in-depth recommendations.