Serverless
This section explains how to run autoscaling apps on Targon: deploy containers that scale from zero with traffic, without provisioning or babysitting dedicated servers.
What is Serverless?
Serverless on Targon is a managed runtime for HTTP workloads and batch-style apps. You supply a container image (or start from a template), define how it listens for traffic, and the platform handles replicas, health checks, and scale-down when idle—similar in spirit to rentals for stateless or horizontally scaled services rather than long-lived SSH machines.
Use it when you want public endpoints, variable traffic, or repeatable deploys from the Dashboard or the Python SDK—without managing the underlying fleet yourself.
Before you begin
To get the most out of these guides, it helps to know:
- Containers: What a Docker image is, which port your app listens on, and optionally how commands map to your Dockerfile's ENTRYPOINT/CMD.
- Dashboard: How to sign in and navigate Serverless in the Targon dashboard for template-based deploys.
- SDK / CLI (optional): If you follow the infrastructure-as-code path, skim Getting Started for install and credentials.
Creating Serverless Apps
Deploy autoscaling applications without managing infrastructure. Targon Serverless scales your app from zero to peak traffic automatically, so you only pay for the compute you use.
Quick Start (Dashboard)
The easiest way to launch an app is through the Targon Dashboard.
- Create App: Navigate to Serverless > New App.
- Select Hardware: Pick the CPU or GPU tier that fits your workload.
- Choose a Starting Point:
- Public Templates: Ready-to-use apps pre-configured by Targon.
- Custom: Deploy your own image.
Custom Configuration
If you choose Custom, you will need to provide the Image, Port, and optional Command.
View configuration reference for details on arguments, environment variables, and auto-scaling.
- Image: The Docker image URL (e.g., `python:3.11` or `ghcr.io/my-org/my-app`).
- Port: The internal port your app listens on (Targon routes public traffic here).
- Command (Optional): The startup command (e.g., `python main.py`).
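To make the three settings concrete, here is a minimal sketch of what the app itself might look like. The values are hypothetical and the server uses only the Python standard library; the key point is that the app must bind `0.0.0.0` on the same port you enter in the Port field:

```python
# main.py — a minimal app matching hypothetical Custom settings:
#   Image: python:3.11   Command: python main.py   Port: 8000
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = 8000  # must match the Port field so Targon can route traffic here

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep container logs quiet for this sketch
        pass

def make_server(port=PORT):
    # Bind 0.0.0.0 so the container accepts traffic from outside localhost
    return HTTPServer(("0.0.0.0", port), Handler)

if __name__ == "__main__":
    make_server().serve_forever()
```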
Infrastructure as Code (SDK)
For production workflows, define your infrastructure in Python alongside your application code. This ensures your deployment is reproducible and version-controlled.
```python
# targon-sdk/examples/getting_started/getting_started.py
import targon

# Define the environment
app = targon.App("getting-started", image=targon.Image.debian_slim("3.12"))

# Define the endpoint
@app.function()
def hello(message: str) -> dict[str, str]:
    return {"message": message, "status": "success"}
```
Deployment Workflow
- Test Locally: `targon run main.py`
- Deploy to Cloud: `targon deploy main.py`
- Manage: `targon app list`
Next Steps
- Compute Resources: Compare CPU vs. GPU options.
- Web Endpoints: Learn how to host FastAPI, Flask, or other web servers.
- LLM Deployment: Run large language models serverlessly.
Configuration Options
This reference covers the configuration options available when deploying Custom Serverless Apps via the Dashboard or defining them via the SDK.
Container Settings
When configuring a custom deployment, you define the runtime environment of your serverless app.
Image
The image forms the foundation of your application. Targon supports pulling images from any major public registry, including Docker Hub, GitHub Container Registry (GHCR), and Quay.io. You can also use private registries if you configure authentication credentials.
Examples:
- `python:3.11` (Docker Hub)
- `ghcr.io/my-org/my-app:latest` (GHCR)
- `quay.io/coreos/etcd:latest` (Quay)
```python
import targon

image = targon.Image.from_registry("python:3.11")
```
Commands & Arguments
The Command and Arguments settings allow you to override the default startup behavior of your container. The system constructs the final execution string by concatenating the Command with the Arguments: [Command] [Arguments].
If you leave the Command field empty, Targon will default to using the ENTRYPOINT or CMD defined in your Dockerfile.
- Example Web Server:
  - Command: `uvicorn`
  - Arguments: `main:app --host 0.0.0.0 --port 8000`

```python
import targon

# Override the container's default command
image = targon.Image.debian_slim().entrypoint(["python", "main.py"])

# Alternatively, run arbitrary commands during build
image = image.run_commands("echo 'Building...'")
```
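The concatenation rule above ([Command] [Arguments], falling back to the image's ENTRYPOINT/CMD when Command is empty) can be sketched in plain Python. `build_exec` and its parameters are illustrative names, not part of the Targon SDK:

```python
def build_exec(command, arguments, image_default):
    """Illustrative sketch of how the final startup command is assembled."""
    if not command:
        # Empty Command field: fall back to the Dockerfile's ENTRYPOINT/CMD
        return image_default
    return [command, *arguments]

# The web-server example above:
print(build_exec("uvicorn", ["main:app", "--host", "0.0.0.0", "--port", "8000"],
                 ["python", "main.py"]))
# → ['uvicorn', 'main:app', '--host', '0.0.0.0', '--port', '8000']
```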
Exposed Port
The Exposed Port is the internal network port your application listens on inside the container. Targon routes public internet traffic to this specific port on your container.
It is critical that this value matches the port your application opens at startup.
```python
@app.function()
@targon.web_server(port=8000)  # Maps public traffic to port 8000
def server():
    import uvicorn
    uvicorn.run("main:app", host="0.0.0.0", port=8000)
```
Advanced Options
Environment Variables
Environment variables allow you to inject configuration and secrets into your application at runtime without changing your code or image. This is the standard way to handle sensitive data like API keys, database connection strings, or feature flags. All sensitive values are stored securely with encryption.
- Examples: `DATABASE_URL`, `TARGON_API_KEY`

```python
import targon

image = targon.Image.debian_slim().env({
    "DATABASE_URL": "postgres://user:pass@db.example.com:5432/mydb",
    "DEBUG": "true"
})
```
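Inside the container, these variables arrive through the process environment, so your app reads them with the standard library. A minimal sketch (the variable names match the example above; `load_config` is an illustrative helper, not an SDK function):

```python
import os

def load_config():
    # Read injected configuration at runtime instead of hard-coding it
    return {
        "database_url": os.environ["DATABASE_URL"],           # required: KeyError if unset
        "debug": os.environ.get("DEBUG", "false") == "true",  # optional, with a default
    }
```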
Auto-Scaling Configuration
Targon Serverless dynamically adjusts the number of active instances (replicas) of your application based on incoming traffic. You can control the boundaries and sensitivity of this scaling behavior.
Minimum Replicas determines the baseline availability of your app.
- Set to `0` to enable Scale to Zero. This is cost-effective for sporadic workloads.
- Set to `1` or higher to ensure your application is always ready to respond immediately.
Maximum Replicas sets a hard cap on the number of simultaneous instances. Use this to control costs and prevent runaway scaling during unexpected traffic spikes.
Concurrency Target defines the "busyness" threshold for a single replica. It represents the approximate number of concurrent requests one instance can handle. When traffic exceeds this threshold, Targon launches additional replicas to distribute the load.
```python
@app.function(
    min_replicas=0,      # Scale to Zero enabled
    max_replicas=10,     # Cap at 10 instances
    max_concurrency=100  # Scale up when >100 concurrent requests per replica
)
def my_handler():
    pass
```
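The interaction of the three settings can be sketched as a back-of-the-envelope calculation. This is illustrative only; the function name, the use of simple ceiling division, and the exact rounding are assumptions, not the autoscaler's documented algorithm:

```python
import math

def desired_replicas(in_flight, concurrency_target, min_replicas, max_replicas):
    """Rough sketch: replicas needed to keep each instance at or below its
    concurrency target, clamped to the configured min/max bounds."""
    needed = math.ceil(in_flight / concurrency_target)
    return max(min_replicas, min(max_replicas, needed))

print(desired_replicas(250, 100, 0, 10))   # → 3
print(desired_replicas(0, 100, 0, 10))     # → 0  (Scale to Zero)
print(desired_replicas(5000, 100, 0, 10))  # → 10 (capped at max_replicas)
```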
Related topics
- Web Endpoints — Serving FastAPI, Flask, or similar behind HTTP.
- Compute Resources — Choosing CPU/GPU shapes when you configure hardware tiers.
- LLM Deployment — Patterns for large models that fit a serverless workflow.