CLI Reference

The Titan CLI is an interactive Java shell that connects directly to the Master node over TITAN_PROTO. It lets you query job state, stream logs, manage the cluster, and approve or reject HITL gates, all without opening the dashboard.


Starting the CLI

java -cp target/titan-orchestrator-1.0-SNAPSHOT.jar titan.TitanCLI

The CLI connects to localhost:9090 by default. Type help at the prompt to see all commands.

==========================================
    [INFO] TITAN DISTRIBUTED ORCHESTRATOR
==========================================
Connected to: localhost:9090

titan>

Command Reference

Cluster

Command  Description
stats    Cluster health: active workers, queue depths, worker load and capability tags
json     Same as stats but returns raw JSON (useful for scripting)

Job Lifecycle

Command                       Description
status <job_id>               Get current state of a job: PENDING, RUNNING, COMPLETED, FAILED, CANCELLED
logs <job_id>                 Fetch stdout/stderr for any job
cancel <job_id>               Cancel a running or queued job. Cascades cancellation to waiting children.
run <filename> [requirement]  Execute a script from perm_files on the cluster. Optionally specify GPU or HIGH_MEM.

The DAG- prefix is optional: status verify-job and status DAG-verify-job are equivalent.
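If you script around job IDs, the prefix equivalence above can be mirrored client-side. The following is a minimal sketch, not Titan code: the helper names are hypothetical, and it assumes the prefix is matched case-sensitively, which this page does not state.

```python
DAG_PREFIX = "DAG-"


def strip_dag_prefix(job_id: str) -> str:
    """Return the job ID without a leading DAG- prefix, if one is present."""
    if job_id.startswith(DAG_PREFIX):
        return job_id[len(DAG_PREFIX):]
    return job_id


def same_job(a: str, b: str) -> bool:
    """Treat IDs as equal when they differ only by the optional DAG- prefix."""
    return strip_dag_prefix(a) == strip_dag_prefix(b)


print(same_job("verify-job", "DAG-verify-job"))
```

Comparing the stripped form avoids guessing which spelling the Master stores internally.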

Pipelines

Command           Description
dag <dag_string>  Submit a raw DAG string directly (advanced use)

Services & Workers

Command                                 Description
deploy <filename> [port] [requirement]  Deploy a long-running service from perm_files. Examples below.
stop <service_id>                       Stop a running service
shutdown <host> <port>                  Remotely decommission a worker node

# Deploy a Python service on port 9991
titan> deploy log_viewer.py 9991

# Deploy a Worker JAR with GPU capability on port 8082
titan> deploy Worker.jar 8082 GPU
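When generating deploy commands from a script, the positional syntax shown above can be assembled with a small helper. This is a sketch only: deploy_command is a hypothetical name, and it assumes a requirement cannot be given without a port because both arguments are positional. Verify that against your CLI before relying on it.

```python
from typing import Optional


def deploy_command(filename: str,
                   port: Optional[int] = None,
                   requirement: Optional[str] = None) -> str:
    """Build a 'deploy <filename> [port] [requirement]' command line."""
    parts = ["deploy", filename]
    if requirement is not None and port is None:
        # Assumption: arguments are positional, so a requirement needs a port.
        raise ValueError("a requirement can only follow an explicit port")
    if port is not None:
        parts.append(str(port))
    if requirement is not None:
        parts.append(requirement)
    return " ".join(parts)


print(deploy_command("log_viewer.py", 9991))
print(deploy_command("Worker.jar", 8082, "GPU"))
```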

Files

Command              Description
upload <local_path>  Upload a local file to perm_files on the Master

HITL Gates

Command            Description
approve <gate_id>  Approve a paused HITL gate. Downstream jobs resume.
reject <gate_id>   Reject a HITL gate. Downstream jobs are cancelled.

Gate IDs follow the pattern hitl-gate-{job_id}, auto-generated by the SDK when you set hitl_message on a TitanJob.

titan> status hitl-gate-preprocess
RUNNING

titan> approve hitl-gate-preprocess
OK

titan> status train
COMPLETED
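The hitl-gate-{job_id} pattern described above makes gate IDs easy to derive in scripts. A minimal sketch follows; the function names are hypothetical and are not part of the SDK.

```python
GATE_PREFIX = "hitl-gate-"


def gate_id_for(job_id: str) -> str:
    """Derive the gate ID for a job, following the hitl-gate-{job_id} pattern."""
    return f"{GATE_PREFIX}{job_id}"


def job_id_from_gate(gate_id: str) -> str:
    """Recover the job ID from a gate ID; reject anything that is not a gate."""
    if not gate_id.startswith(GATE_PREFIX):
        raise ValueError(f"not a HITL gate ID: {gate_id!r}")
    return gate_id[len(GATE_PREFIX):]


print(gate_id_for("preprocess"))
```

With this, a script that knows only the job name can build the argument for approve or reject.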

TitanStore

Command                  Description
store get <key>          Read a value from TitanStore
store set <key> <value>  Write a value to TitanStore

Useful for inspecting shared agent state, debugging cross-job KV handoffs, or manually seeding values.

titan> store set debug:run:001 started
OK

titan> store get debug:run:001
started

Reading stats Output

--- TITAN SYSTEM MONITOR ---
Active Workers:    2
Execution Queue:   0 jobs
Delayed (Time):    0 jobs
Blocked (DAG):     1 jobs
Dead Letter (DLQ): 0 jobs
-------------------------------
Worker Status:
 • [8080] Load: 2/4 (50%)    | Skills: [GENERAL]
    └── TSK-log_viewer.py
 • [8081] Load: 0/4 (0%)     | Skills: [GPU]

Field              What it means
Execution Queue    Jobs fully unblocked, waiting for a worker slot. High and rising means the cluster needs more workers.
Blocked (DAG)      Jobs waiting on parent dependencies. Normal during pipeline execution.
Delayed (Time)     Jobs scheduled for future execution via delay_seconds.
Dead Letter (DLQ)  Jobs that exhausted retries. Inspect with logs <job_id>.
Load (2/4 50%)     Worker is running 2 concurrent jobs against a thread capacity of 4.
Skills             Capability tags on that worker. Only jobs with a matching requirement will be routed here.
TSK- prefix        Long-running deployed service (API, agent, persistent worker).
WRK- prefix        Dynamically spawned child worker process (reactive auto-scaling).

Ephemeral scripts don't appear by name. Their execution is reflected in the load percentage.
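If you capture the text output of stats, for example from a terminal log, the fields described above can be pulled out with a few regular expressions. This is a sketch that works against the sample output on this page only. The exact layout may differ between versions, and the comma-separated form assumed for multiple skills is a guess. For real scripting, the json command is likely the more robust source.

```python
import re

SAMPLE = """\
--- TITAN SYSTEM MONITOR ---
Active Workers:    2
Execution Queue:   0 jobs
Delayed (Time):    0 jobs
Blocked (DAG):     1 jobs
Dead Letter (DLQ): 0 jobs
-------------------------------
Worker Status:
 • [8080] Load: 2/4 (50%)    | Skills: [GENERAL]
    └── TSK-log_viewer.py
 • [8081] Load: 0/4 (0%)     | Skills: [GPU]
"""

# "Name:   <number>[ jobs]" summary lines.
COUNTER_RE = re.compile(r"^\s*([A-Za-z][^:]*):\s+(\d+)(?:\s+jobs)?\s*$")
# "[port] Load: used/capacity (pct%) | Skills: [TAGS]" worker lines.
WORKER_RE = re.compile(
    r"\[(\d+)\]\s+Load:\s+(\d+)/(\d+)\s+\(\d+%\)\s*\|\s*Skills:\s*\[([^\]]*)\]"
)
# Indented "└── name" lines listing services under the preceding worker.
SERVICE_RE = re.compile(r"└──\s+(\S+)")


def parse_stats(text):
    """Return (counters, workers) parsed from the stats text output."""
    counters, workers = {}, []
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if m:
            counters[m.group(1)] = int(m.group(2))
            continue
        m = WORKER_RE.search(line)
        if m:
            port, used, capacity, skills = m.groups()
            workers.append({
                "port": int(port),
                "used": int(used),
                "capacity": int(capacity),
                # Assumption: multiple skills would be comma-separated.
                "skills": [s.strip() for s in skills.split(",") if s.strip()],
                "services": [],
            })
            continue
        m = SERVICE_RE.search(line)
        if m and workers:
            workers[-1]["services"].append(m.group(1))
    return counters, workers


counters, workers = parse_stats(SAMPLE)
for w in workers:
    free = w["capacity"] - w["used"]
    print(f"worker {w['port']}: {free} free slots, services={w['services']}")
```

Sampling counters["Execution Queue"] over time and checking whether it keeps rising is one way to act on the guidance in the table above.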


Notes

  • The CLI opens a fresh TCP connection to the Master for each command.
  • YAML pipeline submission is not supported in the Java CLI. Use the Python SDK: python titan_sdk/titan_cli.py deploy pipeline.yaml
  • The submit command (raw skill dispatch) is also available for low-level job submission: submit <skill> <data>
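The connection-per-command behaviour in the first note can be illustrated with plain sockets. This sketch does not implement TITAN_PROTO, whose framing is not described on this page. It uses a newline-terminated line as a stand-in and a local echo server in place of the Master. All names here (EchoHandler, send_command, run_demo) are hypothetical.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Stand-in for the Master: answers one line with 'OK <line>', then closes."""

    def handle(self):
        line = self.rfile.readline().decode("utf-8").strip()
        self.wfile.write(f"OK {line}\n".encode("utf-8"))


def send_command(host, port, command, timeout=5.0):
    """Open a fresh TCP connection, send one command, read one reply, close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((command + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


def run_demo():
    # Port 0 lets the OS pick a free port for the stand-in server.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler) as server:
        host, port = server.server_address
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            # Each call below uses its own short-lived connection.
            return [send_command(host, port, c) for c in ("stats", "status train")]
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(run_demo())
```

One consequence of this design is that no session state lives in the connection, so a dropped connection affects only the command in flight.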