
Job configuration reference

Fields you set when you create a GPU Container Job or a Managed Inference Job.

You set these fields when you create a job, either in the web interface or with cosmicac jobs create. The job type determines which fields apply. In non-interactive mode, pass each field with the flag listed in its CLI flag column. For the full create flow, see Create a GPU Container Job and Create a Managed Inference Job.

Common fields

These fields apply to every job type.

| Field | Required | CLI flag | Description |
| --- | --- | --- | --- |
| Job type | Yes | --type | The kind of job to create: GPU Container or Managed Inference. |
| Job name | Yes | --name | A name to identify the job. |
| Tags | Yes | --tags | One or more labels for the job. The CLI accepts a comma-separated list. |
| Location | Yes | --location | Where the job runs, for example IN. The CLI lists the locations your racks report. |
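As a rough illustration of how the common fields map onto flags in non-interactive mode, the sketch below assembles an argument list in Python. The helper name common_flags is hypothetical, and the job-type value is passed through unchanged because the exact spelling the CLI expects for --type is not stated on this page. The only formatting rule taken from the table is that tags are joined with commas.

```python
def common_flags(job_type, name, tags, location):
    """Build the flags shared by every job type (hypothetical helper)."""
    if not tags:
        # The table marks Tags as required, so reject an empty list.
        raise ValueError("at least one tag is required")
    return [
        "--type", job_type,          # value format is an assumption
        "--name", name,
        "--tags", ",".join(tags),    # comma-separated list, per the table
        "--location", location,
    ]


# "<job-type>" is a placeholder, not a documented value.
args = common_flags("<job-type>", "demo-job", ["team-a", "test"], "IN")
print(" ".join(args))
```

The resulting list could be appended to a cosmicac jobs create invocation; whether the CLI needs further options for non-interactive mode is not covered here.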

GPU configuration

These fields select the job's hardware.

| Field | Required | CLI flag | Description |
| --- | --- | --- | --- |
| GPU | Yes | --gpu-type | The GPU to use, for example GH100_H100_SXM5_80GB. The CLI lists the GPU types your racks report. |
| GPU count | Yes | --gpu-count | Number of GPUs. |
| CUDA / driver | Yes | --driver | GPU driver version. Only CUDA 12.9 is supported. |

Set the GPU type and count in one flag with --gpu TYPE=COUNT, for example --gpu H100=2. This replaces --gpu-type and --gpu-count.
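The two ways of specifying GPUs can be sketched as a small Python helper that emits either the combined --gpu TYPE=COUNT form or the separate --gpu-type and --gpu-count flags. The function name gpu_flags and the combined parameter are hypothetical; the helper only formats strings and makes no claim about how the CLI parses them.

```python
def gpu_flags(gpu_type, count, combined=True):
    """Return GPU flags in the combined or the separate form (hypothetical helper)."""
    if count < 1:
        raise ValueError("GPU count must be at least 1")
    if combined:
        # Single flag in TYPE=COUNT form, for example --gpu H100=2.
        return ["--gpu", f"{gpu_type}={count}"]
    # Separate flags, which the combined form replaces.
    return ["--gpu-type", gpu_type, "--gpu-count", str(count)]


print(gpu_flags("H100", 2))
print(gpu_flags("GH100_H100_SXM5_80GB", 1, combined=False))
```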

GPU Container parameters

These fields apply to a GPU Container Job.

| Field | Required | CLI flag | Description |
| --- | --- | --- | --- |
| Base OS image | Yes | --base-image | Base OS image for the container. Only Ubuntu22.04/CUDA12.9 is supported. |
| Disk (GB) | Yes | --root-disk-size-gb | Root disk size in GB. One of 250, 500, or 1000. |
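Because the disk size is restricted to three values, a client-side check before calling the CLI can catch mistakes early. The following is a minimal sketch under that assumption; container_flags and ALLOWED_DISK_SIZES_GB are hypothetical names, and the default base image is simply the one value the table lists as supported.

```python
ALLOWED_DISK_SIZES_GB = (250, 500, 1000)  # values listed in the table above


def container_flags(disk_gb, base_image="Ubuntu22.04/CUDA12.9"):
    """Validate the disk size and build GPU Container flags (hypothetical helper)."""
    if disk_gb not in ALLOWED_DISK_SIZES_GB:
        raise ValueError(
            f"disk size must be one of {ALLOWED_DISK_SIZES_GB}, got {disk_gb}"
        )
    return ["--base-image", base_image, "--root-disk-size-gb", str(disk_gb)]


print(container_flags(500))
```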

Managed Inference (vLLM) parameters

These fields apply to a vLLM Managed Inference Job.

| Field | Required | CLI flag | Description |
| --- | --- | --- | --- |
| Model | Yes | --model | Hugging Face model ID to serve (Qwen/Qwen3-32B). |
| Runtime image (CUDA) | Yes | --runtime-image | Serving runtime image (vllm-openai-0.8.5). |
| Data type | Yes | --data-type | Numeric precision the model runs at (BF16 or Auto). |
| Quantisation | Yes | --quantisation | Quantization scheme (FP8 or INT8). |
| Tensor parallel | Yes | --tensor-parallel | Number of GPUs to split the model across. |
| GPU memory utilization | Yes | --gpu-memory-utilization | Fraction of GPU memory to use, between 0 and 1. |
| Max concurrent sequences | Yes | --max-concurrent-sequences | Maximum requests handled at once. |
| Max model length | Yes | --max-model-length | Maximum model context length. |
| Reasoning parser | Yes | --reasoning-parser | Parser for the model's reasoning output. |
| Video & image input | Yes | --multimodal | Whether the model accepts multimodal input. true or false. |
| Root disk size | Yes | --root-disk-size-gb | VM root disk size in GB. One of 250, 500, or 1000. |
| Environment variables | No | --env | Environment variables passed to the inference service. |
| Endpoint name | Yes | --endpoint-name | Name of the inference endpoint. Must be unique across active inference jobs. |
| Replicas | Yes | --replica | Number of endpoint replicas. |
| Require Authorization header | Yes | --require-auth-header / --no-auth-header | Whether callers must send an authorization header. true or false. |
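Two of the vLLM fields have constraints that are easy to check before submitting a job: the memory fraction must lie between 0 and 1, and the authorization setting is expressed as a pair of flags. The sketch below shows one way to encode that in Python. The name vllm_flags is hypothetical, treating the range as 0 < value <= 1 is an assumption (the table does not say whether the bounds are inclusive), and so is passing the auth flags without a value.

```python
def vllm_flags(gpu_memory_utilization, tensor_parallel, require_auth):
    """Check two vLLM fields and build their flags (hypothetical helper)."""
    # Assumed bounds: greater than 0, at most 1.
    if not 0 < gpu_memory_utilization <= 1:
        raise ValueError("GPU memory utilization must be between 0 and 1")
    if tensor_parallel < 1:
        raise ValueError("tensor parallel must be at least 1")
    # Pick one flag from the documented pair (assumed to take no value).
    auth_flag = "--require-auth-header" if require_auth else "--no-auth-header"
    return [
        "--gpu-memory-utilization", str(gpu_memory_utilization),
        "--tensor-parallel", str(tensor_parallel),
        auth_flag,
    ]


print(vllm_flags(0.9, 2, require_auth=True))
```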

Managed Inference (Parakeet) parameters

These fields apply to a Parakeet Managed Inference Job.

| Field | Required | CLI flag | Description |
| --- | --- | --- | --- |
| Model | Yes | --model | Parakeet model to serve, nvidia/parakeet-tdt-0.6b-v3. |
| Endpoint name | Yes | --endpoint-name | Name of the transcription endpoint. |
| Chunk duration | Yes | --chunk-duration | Audio chunk length in seconds (60). |
| Chunk overlap | Yes | --chunk-overlap | Overlap between chunks in seconds (10). |
| Max file size (MB) | Yes | --max-file-size-mb | Maximum upload size in MB (2048). |
| Require Authorization header | Yes | --require-auth-header / --no-auth-header | Whether callers must send an authorization header. true or false. |
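To see how chunk duration and chunk overlap interact, the sketch below computes the time windows an audio file would be split into under one common interpretation: each chunk starts (duration minus overlap) seconds after the previous one. How the Parakeet service actually splits audio is not described on this page, so treat the function chunk_windows and its logic as an assumption for illustration only.

```python
def chunk_windows(total_seconds, chunk=60.0, overlap=10.0):
    """Return (start, end) windows covering the audio (illustrative only)."""
    if chunk <= 0:
        raise ValueError("chunk duration must be positive")
    if not 0 <= overlap < chunk:
        raise ValueError("overlap must be at least 0 and less than the chunk duration")
    step = chunk - overlap  # distance between consecutive chunk starts
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the last window reached the end of the audio
        start += step
    return windows


# A 130-second file with 60-second chunks and 10 seconds of overlap.
print(chunk_windows(130))
# -> [(0.0, 60.0), (50.0, 110.0), (100.0, 130.0)]
```

Under this interpretation, a larger overlap shortens the step between chunks and so increases the number of chunks processed for the same file.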
