# AWS

This document walks you through deploying the **Dynamiq GenAI Operating Platform** into **your own AWS VPC** from an AWS Marketplace subscription.\
It is aimed at DevOps engineers, SREs, software engineers, and data‑science practitioners who are comfortable with the AWS CLI and Kubernetes tooling.

***

### Table of Contents

1. [Prerequisites](#prerequisites)
2. [Subscribe on AWS Marketplace](#subscribe-on-aws-marketplace)
3. [Set your environment variables](#set-your-environment-variables)
4. [Create the prerequisite IAM roles](#create-the-prerequisite-iam-roles)
5. [Provision the EKS cluster](#provision-the-eks-cluster)
6. [Create the RDS database](#create-the-rds-database)
7. [Install Karpenter](#install-karpenter)
8. [Create the node pools](#create-the-node-pools)
9. [Install External Secrets & supporting add‑ons](#install-external-secrets-and-supporting-add-ons)
10. [Store Dynamiq secrets](#store-dynamiq-secrets)
11. [Create the Dynamiq service account](#create-the-dynamiq-service-account)
12. [Prepare the S3 bucket and Helm values](#prepare-the-s3-bucket-and-helm-values)
13. [Authenticate to ECR and deploy Dynamiq](#authenticate-to-ecr-and-deploy-dynamiq)
14. [Validate the deployment](#validate-the-deployment)
15. [Cleanup (optional)](#cleanup-optional)

***

### Prerequisites

* **AWS account** with Administrator‑level access (or the specific permissions listed below).
* **AWS CLI ≥ 2.15**, **kubectl ≥ 1.31**, **eksctl ≥ 0.175**, **Helm ≥ 3.14**, **jq**, and **envsubst** installed locally.
* Public or private **domain name** (e.g. `example.com`) that you control and are able to delegate to Route 53.
* At least **one VPC quota slot** for a new EKS cluster (eksctl will create the VPC by default).
* **Service quotas** for the EC2 instance families you plan to use (`m5` for platform nodes, `g5` for GPU nodes).

The acting IAM principal must be allowed to manage EKS, CloudFormation, IAM, RDS, Secrets Manager, S3, STS, and associated resources. For production we recommend deploying from a short‑lived CI user or assume‑role with the following AWS managed policies attached:

* `AmazonEKSClusterPolicy`
* `AmazonEKSServicePolicy`
* `AmazonEKSWorkerNodePolicy`
* `AmazonEC2ContainerRegistryPowerUser`
* `AmazonRDSFullAccess`
* `AWSCloudFormationFullAccess`
* `IAMFullAccess`
* `SecretsManagerReadWrite`
* `AmazonS3FullAccess`
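
As a quick preflight, the sketch below (a convenience check, not part of the deployment itself) confirms that each required CLI is on your `PATH`:

```shell
# Check that every tool used in this guide is installed.
for tool in aws kubectl eksctl helm jq envsubst; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool"
  else
    echo "MISSING: $tool -- install it before continuing"
  fi
done
```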

***

### Subscribe on AWS Marketplace

1. Open the [Dynamiq GenAI Operating Platform listing](https://aws.amazon.com/marketplace) in your browser.
2. Click **Continue to Subscribe** → **Accept terms**.
3. Wait until the subscription status shows **Subscribed**.

*No additional Marketplace configuration is required; the Helm chart (deployed later) records usage automatically.*

***

### Set your environment variables

Edit only the three highlighted variables, then copy‑paste the whole block:

```bash
# ---------- BEGIN USER CONFIG -------------------
export AWS_DEFAULT_REGION="us-east-2"        # <— Change to your preferred AWS region
export CLUSTER_NAME="dynamiq-demo"           # <— Unique, lowercase, DNS‑compatible cluster name
export BASE_DOMAIN="example.com"             # <— Root or sub‑domain you control
# ---------- END USER CONFIG ---------------------

export K8S_VERSION="1.31"
export AWS_PARTITION="aws"

export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"

export AMD_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)"
export GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"
```
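
Before moving on, it is worth confirming that every lookup above actually returned a value; a bash-specific sketch using indirect expansion:

```shell
# Warn early if any derived variable came back empty.
for var in AWS_ACCOUNT_ID AMD_AMI_ID GPU_AMI_ID; do
  if [ -z "${!var:-}" ]; then
    echo "WARNING: $var is empty -- re-run the lookup above" >&2
  else
    echo "$var=${!var}"
  fi
done
```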

> **Tip**  Add `set -euo pipefail` to abort on errors; all commands below are idempotent unless otherwise noted.

***

### Create the prerequisite IAM roles

The CloudFormation template bundled with Dynamiq creates the minimal IAM roles and policies required by Karpenter and External Secrets.

```bash
aws cloudformation deploy \
  --stack-name "${CLUSTER_NAME}" \
  --template-file ./dynamiq-stack.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"
```

> **Note**  `aws cloudformation deploy` blocks until the stack reaches **CREATE\_COMPLETE** (≈ 1–2 minutes).
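
If you want to double-check the result from the CLI, a status query such as this should print `CREATE_COMPLETE` (or `UPDATE_COMPLETE` on re-runs):

```shell
# Inspect the stack status directly.
aws cloudformation describe-stacks \
  --stack-name "${CLUSTER_NAME}" \
  --query 'Stacks[0].StackStatus' \
  --output text
```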

***

### Provision the EKS cluster

Paste the snippet below **as‑is**; `envsubst` injects your variables inline:

```bash
envsubst < <(cat <<'EOF'
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}

iam:
  withOIDC: true
  podIdentityAssociations:
  - serviceAccountName: karpenter
    namespace: kube-system
    roleName: ${CLUSTER_NAME}-karpenter
    permissionPolicyARNs:
      - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}
  - serviceAccountName: external-secrets
    namespace: external-secrets
    roleName: ${CLUSTER_NAME}-external-secrets
    permissionPolicyARNs:
      - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/ExternalSecretsPolicy-${CLUSTER_NAME}

iamIdentityMappings:
- arn: arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes

managedNodeGroups:
- name: ${CLUSTER_NAME}-ng
  instanceType: m5.large
  desiredCapacity: 1
  minSize: 1
  maxSize: 2
  amiFamily: AmazonLinux2

addons:
- name: eks-pod-identity-agent
EOF
) | eksctl create cluster -f -
```

When the command completes you will have:

* An EKS cluster with one **m5.large** node.
* OIDC provider enabled for IAM Roles for Service Accounts (IRSA).

Retrieve a few handy values:

```bash
export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name "${CLUSTER_NAME}" --query 'cluster.endpoint' --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
export EXTERNALSECRETS_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-external-secrets"
```
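
`eksctl` normally updates your kubeconfig automatically; if your current context does not point at the new cluster, you can recreate it and confirm the bootstrap node is `Ready`:

```shell
# Refresh kubeconfig and verify connectivity to the new cluster.
aws eks update-kubeconfig --name "${CLUSTER_NAME}" --region "${AWS_DEFAULT_REGION}"
kubectl get nodes -o wide   # expect one m5.large node in Ready state
```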

***

### Create the RDS database

Dynamiq stores structured metadata in PostgreSQL. A convenience CloudFormation stack provisions a single‐AZ **db.t3.medium** instance with encrypted storage.

```bash
export RDS_PASSWORD="d$(openssl rand -hex 16)"  # 33-char random password beginning with a letter

aws cloudformation deploy \
  --stack-name "${CLUSTER_NAME}-rds" \
  --template-file ./dynamiq-stack-rds.yaml \
  --parameter-overrides ClusterName=${CLUSTER_NAME} DBMasterUserPassword=${RDS_PASSWORD}
```

> **Security note**  Store `RDS_PASSWORD` securely (e.g. in AWS Secrets Manager) after creation.
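
One way to do that immediately (the secret name below is only a suggestion):

```shell
# Park the master password in Secrets Manager so it never lands on disk.
aws secretsmanager create-secret \
  --name "${CLUSTER_NAME}-rds-master-password" \
  --description "RDS master password for ${CLUSTER_NAME}" \
  --secret-string "${RDS_PASSWORD}"
```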

***

### Install Karpenter

```bash
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true

helm registry logout public.ecr.aws 2>/dev/null || true
export KARPENTER_VERSION="1.0.6"

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace kube-system \
  --create-namespace \
  --set replicas=1 \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
```
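
Before creating node pools, confirm the controller came up cleanly; the pod should report `Running` and the recent logs should be free of permission errors:

```shell
# Verify the Karpenter controller is healthy.
kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter --tail=20
```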

***

### Create the node pools

The following manifests declare two node pools:

* **Platform (m5)**: web/API workloads.
* **GPU (g5)**: model inference.

```bash
# Platform nodes
cat <<'EOF' | envsubst | kubectl apply -f -
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: platform
spec:
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  amiFamily: AL2
  amiSelectorTerms:
    - id: ${AMD_AMI_ID}
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        encrypted: true
        iops: 3000
        throughput: 125
        volumeSize: 100Gi
        volumeType: gp3
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: m5
spec:
  disruption:
    budgets:
      - nodes: 10%
    consolidationPolicy: WhenEmptyOrUnderutilized
  limits:
    cpu: 128
  template:
    spec:
      expireAfter: 48h
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: platform
      requirements:
        - key: getdynamiq.ai/workload
          operator: In
          values: ["application"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["m5"]
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["large","xlarge","2xlarge","4xlarge","8xlarge"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
EOF

# GPU nodes
cat <<'EOF' | envsubst | kubectl apply -f -
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: gpu
spec:
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  amiFamily: AL2
  amiSelectorTerms:
    - id: ${GPU_AMI_ID}
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        encrypted: true
        iops: 3000
        throughput: 125
        volumeSize: 300Gi
        volumeType: gp3
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-g5
spec:
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 1m0s
  limits:
    cpu: 256
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: gpu
      requirements:
        - key: nvidia.com/gpu
          operator: In
          values: ["true"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g5"]
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["xlarge","2xlarge","4xlarge","8xlarge","16xlarge","12xlarge","24xlarge","48xlarge"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      taints:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
EOF
```
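
A quick way to confirm both pools registered (nodes only appear later, once workloads request capacity):

```shell
# List the node classes and node pools Karpenter now manages.
kubectl get ec2nodeclasses,nodepools
```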

***

### Install External Secrets & supporting add‑ons

```bash
# External Secrets Operator
helm upgrade --install external-secrets external-secrets \
  --repo https://charts.external-secrets.io \
  --namespace external-secrets \
  --create-namespace \
  --wait

# Ingress Nginx
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.ingressClassResource.default=true \
  --wait
```

Create a **ClusterSecretStore** pointing External Secrets to Secrets Manager:

```bash
cat <<'EOF' | envsubst | kubectl apply -f -
apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: dynamiq
spec:
  provider:
    aws:
      region: "${AWS_DEFAULT_REGION}"
      service: SecretsManager
EOF
```
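
The store should report a ready status within a few seconds; if it stays not-ready, the External Secrets pod identity wiring is the usual suspect:

```shell
# The STATUS/conditions should indicate the store is Valid.
kubectl get clustersecretstore dynamiq
kubectl describe clustersecretstore dynamiq | tail -n 5
```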

***

### Store Dynamiq secrets

Update the placeholders *before* running:

```bash
cat > dynamiq_secrets.json <<'EOF'
{
  "AUTH_ACCESS_TOKEN_KEY": "<CHANGE_THIS_VALUE>",
  "AUTH_REFRESH_TOKEN_KEY": "<CHANGE_THIS_VALUE>",
  "AUTH_VERIFICATION_TOKEN_KEY": "<CHANGE_THIS_VALUE>",
  "AUTH_INTERNAL_TOKEN_KEY": "<CHANGE_THIS_VALUE>"
  "HUGGING_FACE_ACCESS_TOKEN": "<CHANGE_THIS_VALUE>",
  "SMTP_HOST": "<CHANGE_THIS_VALUE>",
  "SMTP_USERNAME": "<CHANGE_THIS_VALUE>",
  "SMTP_PASSWORD": "<CHANGE_THIS_VALUE>",
  "FIRECRAWL_API_KEY": "<CHANGE_THIS_VALUE>",
  "TOGETHER_API_KEY": "<CHANGE_THIS_VALUE>",
  "OPENAI_API_KEY": "<CHANGE_THIS_VALUE>"
}
EOF

aws secretsmanager create-secret \
  --name DYNAMIQ \
  --description "Dynamiq Platform Secret" \
  --secret-string file://dynamiq_secrets.json
```
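
To confirm the secret stored correctly without echoing any values, you can list its keys and then delete the local plaintext file:

```shell
# Show only the key names, never the values, then remove the local copy.
aws secretsmanager get-secret-value --secret-id DYNAMIQ \
  --query SecretString --output text | jq 'keys'
rm -f dynamiq_secrets.json
```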

***

### Create the Dynamiq service account

```bash
kubectl create namespace dynamiq || true   # tolerate re-runs

eksctl create iamserviceaccount \
  --name dynamiq-aws \
  --namespace dynamiq \
  --cluster ${CLUSTER_NAME} \
  --attach-policy-arn arn:aws:iam::aws:policy/AWSMarketplaceMeteringFullAccess \
  --attach-policy-arn arn:aws:iam::aws:policy/AWSMarketplaceMeteringRegisterUsage \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AWSLicenseManagerConsumptionPolicy \
  --approve \
  --override-existing-serviceaccounts
```
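
You can verify that IRSA wired up correctly by checking the role annotation `eksctl` added to the service account:

```shell
# An IAM role ARN in the output means IRSA is configured.
kubectl get serviceaccount dynamiq-aws -n dynamiq \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}{"\n"}'
```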

***

### Prepare the S3 bucket and Helm values

```bash
export STORAGE_S3_BUCKET="${CLUSTER_NAME}-data-$(openssl rand -hex 4)"

# Note: if your region is us-east-1, omit --create-bucket-configuration
# entirely; that region rejects the LocationConstraint parameter.
aws s3api create-bucket \
  --bucket "${STORAGE_S3_BUCKET}" \
  --region "${AWS_DEFAULT_REGION}" \
  --create-bucket-configuration LocationConstraint="${AWS_DEFAULT_REGION}"
```
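
Optionally, harden the new bucket before use; blocking public access and enabling default encryption are sensible defaults for a data bucket:

```shell
# Block all public access to the bucket.
aws s3api put-public-access-block \
  --bucket "${STORAGE_S3_BUCKET}" \
  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

# Enable SSE-S3 default encryption.
aws s3api put-bucket-encryption \
  --bucket "${STORAGE_S3_BUCKET}" \
  --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
```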

Create a **local.values.yaml** file with domain overrides:

```bash
envsubst <<EOF > local.values.yaml
dynamiq:
  domain: ${BASE_DOMAIN}

nexus:
  image:
    repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/dynamiq/enterprise/nexus
  ingress:
    enabled: true
  externalSecrets:
    enabled: true
  configMapData:
    SMTP_FROM_NAME: 'Dynamiq'
    SMTP_FROM_EMAIL: 'noreply@dynamiq.local'
    STORAGE_SERVICE: s3
    STORAGE_S3_BUCKET: ${STORAGE_S3_BUCKET}

synapse:
  image:
    repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/dynamiq/enterprise/synapse
  ingress:
    enabled: true
  externalSecrets:
    enabled: true
  configMapData:
    STORAGE_SERVICE: s3
    STORAGE_S3_BUCKET: ${STORAGE_S3_BUCKET}

catalyst:
  image:
    repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/dynamiq/enterprise/catalyst
  externalSecrets:
    enabled: true
  configMapData:
    STORAGE_SERVICE: s3
    STORAGE_S3_BUCKET: ${STORAGE_S3_BUCKET}

ui:
  image:
    repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/dynamiq/enterprise/ui
  ingress:
    enabled: true
  configMapData: {}
EOF
```

***

### Authenticate to ECR and deploy Dynamiq

```bash
aws ecr get-login-password --region us-east-1 | \
  helm registry login --username AWS --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com

helm upgrade --install dynamiq oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/dynamiq/enterprise/dynamiq \
  --namespace dynamiq \
  --values local.values.yaml \
  --wait
```

***

### Validate the deployment

```bash
kubectl get ingress -n dynamiq -o wide
```

Create `A` (alias) or `CNAME` records in Route 53 for each hostname listed in the ingress output, pointing to the load balancer address shown in the `ADDRESS` column. Once DNS propagates you should be able to visit:

* `https://app.${BASE_DOMAIN}` — Dynamiq web console
* `https://api.${BASE_DOMAIN}` — Dynamiq API
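
A quick way to confirm the records resolve (assuming `dig` is available locally; propagation can take several minutes):

```shell
# Print the first resolved address for each expected hostname.
for host in "app.${BASE_DOMAIN}" "api.${BASE_DOMAIN}"; do
  echo "${host} -> $(dig +short "${host}" | head -n1)"
done
```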

***

### Cleanup (optional)

The following commands remove **all** resources created by this guide. **Irreversible!**

```bash
helm uninstall dynamiq -n dynamiq || true
helm uninstall karpenter -n kube-system || true
kubectl delete nodeclaims --all || true

aws cloudformation delete-stack --stack-name "${CLUSTER_NAME}-rds"
aws cloudformation delete-stack --stack-name "${CLUSTER_NAME}"
eksctl delete cluster --name "${CLUSTER_NAME}"

aws secretsmanager delete-secret --secret-id DYNAMIQ --force-delete-without-recovery

# S3 buckets must be empty before they can be deleted.
aws s3 rm "s3://${STORAGE_S3_BUCKET}" --recursive
aws s3api delete-bucket --bucket "${STORAGE_S3_BUCKET}"
```

***

### Next Steps

* Enable **HTTPS** by attaching an AWS Certificate Manager (ACM) certificate to the ALB Ingress Controller or by terminating TLS at an external load balancer.
* Adjust **Karpenter NodePool limits** to meet your workload demands.
* Integrate with your **observability stack** (Dynatrace, Datadog, CloudWatch) using Helm `--set` overrides.

Enjoy building with Dynamiq! ✨

