Data extractor

DX offers a self-hosted Data Extractor for customers who need to keep API credentials within their network or cannot allowlist incoming requests from DX. It connects to on-prem tools like GitLab and Jira, and pushes metadata to your Data Cloud database.

The Extractor is distributed as a Docker image. You’ll run a separate instance (e.g., a K8s pod) for each data source. For example, to connect both GitLab and Jira, you would deploy two Extractor instances, each configured with environment variables for its respective tool.

Docker images are distributed via GitHub Package Registry:

https://github.com/orgs/get-dx/packages/container/package/extractor

Requirements

Deployment

Recommended method: Kubernetes (GKE, EKS, AKS)

  1. Create a new Kubernetes cluster
  2. Set up logging for support/debugging
  3. Copy and customize the appropriate deployment YAML (see below)
  4. Run kubectl apply to deploy
  5. Use kubectl logs to verify startup

Monitoring

DX monitors import success. For additional monitoring:

  • Check logs for crashes or failed imports
  • Monitor pod restartCount
  • Alert on log patterns
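
The restartCount check can be automated. The following is a minimal sketch, assuming your cluster runs the Prometheus Operator and kube-state-metrics (neither is required by DX) and that the container is named dx-extractor as in the templates below. The rule name, alert name, and time window are illustrative.

```yaml
# Hypothetical alert rule; assumes Prometheus Operator and kube-state-metrics.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: dx-extractor-alerts          # illustrative name
spec:
  groups:
    - name: dx-extractor
      rules:
        - alert: DxExtractorRestarted  # illustrative alert name
          # True while any dx-extractor container has restarted in the last 15 minutes.
          expr: increase(kube_pod_container_status_restarts_total{container="dx-extractor"}[15m]) > 0
          labels:
            severity: warning
          annotations:
            summary: "DX Extractor container restarted"
```

Without such a stack, the same signal is visible in the RESTARTS column of kubectl get pods.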

YAML Templates

GitHub

Required environment variables

Name Description
EXTRACTION_TYPE Must be set to github

Example:
github
DATACLOUD_URL Your Data Cloud instance URL.

Example:
https://yourinstance.getdx.net
DATACLOUD_KEY Data Cloud API key.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
GITHUB_URL API base URL of your GitHub instance.

Example:
https://github.myteam.com/api/v3/
GITHUB_APP_ID GitHub App ID

Example:
320840
GITHUB_PEM_64 Base64-encoded contents of your GitHub App private key (PEM file).
EXTRACTOR_PROXY_URL Optional. Proxy URL used to forward API requests to Data Cloud.

Example:
proxy.getdx.net
EXTRACTOR_PROXY_PORT Proxy port

Example:
80
EXTRACTOR_PROXY_USER Proxy username

Example:
dxuser
EXTRACTOR_PROXY_PASS Proxy password
GITHUB_EXTRACT_PULL_COMMITS Optional. Enhanced GitHub extraction that pulls commits for each pull request.

Example:
true
GITHUB_EXTRACT_TRUNK_COMMITS Optional. Extract commits from the default branch of each repository—useful for teams using trunk-based development. Requires contents:read permission on your GitHub App.

Example:
true
GITHUB_EXTRACT_ONLY_PRIVATE_REPOS Optional. Only extract private repositories (by default, both public and private repositories are extracted).

Example:
true
GITHUB_EXTRACT_FORKED_REPOS Optional. Extract forked repositories (by default, forked repositories are skipped).

Example:
true
GITHUB_EXTRACT_ARCHIVED_REPOS Optional. Extract archived repositories (by default, archived repositories are skipped).

Example:
true
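
Optional flags are passed like any other environment variable. As a minimal sketch, these entries could be appended to the env list of the GitHub Deployment; which flags you enable is up to you, and the selection here is only illustrative.

```yaml
# Extra entries for the container's env list in the GitHub Deployment.
# Kubernetes env values must be strings, so booleans are quoted.
- name: GITHUB_EXTRACT_PULL_COMMITS
  value: "true"
- name: GITHUB_EXTRACT_TRUNK_COMMITS   # requires contents:read on your GitHub App
  value: "true"
- name: GITHUB_EXTRACT_FORKED_REPOS
  value: "true"
```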

Kubernetes deployment YAML template (GitHub)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-github
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-github
  template:
    metadata:
      labels:
        app: dx-extractor-github
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: github-connector-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: github-connector-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "github"
            - name: GITHUB_PEM_64
              valueFrom:
                secretKeyRef:
                  name: github-connector-secrets
                  key: GITHUB_PEM_64
            - name: GITHUB_URL
              value: "https://api.github.com"
            - name: GITHUB_APP_ID
              valueFrom:
                secretKeyRef:
                  name: github-connector-secrets
                  key: GITHUB_APP_ID
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always
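
The template above reads its credentials from a Kubernetes Secret named github-connector-secrets, which must exist before the Deployment starts. A minimal sketch of that Secret follows; the values are placeholders, and in practice you may prefer to create it from your own secret manager rather than a committed manifest.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: github-connector-secrets   # name referenced by the template above
type: Opaque
stringData:                        # plain-text input; Kubernetes stores it base64-encoded
  DATACLOUD_URL: "https://yourinstance.getdx.net"
  DATACLOUD_KEY: "<your Data Cloud API key>"
  GITHUB_APP_ID: "320840"          # quoted, because Secret values must be strings
  GITHUB_PEM_64: "<base64-encoded contents of your PEM file>"
```

Note that GITHUB_PEM_64 is expected to already be Base64-encoded by you; the encoding Kubernetes applies when storing the Secret is separate and is undone before the value reaches the container.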

GitLab

Required environment variables

Name Description
EXTRACTION_TYPE Must be set to gitlab

Example:
gitlab
DATACLOUD_URL Your Data Cloud instance URL.

Example:
https://yourinstance.getdx.net
DATACLOUD_KEY Data Cloud API key.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
GITLAB_URL API base URL of your GitLab instance.

Example:
https://gitlab.com/
GITLAB_API_TOKEN GitLab API token (for example, a personal access token).

Example:
glpat-31RAZpMWxzX_m9BBnLyY
EXTRACTOR_PROXY_URL Proxy URL used to send API requests to Data Cloud.

Example:
proxy.getdx.net
EXTRACTOR_PROXY_PORT Proxy port

Example:
80
EXTRACTOR_PROXY_USER Proxy username

Example:
dxuser
EXTRACTOR_PROXY_PASS Proxy password
GITLAB_EXTRACT_FORKED_PROJECTS Optional. Extract forked projects (by default, forked projects are skipped).

Example:
true
GITLAB_EXTRACT_ARCHIVED_PROJECTS Optional. Extract archived projects (by default, archived projects are skipped).

Example:
true
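
If requests to Data Cloud go through a proxy, the four EXTRACTOR_PROXY_* variables can be added to the env list. This is a minimal sketch: it assumes you store the proxy credentials in the same Secret the GitLab template uses, under key names chosen here for illustration.

```yaml
# Extra entries for the container's env list when a proxy is used.
- name: EXTRACTOR_PROXY_URL
  value: "proxy.getdx.net"
- name: EXTRACTOR_PROXY_PORT
  value: "80"                       # quoted: env values must be strings
- name: EXTRACTOR_PROXY_USER
  valueFrom:
    secretKeyRef:
      name: gitlab-connector-secrets
      key: EXTRACTOR_PROXY_USER     # assumed key; add it to your Secret
- name: EXTRACTOR_PROXY_PASS
  valueFrom:
    secretKeyRef:
      name: gitlab-connector-secrets
      key: EXTRACTOR_PROXY_PASS     # assumed key; add it to your Secret
```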

Kubernetes deployment YAML template (GitLab)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-gitlab
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-gitlab
  template:
    metadata:
      labels:
        app: dx-extractor-gitlab
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: gitlab-connector-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: gitlab-connector-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "gitlab"
            - name: GITLAB_URL
              value: "https://gitlab.com/"
            - name: GITLAB_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: gitlab-connector-secrets
                  key: GITLAB_API_TOKEN
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always

Bitbucket Data Center

Required environment variables

Name Description
EXTRACTION_TYPE Must be set to bitbucket_data_center

Example:
bitbucket_data_center
DATACLOUD_URL Your Data Cloud instance URL.

Example:
https://yourinstance.getdx.net
DATACLOUD_KEY Data Cloud API key.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
BITBUCKET_URL API base URL of your Bitbucket Data Center instance.

Example:
https://bitbucket.somehost.net
BITBUCKET_USERNAME Username of your Bitbucket service account (if using basic auth).

Example:
dxuser
BITBUCKET_PASSWORD Password of your Bitbucket service account (if using basic auth).

Example:
password
BITBUCKET_API_KEY API key of your Bitbucket service account (if not using basic auth).

Example:
api_key
EXTRACTOR_PROXY_URL Proxy URL used to send API requests to Data Cloud.

Example:
proxy.getdx.net
EXTRACTOR_PROXY_PORT Proxy port

Example:
80
EXTRACTOR_PROXY_USER Proxy username

Example:
dxuser
EXTRACTOR_PROXY_PASS Proxy password
BITBUCKET_PROJECT_KEYS_ALLOWLIST Optional. Comma-delimited list of project keys for DX to import.

Example:
PROJ1,PROJ2
BITBUCKET_IMPORT_COMMITS Optional. Set to true to enable commit data ingestion from Bitbucket Data Center. Your Data Cloud environment must have the Bitbucket Data Center commits schema applied for commits to be stored; see conditional schemas.

Example:
true

Kubernetes deployment YAML template (Bitbucket Data Center)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-bitbucket
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-bitbucket
  template:
    metadata:
      labels:
        app: dx-extractor-bitbucket
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "bitbucket_data_center"
            - name: BITBUCKET_URL
              value: "https://bitbucket.somehost.net"
            - name: BITBUCKET_API_KEY     # Required for API key auth only
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: BITBUCKET_API_KEY
            - name: BITBUCKET_USERNAME    # Required for basic auth only
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: BITBUCKET_USERNAME
            - name: BITBUCKET_PASSWORD    # Required for basic auth only
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: BITBUCKET_PASSWORD
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always
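
The template lists entries for both authentication methods. In Kubernetes, a secretKeyRef that points at a key missing from the Secret prevents the container from starting, so you would normally keep only the entries for the method you use. A minimal sketch for API key auth follows; the optional: true variant is a standard Kubernetes field, but whether the Extractor tolerates the variable being unset is an assumption.

```yaml
# Auth entries for API key auth only (basic-auth entries removed).
- name: BITBUCKET_API_KEY
  valueFrom:
    secretKeyRef:
      name: dx-secrets
      key: BITBUCKET_API_KEY
# Alternative: keep a basic-auth entry but mark it optional, so a missing
# Secret key leaves the variable unset instead of blocking startup.
- name: BITBUCKET_USERNAME
  valueFrom:
    secretKeyRef:
      name: dx-secrets
      key: BITBUCKET_USERNAME
      optional: true
```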

Jira Data Center

Required environment variables

Name Description
EXTRACTION_TYPE Must be set to jira_data_center

Example:
jira_data_center
DATACLOUD_URL Your Data Cloud instance URL.

Example:
https://yourinstance.getdx.net
DATACLOUD_KEY Data Cloud API key.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
JIRA_URL API base URL of your Jira Data Center instance.

Example:
https://jira.somehost.net/rest/api/2/
JIRA_API_TOKEN Personal Access Token (PAT) for your Jira service account.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
JIRA_USERNAME Username of your Jira service account (if using basic auth).

Example:
dxuser
JIRA_PASSWORD Password of your Jira service account (if using basic auth).

Example:
password
EXTRACTOR_PROXY_URL Proxy URL used to send API requests to Data Cloud.

Example:
proxy.getdx.net
EXTRACTOR_PROXY_PORT Proxy port

Example:
80
EXTRACTOR_PROXY_USER Proxy username

Example:
dxuser
EXTRACTOR_PROXY_PASS Proxy password

User Linking
Unlike other Jira integrations, the Jira extractor does NOT extract user data on its own. Instead, as Jira issues come in, DX looks at the creator and assignee and creates or updates the Jira user record in the database accordingly. This may delay the syncing of user data or leave some Jira usernames unlinked.

Kubernetes deployment YAML template (Jira Data Center)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-jira
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-jira
  template:
    metadata:
      labels:
        app: dx-extractor-jira
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "jira_data_center"
            - name: JIRA_URL
              value: "https://jira.somehost.net/rest/api/2/"
            - name: JIRA_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: JIRA_API_TOKEN
            - name: JIRA_USERNAME    # Required for basic auth only
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: JIRA_USERNAME
            - name: JIRA_PASSWORD    # Required for basic auth only
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: JIRA_PASSWORD
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always

Azure DevOps (ADO) Server

Required environment variables

Name Description
EXTRACTION_TYPE Must be set to ado_server

Example:
ado_server
DATACLOUD_URL Your Data Cloud instance URL.

Example:
https://yourinstance.getdx.net
DATACLOUD_KEY Data Cloud API key.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
ADO_SERVER_BASE_URL Base URL of your Azure DevOps Server instance.

Example:
https://devops.mycompany.com
ADO_SERVER_ORGANIZATION_NAME The organization or collection name in Azure DevOps Server.

Example:
DefaultCollection
ADO_SERVER_PERSONAL_ACCESS_TOKEN Personal Access Token for authenticating with Azure DevOps Server.

Example:
your-ado-personal-access-token
ADO_SERVER_CONNECTOR_TYPE Type of data to extract. Must be repos, boards, or pipelines.
  • repos: Extracts repository data
  • boards: Extracts work item/board data
  • pipelines: Extracts pipeline/build data

Example:
repos
EXTRACTOR_PROXY_URL Proxy URL used to send API requests to Data Cloud.

Example:
proxy.getdx.net
EXTRACTOR_PROXY_PORT Proxy port

Example:
80
EXTRACTOR_PROXY_USER Proxy username

Example:
dxuser
EXTRACTOR_PROXY_PASS Proxy password
EXTRACTOR_ID Unique identifier for the extractor instance. Required when running multiple instances (for example, separate repos, boards, and pipelines extractors). Any value works as long as it is unique per instance.

Example:
102

Docker Compose template (ADO Server)

services:
  extractor:
    image: ghcr.io/get-dx/extractor:latest
    environment:
      DATACLOUD_URL: "https://yourinstance.getdx.net"
      DATACLOUD_KEY: "mPB5sf6w3JahSLMherWp8B7nTps13FKY"
      EXTRACTION_TYPE: "ado_server"
      ADO_SERVER_BASE_URL: "https://devops.mycompany.com"
      ADO_SERVER_ORGANIZATION_NAME: "DefaultCollection"
      ADO_SERVER_PERSONAL_ACCESS_TOKEN: "your-personal-access-token"
      ADO_SERVER_CONNECTOR_TYPE: "repos" # Use "repos", "boards", or "pipelines"
      EXTRACTOR_ID: "102"
      LOG_LEVEL: "DEBUG"
      LOG_FORMAT: "json"
    restart: always
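
Because each connector type runs as its own instance, a full ADO Server setup needs three services with distinct EXTRACTOR_ID values. The following is a minimal sketch, assuming the standard image tag used in the Kubernetes templates; the service names, the x-ado-common block, and the ID values are illustrative. It relies on Compose extension fields and YAML merge keys to avoid repeating the shared settings.

```yaml
# Shared settings, declared once as a Compose extension field with a YAML anchor.
x-ado-common: &ado-common
  DATACLOUD_URL: "https://yourinstance.getdx.net"
  DATACLOUD_KEY: "your-datacloud-api-key"
  EXTRACTION_TYPE: "ado_server"
  ADO_SERVER_BASE_URL: "https://devops.mycompany.com"
  ADO_SERVER_ORGANIZATION_NAME: "DefaultCollection"
  ADO_SERVER_PERSONAL_ACCESS_TOKEN: "your-personal-access-token"

services:
  extractor-repos:                  # hypothetical service name
    image: ghcr.io/get-dx/extractor:latest
    environment:
      <<: *ado-common
      ADO_SERVER_CONNECTOR_TYPE: "repos"
      EXTRACTOR_ID: "101"           # any value, unique per instance
    restart: always
  extractor-boards:                 # hypothetical service name
    image: ghcr.io/get-dx/extractor:latest
    environment:
      <<: *ado-common
      ADO_SERVER_CONNECTOR_TYPE: "boards"
      EXTRACTOR_ID: "102"
    restart: always
  extractor-pipelines:              # hypothetical service name
    image: ghcr.io/get-dx/extractor:latest
    environment:
      <<: *ado-common
      ADO_SERVER_CONNECTOR_TYPE: "pipelines"
      EXTRACTOR_ID: "103"
    restart: always
```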

Kubernetes deployment YAML template (ADO Server - Repos)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-ado-repos
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-ado-repos
  template:
    metadata:
      labels:
        app: dx-extractor-ado-repos
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "ado_server"
            - name: ADO_SERVER_BASE_URL
              value: "https://devops.mycompany.com"
            - name: ADO_SERVER_ORGANIZATION_NAME
              value: "DefaultCollection"
            - name: ADO_SERVER_PERSONAL_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: ADO_SERVER_PERSONAL_ACCESS_TOKEN
            - name: ADO_SERVER_CONNECTOR_TYPE
              value: "repos"
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always

Kubernetes deployment YAML template (ADO Server - Boards)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-ado-boards
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-ado-boards
  template:
    metadata:
      labels:
        app: dx-extractor-ado-boards
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "ado_server"
            - name: ADO_SERVER_BASE_URL
              value: "https://devops.mycompany.com"
            - name: ADO_SERVER_ORGANIZATION_NAME
              value: "DefaultCollection"
            - name: ADO_SERVER_PERSONAL_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: ADO_SERVER_PERSONAL_ACCESS_TOKEN
            - name: ADO_SERVER_CONNECTOR_TYPE
              value: "boards"
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always

Kubernetes deployment YAML template (ADO Server - Pipelines)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-ado-pipelines
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-ado-pipelines
  template:
    metadata:
      labels:
        app: dx-extractor-ado-pipelines
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:latest
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "ado_server"
            - name: ADO_SERVER_BASE_URL
              value: "https://devops.mycompany.com"
            - name: ADO_SERVER_ORGANIZATION_NAME
              value: "DefaultCollection"
            - name: ADO_SERVER_PERSONAL_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  name: dx-secrets
                  key: ADO_SERVER_PERSONAL_ACCESS_TOKEN
            - name: ADO_SERVER_CONNECTOR_TYPE
              value: "pipelines"
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always

The REST API version supported by your server depends on which version of ADO Server you have installed:

ADO Server Version | Maximum API Version
2019               | 5.0
2020               | 5.1
2022+              | 6.0

DX defaults to API version 6.0.

If your server only supports an older version, set ADO_SERVER_API_VERSION to match your server’s maximum supported version (e.g. 5.0 for ADO Server 2019).
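
In the Kubernetes templates above, that override is one more entry in the env list. A minimal sketch for ADO Server 2019:

```yaml
# Extra entry for the container's env list; pins the REST API version.
- name: ADO_SERVER_API_VERSION
  value: "5.0"   # quoted so YAML treats it as a string, not a number
```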

Perforce (Helix Core)

Setup instructions

  1. Create a dedicated Perforce service account for DX extraction.
  2. Ensure the account can access the depots and changelists you want DX to import.
  3. If you want review metadata from Helix Swarm, create a Swarm account with API access.
  4. Add the required environment variables below and deploy one extractor instance with EXTRACTION_TYPE=perforce_extractor.
  5. Verify startup logs show successful Perforce connection checks before relying on scheduled syncs.

Required and optional environment variables

Name Description
EXTRACTION_TYPE Must be set to perforce_extractor.

Example:
perforce_extractor
DATACLOUD_URL Your Data Cloud instance URL.

Example:
https://yourinstance.getdx.net
DATACLOUD_KEY Data Cloud API key.

Example:
mPB5sf6w3JahSLMherWp8B7nTps13FKY
PERFORCE_PORT Perforce server address and port (supports SSL endpoints).

Example:
ssl:perforce.company.com:1666
PERFORCE_USERNAME Perforce username for the service account.

Example:
dx-extractor
PERFORCE_PASSWORD Perforce password or ticket secret for the service account.
SWARM_URL Optional. Base URL of your Helix Swarm instance.

Example:
https://swarm.company.com
SWARM_USERNAME Optional. Swarm username (required when SWARM_URL is set).

Example:
dx-swarm
SWARM_PASSWORD Optional. Swarm password or token (required when SWARM_URL is set).
EXTRACTOR_ID Optional. Unique ID for this extractor instance. Recommended when running multiple Perforce extractors.

Example:
perforce-prod-1
EXTRACTOR_PROXY_URL Optional. Proxy URL to forward requests to Data Cloud.

Example:
proxy.getdx.net
EXTRACTOR_PROXY_PORT Optional. Proxy port.

Example:
80
EXTRACTOR_PROXY_USER Optional. Proxy username.

Example:
dxuser
EXTRACTOR_PROXY_PASS Optional. Proxy password.
SLEEP_DURATION Optional. Polling interval in seconds between extraction cycles.

Example:
300
LOG_LEVEL Optional. Log verbosity (debug, info, etc.).

Example:
debug
LOG_FORMAT Optional. Log format (json or text).

Example:
json

Docker Compose template (Perforce)

services:
  extractor:
    image: ghcr.io/get-dx/extractor:perforce
    environment:
      DATACLOUD_URL: "https://yourinstance.getdx.net"
      DATACLOUD_KEY: "your-datacloud-api-key"
      EXTRACTION_TYPE: "perforce_extractor"
      PERFORCE_PORT: "ssl:perforce.company.com:1666"
      PERFORCE_USERNAME: "dx-extractor"
      PERFORCE_PASSWORD: "your-perforce-password-or-ticket"
      SWARM_URL: "https://swarm.company.com" # Optional
      SWARM_USERNAME: "dx-swarm" # Required when SWARM_URL is set
      SWARM_PASSWORD: "your-swarm-password-or-token" # Required when SWARM_URL is set
      EXTRACTOR_ID: "perforce-prod-1" # Recommended if multiple Perforce extractors are deployed
      LOG_LEVEL: "debug"
      LOG_FORMAT: "json"
    restart: always

Use the Perforce-specific image tags for Perforce extraction:

  • ghcr.io/get-dx/extractor:perforce (GLIBC/slim)
  • ghcr.io/get-dx/extractor:perforce-alpine (Alpine)

Versioned tags are also available:

  • ghcr.io/get-dx/extractor:<version>-perforce
  • ghcr.io/get-dx/extractor:<version>-perforce-alpine

If you build the extractor image from source instead of pulling these GHCR tags, include the Perforce SDK during build:

services:
  extractor:
    build:
      context: .
      args:
        INCLUDE_P4: "true"

Kubernetes deployment YAML template (Perforce)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dx-extractor-perforce
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dx-extractor-perforce
  template:
    metadata:
      labels:
        app: dx-extractor-perforce
    spec:
      containers:
        - name: dx-extractor
          image: ghcr.io/get-dx/extractor:perforce
          env:
            - name: DATACLOUD_URL
              valueFrom:
                secretKeyRef:
                  name: perforce-connector-secrets
                  key: DATACLOUD_URL
            - name: DATACLOUD_KEY
              valueFrom:
                secretKeyRef:
                  name: perforce-connector-secrets
                  key: DATACLOUD_KEY
            - name: EXTRACTION_TYPE
              value: "perforce_extractor"
            - name: PERFORCE_PORT
              value: "ssl:perforce.company.com:1666"
            - name: PERFORCE_USERNAME
              valueFrom:
                secretKeyRef:
                  name: perforce-connector-secrets
                  key: PERFORCE_USERNAME
            - name: PERFORCE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: perforce-connector-secrets
                  key: PERFORCE_PASSWORD
            - name: SWARM_URL
              value: "https://swarm.company.com" # Optional
            - name: SWARM_USERNAME
              valueFrom:
                secretKeyRef:
                  name: perforce-connector-secrets
                  key: SWARM_USERNAME
            - name: SWARM_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: perforce-connector-secrets
                  key: SWARM_PASSWORD
            - name: EXTRACTOR_ID
              value: "perforce-prod-1"
            - name: LOG_LEVEL
              value: "DEBUG"
            - name: LOG_FORMAT
              value: "json"
      restartPolicy: Always

P4Ruby SDK methods used

Wrapper method | p4ruby call | Purpose
list_depots | p4.run_depots | All depots
list_users | p4.run_users("-a") | All users (incl. inactive)
list_groups | p4.run_groups | All groups
list_user_groups(user) | p4.run_groups("-u", user) | Group memberships for a user
list_changes(...) | p4.run_changes("-s", "submitted", "-m", max, "PATH@from,@to") | Submitted changelists, newest first, by depot path + date range
describe_change_stats(cl) | p4.run_describe("-ds", cl) (with p4.tagged = false) | Diff summary (added/deleted/edited lines) for one changelist
server_info | p4.run("info") | Connection verification
connection lifecycle | P4.new, p4.connect, p4.run_login, p4.disconnect | Per-instruction connect/login/disconnect; ticket file at /tmp/.p4tickets, exception_level = P4::RAISE_ERRORS

Swarm REST API endpoints used

Wrapper method | Endpoint | Notes
list_reviews | GET /api/v11/reviews | Params: max, after (pagination cursor), state[]
get_review | GET /api/v11/reviews/{id} | Single review
list_review_activities | GET /api/v11/reviews/{id}/activity | Params: max, after
list_review_comments | GET /api/v11/reviews/{id}/comments |
server_info | GET /api/v11/version | Connection verification