
Problem Summary

In a Docker-in-Docker (DinD) setup, where the worker container runs inside Docker and itself creates containers, three problems arise:
  1. Volume Path Mismatch: Bind-mount paths are resolved against the Docker daemon’s filesystem, not the worker container’s, so a directory created inside the worker is not visible to the daemon at that path
  2. Security Risk: Shared volumes in a multi-tenant SaaS allow cross-tenant data access
  3. Limited stdin Approach: Passing data over stdin does not work for tools that expect files on disk

Solution: Isolated Named Volumes

Use unique Docker named volumes created per tenantId + runId + timestamp:
tenant-${tenantId}-run-${runId}-${timestamp}
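
As an illustration of the naming scheme, here is a minimal sketch of how such a name could be built. The helpers `sanitizeSegment` and `buildVolumeName` are hypothetical and are not part of `IsolatedContainerVolume`. The sketch assumes the timestamp is a Unix time in seconds (as the ten-digit examples in this guide suggest) and replaces characters that Docker does not accept in volume names.

```typescript
// Hypothetical helpers, for illustration only.
// Docker volume names must match [a-zA-Z0-9][a-zA-Z0-9_.-]*,
// so any other character in an ID is replaced with "-".
function sanitizeSegment(value: string): string {
  return value.replace(/[^a-zA-Z0-9_.-]/g, '-');
}

function buildVolumeName(tenantId: string, runId: string, now: Date = new Date()): string {
  // Assumption: Unix timestamp in seconds, e.g. 1732090000.
  const timestamp = Math.floor(now.getTime() / 1000);
  return `tenant-${sanitizeSegment(tenantId)}-run-${sanitizeSegment(runId)}-${timestamp}`;
}

console.log(buildVolumeName('acme', 'wf-abc123', new Date(1732150000 * 1000)));
// prints "tenant-acme-run-wf-abc123-1732150000"
```

Sanitizing matters because the tenant and run IDs usually come from outside the worker, and an unexpected character would make `docker volume create` fail.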

Architecture

┌─────────────────────────────────────────────────────────┐
│ Docker Host                                              │
│                                                          │
│  ┌─────────────────────────────────────────────┐        │
│  │ Worker Container (DinD)                     │        │
│  │                                              │        │
│  │  1. Creates volume via Docker CLI           │        │
│  │     docker volume create tenant-A-run-1-... │        │
│  │                                              │        │
│  │  2. Populates files using temp container    │        │
│  │     docker run -v vol:/data alpine sh -c .. │        │
│  │                                              │        │
│  │  3. Runs actual tool with volume mounted    │        │
│  │     docker run -v vol:/inputs dnsx ...      │        │
│  │                                              │        │
│  │  4. Reads output files using temp container │        │
│  │     docker run -v vol:/data alpine cat ...  │        │
│  │                                              │        │
│  │  5. Cleans up volume                        │        │
│  │     docker volume rm tenant-A-run-1-...     │        │
│  └─────────────────────────────────────────────┘        │
│                                                          │
│  ┌──────────────────────────────────────────┐           │
│  │ Docker Volumes (on Docker Host)          │           │
│  │                                           │           │
│  │  • tenant-A-run-123-1732090000           │           │
│  │  • tenant-B-run-456-1732090001           │           │
│  │  • tenant-A-run-789-1732090002           │           │
│  │                                           │           │
│  │  Each volume isolated per tenant + run   │           │
│  └──────────────────────────────────────────┘           │
└─────────────────────────────────────────────────────────┘

Security Benefits

| Aspect | Old Approach | Isolated Volumes |
| --- | --- | --- |
| Tenant Isolation | ❌ Shared volume or stdin | ✅ Unique volume per tenant+run |
| Path Traversal | ⚠️ Possible with file mounts | ✅ Validated filenames |
| Data Leakage | ❌ Files persist in shared space | ✅ Immediate cleanup |
| Audit Trail | ❌ None | ✅ Volume labels track tenant/run |
| DinD Compatible | ❌ File mounts don’t work | ✅ Named volumes work perfectly |

Implementation

Before: File Mounting (Broken in DinD)

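// ❌ WRONG - Breaks in DinD, no tenant isolation
import { mkdtemp, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import * as path from 'node:path';

// This directory exists only inside the worker container's filesystem,
// so the Docker daemon cannot resolve it as a bind-mount source.
const hostInputDir = await mkdtemp(path.join(tmpdir(), 'dnsx-input-'));
await writeFile(path.join(hostInputDir, 'file.txt'), data);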

const runnerConfig: DockerRunnerConfig = {
  volumes: [
    { source: hostInputDir, target: '/inputs', readOnly: true }
  ]
};

After: Isolated Volumes (DinD Compatible)

// ✅ CORRECT - DinD compatible, tenant isolated
const tenantId = context.tenantId ?? 'default-tenant';
const volume = new IsolatedContainerVolume(tenantId, context.runId);

try {
  await volume.initialize({
    'domains.txt': domains.join('\n'),
    'resolvers.txt': resolvers.join('\n')
  });

  const runnerConfig: DockerRunnerConfig = {
    volumes: [volume.getVolumeConfig('/inputs', true)]
  };

  await runComponentWithRunner(runnerConfig, ...);

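  // Note: the tool can only write results.json into this volume if the mount is
  // read-write (pass false to getVolumeConfig). With the read-only mount above,
  // capture the tool's stdout instead.
  const outputs = await volume.readFiles(['results.json']);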

} finally {
  await volume.cleanup();
}

Comparison: All Approaches

| Feature | File Mounts | stdin Approach | Isolated Volumes |
| --- | --- | --- | --- |
| DinD Compatible | ❌ No | ✅ Yes | ✅ Yes |
| File-based tools | ✅ Yes | ❌ No | ✅ Yes |
| Config files | ✅ Yes | ❌ No | ✅ Yes |
| Output files | ❌ Hard to read | ❌ No | ✅ Yes |
| Binary files | ✅ Yes | ❌ No | ✅ Yes |
| Large files | ✅ Yes | ⚠️ Memory limits | ✅ Yes |
| Tenant isolation | ❌ No | ⚠️ Process-level | ✅ Volume-level |

Usage Examples

Simple Input Files

const volume = new IsolatedContainerVolume(tenantId, runId);

try {
  await volume.initialize({
    'targets.txt': targets.join('\n')
  });

  const config = {
    volumes: [volume.getVolumeConfig('/inputs', true)]
  };

  await runTool(config);
} finally {
  await volume.cleanup();
}

Input + Output Files

const volume = new IsolatedContainerVolume(tenantId, runId);

try {
  await volume.initialize({
    'config.yaml': yamlConfig
  });

  const config = {
    command: ['--input', '/data/config.yaml', '--output', '/data/results.json'],
    volumes: [volume.getVolumeConfig('/data', false)] // Read-write
  };

  await runTool(config);

  const outputs = await volume.readFiles(['results.json', 'summary.txt']);
  return JSON.parse(outputs['results.json']);

} finally {
  await volume.cleanup();
}

Multiple Volumes

const inputVol = new IsolatedContainerVolume(tenantId, `${runId}-in`);
const outputVol = new IsolatedContainerVolume(tenantId, `${runId}-out`);

try {
  await inputVol.initialize({ 'data.csv': csvData });
  await outputVol.initialize({}); // Empty volume for outputs

  const config = {
    volumes: [
      inputVol.getVolumeConfig('/inputs', true),
      outputVol.getVolumeConfig('/outputs', false)
    ]
  };

  await runTool(config);

  const results = await outputVol.readFiles(['output.json']);

} finally {
  await Promise.all([
    inputVol.cleanup(),
    outputVol.cleanup()
  ]);
}

Volume Lifecycle

  1. Create: docker volume create tenant-A-run-123-...
  2. Populate: Use temporary Alpine container to write files
  3. Mount: Container uses the volume via -v volumeName:/path
  4. Read: Use temporary Alpine container to read files
  5. Cleanup: docker volume rm tenant-A-run-123-...
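
The five steps map onto plain Docker CLI invocations. The sketch below only builds the argument lists (for use with something like `execFile('docker', args)`) and does not run anything. The function `lifecycleCommands`, the file names, and the way files are piped into the Alpine container are assumptions for illustration; the actual `IsolatedContainerVolume` implementation may differ.

```typescript
// Argument lists for each lifecycle step. Hypothetical sketch, nothing is executed.
interface LifecycleCommands {
  create: string[];
  populate: string[];
  run: string[];
  read: string[];
  remove: string[];
}

function lifecycleCommands(
  volumeName: string,
  toolImage: string,
  toolArgs: string[],
): LifecycleCommands {
  return {
    // 1. Create the volume and label it so it can be found later.
    create: ['volume', 'create', '--label', 'studio.managed=true', volumeName],
    // 2. Populate: file content is piped to stdin of a short-lived Alpine container.
    populate: ['run', '--rm', '-i', '-v', `${volumeName}:/data`, 'alpine',
      'sh', '-c', 'cat > /data/domains.txt'],
    // 3. Run the tool with the volume mounted read-only at /inputs.
    run: ['run', '--rm', '-v', `${volumeName}:/inputs:ro`, toolImage, ...toolArgs],
    // 4. Read an output file back through another short-lived container.
    read: ['run', '--rm', '-v', `${volumeName}:/data:ro`, 'alpine', 'cat', '/data/results.json'],
    // 5. Remove the volume.
    remove: ['volume', 'rm', volumeName],
  };
}

// 'example/tool:latest' and its arguments are placeholders.
const cmds = lifecycleCommands('tenant-A-run-123-1732090000', 'example/tool:latest',
  ['--input', '/inputs/domains.txt']);
console.log(['docker', ...cmds.run].join(' '));
```

Every mount refers to the volume by name, never by a filesystem path, which is why the same commands work whether or not the worker shares a filesystem with the Docker daemon.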

Automatic Cleanup

Volumes are cleaned up in finally blocks, which run whether the tool succeeds or throws:
try {
  await volume.initialize(...);
  await runTool(...);
} finally {
  await volume.cleanup(); // Always runs, even on error
}
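
To avoid repeating the try/finally pattern in every component, it can be wrapped once. The helper below is a hypothetical sketch, not part of the library. `VolumeLike` is an assumed minimal interface covering only the two methods the wrapper needs.

```typescript
// Assumed minimal shape of the volume API used in this guide.
interface VolumeLike {
  initialize(files: Record<string, string>): Promise<void>;
  cleanup(): Promise<void>;
}

// Hypothetical wrapper: initialize the volume, run the callback,
// and attempt cleanup whether the callback resolves or rejects.
async function withVolume<V extends VolumeLike, T>(
  volume: V,
  files: Record<string, string>,
  fn: (volume: V) => Promise<T>,
): Promise<T> {
  try {
    await volume.initialize(files);
    return await fn(volume);
  } finally {
    await volume.cleanup();
  }
}
```

One caveat of this design: if `cleanup()` itself throws, its error replaces any error thrown by the callback, so a production version may want to catch and log cleanup failures separately.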

Orphan Cleanup

For volumes that weren’t cleaned up (e.g., worker crash):
# List studio-managed volumes
docker volume ls --filter "label=studio.managed=true"

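# Remove unused managed volumes.
# "Unused" means not attached to any container, which can include a volume that
# belongs to an in-flight run between steps, so run this when no executions are active.
# On Docker 23.0+, --all is required for prune to include named volumes.
docker volume prune --all --filter "label=studio.managed=true"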
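
`docker volume prune` removes every unused volume that matches the label, regardless of age. A more conservative sweep can select only volumes older than a cutoff, using the timestamp embedded in the name. The function below is a hypothetical sketch that assumes names follow the `tenant-{tenantId}-run-{runId}-{timestamp}` pattern with a Unix timestamp in seconds.

```typescript
// Returns the volume names whose embedded timestamp is older than maxAgeSeconds.
// Names that do not match the expected pattern are ignored, never deleted.
function findStaleVolumes(names: string[], maxAgeSeconds: number, nowSeconds: number): string[] {
  return names.filter((name) => {
    const match = /^tenant-.+-run-.+-(\d+)$/.exec(name);
    if (!match) return false;
    return nowSeconds - Number(match[1]) > maxAgeSeconds;
  });
}

const names = [
  'tenant-A-run-123-1732090000',
  'tenant-B-run-456-1732093000',
  'unrelated-volume',
];
console.log(findStaleVolumes(names, 3600, 1732094000));
// prints [ 'tenant-A-run-123-1732090000' ]
```

The input list could come from `docker volume ls --filter "label=studio.managed=true" --format "{{.Name}}"`, and each returned name would then be passed to `docker volume rm`.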

Security Requirements

Tenant Isolation

Every execution gets a unique volume:
tenant-{tenantId}-run-{runId}-{timestamp}
Example: tenant-acme-run-wf-abc123-1732150000

Read-Only Mounts

// Input files should be read-only
volume.getVolumeConfig('/inputs', true)  // ✅ read-only

// Only make writable if tool needs to write
volume.getVolumeConfig('/outputs', false)  // ⚠️ read-write
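
For reference, the read-only flag ends up as the `:ro` suffix on Docker's `-v` option. The sketch below shows that translation. The `VolumeMount` shape is assumed from the `{ source, target, readOnly }` objects used earlier in this guide, and `toVolumeFlag` is a hypothetical helper.

```typescript
// Assumed shape of a volume mount entry.
interface VolumeMount {
  source: string;   // named volume
  target: string;   // path inside the container
  readOnly?: boolean;
}

// Builds the value passed to `docker run -v`.
function toVolumeFlag(mount: VolumeMount): string {
  const base = `${mount.source}:${mount.target}`;
  return mount.readOnly ? `${base}:ro` : base;
}

console.log(toVolumeFlag({ source: 'tenant-A-run-123-1732090000', target: '/inputs', readOnly: true }));
// prints "tenant-A-run-123-1732090000:/inputs:ro"
```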

Path Validation

Filenames are automatically validated:
// ✅ OK
await volume.initialize({
  'file.txt': data,
  'subdir/file.txt': data  // Subdirs OK
});

// ❌ Rejected (security)
await volume.initialize({
  '../file.txt': data,     // Path traversal blocked
  '/etc/passwd': data      // Absolute paths blocked
});
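
A check consistent with the accepted and rejected examples above could look like the following. This is a sketch under stated assumptions, not the library's actual validation code, and `isSafeRelativePath` is a hypothetical name.

```typescript
// Accepts relative paths such as "file.txt" or "subdir/file.txt".
// Rejects absolute paths, ".." or "." segments, empty segments, and NUL bytes.
function isSafeRelativePath(filename: string): boolean {
  if (filename.length === 0 || filename.includes('\0')) return false;
  if (filename.startsWith('/')) return false; // absolute path
  const segments = filename.split('/');
  return segments.every((s) => s !== '' && s !== '.' && s !== '..');
}

for (const name of ['file.txt', 'subdir/file.txt', '../file.txt', '/etc/passwd', 'a/../../b']) {
  console.log(name, isSafeRelativePath(name));
}
```

Checking each segment, rather than searching for the substring `..`, keeps legitimate names such as `notes..txt` valid while still blocking traversal anywhere in the path.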

Security Guarantees

| Security Feature | How It Works |
| --- | --- |
| Tenant Isolation | Volume name includes tenant ID |
| No Collisions | Timestamp prevents conflicts |
| Path Safety | Filenames validated (no .. segments, no absolute paths) |
| Automatic Cleanup | finally blocks remove the volume on success or error |
| Audit Trail | Volumes labeled studio.managed=true |
| DinD Compatible | Named volumes work in nested Docker |

Performance

Volume Creation Overhead

  • Creation: ~50-100ms per volume
  • File writes: ~10-50ms per file (depends on size)
  • Cleanup: ~50-100ms per volume
Total overhead: ~100-250ms per execution for a run that writes a single file. This is acceptable for security tools that typically run for seconds or minutes.
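
The estimate scales with the number of files written. As a worked example, the sketch below simply adds up the ranges quoted above. `estimateOverheadMs` is a hypothetical helper, and the figures are the approximate ones from this section, not measurements.

```typescript
// Rough per-execution overhead in milliseconds, from the ranges listed above.
function estimateOverheadMs(fileCount: number): { min: number; max: number } {
  const create = { min: 50, max: 100 };
  const perFile = { min: 10, max: 50 };
  const cleanup = { min: 50, max: 100 };
  return {
    min: create.min + perFile.min * fileCount + cleanup.min,
    max: create.max + perFile.max * fileCount + cleanup.max,
  };
}

console.log(estimateOverheadMs(1)); // { min: 110, max: 250 }
console.log(estimateOverheadMs(3)); // { min: 130, max: 350 }
```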

Optimization Tips

  1. Batch file writes: Write all files in one initialize() call
  2. Reuse volumes: For sequential operations in same run, reuse the volume
  3. Lazy cleanup: Clean up volumes in background job if latency-sensitive
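
The lazy cleanup tip could be implemented with a small in-memory queue. The `CleanupQueue` class below is a hypothetical sketch. Callers enqueue a cleanup function instead of awaiting it, and a background task drains the queue periodically.

```typescript
// Hypothetical deferred-cleanup queue.
class CleanupQueue {
  private pending: Array<() => Promise<void>> = [];

  // Registers a cleanup to run later; returns immediately.
  enqueue(cleanup: () => Promise<void>): void {
    this.pending.push(cleanup);
  }

  // Runs all queued cleanups concurrently and returns how many failed.
  async drain(): Promise<number> {
    const batch = this.pending;
    this.pending = [];
    const results = await Promise.allSettled(batch.map((fn) => fn()));
    return results.filter((r) => r.status === 'rejected').length;
  }
}
```

A component would call `queue.enqueue(() => volume.cleanup())` in its finally block, and the worker would call `queue.drain()` on a timer. The trade-off is that tenant data stays on disk until the next drain, and the queue is lost if the worker crashes, so the orphan cleanup described earlier is still needed.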

When to Use Each Approach

Use Isolated Volumes When:

  • ✅ Running in DinD environment
  • ✅ Need multi-tenant isolation
  • ✅ Tool requires file-based config
  • ✅ Tool writes output files
  • ✅ Handling binary/large files

Use stdin/stdout When:

  • ✅ Tool supports stdin input
  • ✅ Single-tenant or dev environment
  • ✅ Small text-only inputs
  • ✅ Don’t need output files

Use File Mounts When:

  • ✅ NOT running in DinD (direct Docker)
  • ✅ Development/testing only
  • ✅ Quick prototyping

Migration Checklist

To migrate a component to use isolated volumes:
  • Import IsolatedContainerVolume
  • Get tenantId from context (use fallback for now)
  • Create volume instance: new IsolatedContainerVolume(tenantId, runId)
  • Replace file writes with volume.initialize({ files })
  • Replace volume mount with volume.getVolumeConfig()
  • Add finally block with volume.cleanup()
  • If tool writes outputs, use volume.readFiles() to retrieve them
  • Test in DinD environment