# Skill: WWS (Serverless Functions) Overview

## Description

Architecture and implementation guide for WWS (Web Worker Services), the serverless function layer for edge computing in the Mylder platform.

## What is WWS?

WWS is our serverless function layer that runs lightweight compute at the edge, reducing VPS load and improving response times. It is similar to Cloudflare Workers but platform-agnostic.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                          USER REQUEST                           │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                        EDGE LAYER (WWS)                         │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐     │
│   │   Auth   │   │   Rate   │   │  Model   │   │  Cache   │     │
│   │  Guard   │   │ Limiter  │   │  Router  │   │ Manager  │     │
│   └──────────┘   └──────────┘   └──────────┘   └──────────┘     │
└─────────────────────────────────────────────────────────────────┘
                                │
             ┌──────────────────┼──────────────────┐
             │                  │                  │
             ▼                  ▼                  ▼
      ┌──────────┐       ┌──────────┐       ┌──────────┐
      │synthetic │       │   z.ai   │       │  Origin  │
      │   .new   │       │(fallback)│       │   VPS    │
      └──────────┘       └──────────┘       └──────────┘
```

## Core Functions

### 1. Auth Guard

```javascript
// Edge authentication validation
export async function authGuard(request, env) {
  const token = request.headers.get('Authorization')?.split('Bearer ')[1];
  if (!token) {
    return { valid: false };
  }

  // Fast KV lookup
  const session = await env.SESSIONS.get(token);
  if (session) {
    return { valid: true, user: JSON.parse(session), source: 'cache' };
  }

  // Fallback to Supabase
  const supabaseUser = await validateWithSupabase(token, env);
  if (supabaseUser) {
    await env.SESSIONS.put(token, JSON.stringify(supabaseUser), {
      expirationTtl: 3600 // 1 hour
    });
    return { valid: true, user: supabaseUser, source: 'origin' };
  }

  return { valid: false };
}
```
### 2. Rate Limiter

```javascript
// Per-user, per-endpoint rate limiting
export async function rateLimiter(request, env) {
  const userId = request.headers.get('X-User-ID') || 'anonymous';
  const endpoint = new URL(request.url).pathname;
  const key = `rate:${userId}:${endpoint}`;

  const current = await env.RATE_LIMITS.get(key) || '0';
  const count = parseInt(current, 10);

  const limits = {
    '/api/ai/chat': 60,     // 60 requests per minute
    '/api/ai/generate': 20, // 20 per minute
    'default': 100          // 100 per minute
  };
  const limit = limits[endpoint] || limits.default;

  if (count >= limit) {
    return { allowed: false, retryAfter: 60, remaining: 0 };
  }

  await env.RATE_LIMITS.put(key, String(count + 1), { expirationTtl: 60 });
  return { allowed: true, remaining: limit - count - 1 };
}
```

### 3. Model Router

```javascript
// Intelligent AI model routing
export async function modelRouter(task, env) {
  const { type, complexity, context_length } = task;

  // Route based on task characteristics
  const routing = {
    // Primary: synthetic.new
    primary: {
      provider: 'synthetic.new',
      base_url: 'https://api.synthetic.new/openai/v1',
      models: {
        code: 'hf:deepseek-ai/DeepSeek-V3',
        reasoning: 'hf:moonshotai/Kimi-K2-Thinking'
      }
    },
    // Fallback: z.ai
    fallback: {
      provider: 'z.ai',
      base_url: 'https://api.z.ai/v1',
      models: {
        code: 'glm-4-flash',
        reasoning: 'glm-4-plus',
        long_context: 'glm-4-long'
      }
    }
  };

  // Special cases for z.ai
  if (context_length > 128000) {
    return {
      ...routing.fallback,
      model: routing.fallback.models.long_context,
      reason: 'Context exceeds 128K, using GLM-4-Long'
    };
  }

  // Default: synthetic.new
  const modelType = ['code', 'implementation', 'debugging'].includes(type)
    ? 'code'
    : 'reasoning';
  return {
    ...routing.primary,
    model: routing.primary.models[modelType],
    reason: `Standard ${modelType} task`
  };
}
```
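The routing thresholds can be checked with a trimmed, dependency-free version of `modelRouter` — same decision logic, provider tables inlined, no `env` parameter (a sketch, not the deployed function):

```javascript
// Trimmed router mirroring the logic above: contexts over 128K go to the
// fallback provider's long-context model; code-like task types get the
// code model, everything else the reasoning model.
function routeModel({ type, context_length }) {
  if (context_length > 128000) {
    return { provider: 'z.ai', model: 'glm-4-long' };
  }
  const isCode = ['code', 'implementation', 'debugging'].includes(type);
  return {
    provider: 'synthetic.new',
    model: isCode ? 'hf:deepseek-ai/DeepSeek-V3' : 'hf:moonshotai/Kimi-K2-Thinking'
  };
}

console.log(routeModel({ type: 'code', context_length: 4000 }).model);
// hf:deepseek-ai/DeepSeek-V3
console.log(routeModel({ type: 'analysis', context_length: 200000 }).provider);
// z.ai
```

Note that the long-context check wins even for code tasks, matching the order of the branches in `modelRouter`.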
### 4. Cache Manager

```javascript
// Stale-while-revalidate caching
export async function cacheManager(request, env, handler) {
  const cacheKey = new URL(request.url).pathname;

  // Check cache
  const cached = await env.RESPONSE_CACHE.get(cacheKey, { type: 'json' });
  if (cached) {
    const age = Date.now() - cached.timestamp;
    const maxAge = 60000;    // 1 minute
    const staleAge = 300000; // 5 minutes

    // Fresh cache
    if (age < maxAge) {
      return { data: cached.data, source: 'cache', age };
    }

    // Stale-while-revalidate
    if (age < staleAge) {
      // Return stale, revalidate in background
      // (waitUntil lives on the execution context; assumed attached as env.ctx)
      env.ctx.waitUntil(revalidate(cacheKey, handler, env));
      return { data: cached.data, source: 'stale', age };
    }
  }

  // No cache, fetch fresh
  const fresh = await handler();
  await env.RESPONSE_CACHE.put(cacheKey, JSON.stringify({
    data: fresh,
    timestamp: Date.now()
  }));
  return { data: fresh, source: 'origin', age: 0 };
}

async function revalidate(key, handler, env) {
  const fresh = await handler();
  await env.RESPONSE_CACHE.put(key, JSON.stringify({
    data: fresh,
    timestamp: Date.now()
  }));
}
```

## Deployment Options

### Option 1: Cloudflare Workers

```toml
# wrangler.toml
name = "mylder-wws"
main = "src/index.ts"
compatibility_date = "2024-12-01"

[[kv_namespaces]]
binding = "SESSIONS"
id = "xxx"

[[kv_namespaces]]
binding = "RATE_LIMITS"
id = "xxx"

[[kv_namespaces]]
binding = "RESPONSE_CACHE"
id = "xxx"
```

### Option 2: Deno Deploy

```typescript
// main.ts
import { serve } from "https://deno.land/std/http/server.ts";

serve(async (req) => {
  const url = new URL(req.url);
  if (url.pathname.startsWith('/api/ai/')) {
    return handleAIRequest(req);
  }
  return new Response('Not Found', { status: 404 });
}, { port: 8000 });
```

### Option 3: Self-Hosted (Docker)

```dockerfile
# Dockerfile.wws
FROM denoland/deno:1.38.0
WORKDIR /app
COPY . .
RUN deno cache main.ts
EXPOSE 8000
CMD ["deno", "run", "--allow-net", "--allow-env", "main.ts"]
```

## Integration Points

### n8n Workflow Integration

```json
{
  "name": "WWS Call",
  "type": "n8n-nodes-base.httpRequest",
  "parameters": {
    "method": "POST",
    "url": "https://wws.mylder.io/api/ai/chat",
    "headers": {
      "Authorization": "Bearer {{ $env.WWS_API_KEY }}",
      "X-Task-Type": "{{ $json.taskType }}"
    },
    "body": {
      "messages": "{{ $json.messages }}",
      "context": "{{ $json.context }}"
    }
  }
}
```

### Frontend Integration

```typescript
// lib/wws.ts
export async function callWWS(endpoint: string, data: any) {
  const response = await fetch(`${process.env.NEXT_PUBLIC_WWS_URL}${endpoint}`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${getSession().token}`
    },
    body: JSON.stringify(data)
  });

  // Handle rate limiting before parsing the body
  if (response.status === 429) {
    const retryAfter = response.headers.get('Retry-After');
    throw new RateLimitError(retryAfter);
  }

  return response.json();
}
```

## Latency Targets

| Operation | Target | Fallback |
|-----------|--------|----------|
| Auth validation | < 10ms | < 300ms (origin) |
| Rate limit check | < 5ms | N/A |
| Model routing | < 2ms | N/A |
| Cache hit | < 10ms | N/A |
| AI request (primary) | < 5s | < 10s (fallback) |

## Cost Model

```
Cloudflare Workers:
- Free: 100K requests/day
- Paid: $5/month + $0.50/million requests

KV Storage:
- Free: 100K reads/day, 1K writes/day
- Paid: $0.50/million reads, $5/million writes

Estimated (1M users/month):
- Workers: ~$5/month
- KV: ~$25/month
- Total: ~$30/month
```

## Related Skills

- `wws/edge-auth.md` - Detailed auth implementation
- `wws/rate-limit.md` - Rate limiting patterns
- `wws/model-router.md` - AI model selection
- `ai-providers/synthetic-new.md` - Primary AI provider
- `ai-providers/z-ai.md` - Fallback provider

## Token Budget

- Max input: 500 tokens
- Max output: 1200 tokens

## Model

- Recommended: sonnet (architecture
understanding)
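
The stale-while-revalidate window used by the Cache Manager above (fresh under 1 minute, served-stale-with-background-refresh under 5) reduces to a small age check. A dependency-free sketch of that decision, with the timestamps passed in explicitly instead of read from `Date.now()`:

```javascript
// Classifies a cache entry by age, mirroring cacheManager's branches:
// fresh under maxAge, served stale (with background refresh) under staleAge,
// refetched from origin otherwise.
function cacheState(entryTimestamp, now, maxAge = 60000, staleAge = 300000) {
  const age = now - entryTimestamp;
  if (age < maxAge) return 'cache';   // fresh: serve as-is
  if (age < staleAge) return 'stale'; // serve stale, revalidate in background
  return 'origin';                    // too old: fetch fresh
}

const t0 = 0;
console.log(cacheState(t0, 30000));  // cache
console.log(cacheState(t0, 120000)); // stale
console.log(cacheState(t0, 600000)); // origin
```

The `stale` band is what keeps p99 latency flat: users never wait on the revalidation, they only ever see cached data at most 5 minutes old.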