# Skill: AI Provider - z.ai

## Description

Fallback AI provider with GLM (General Language Model) support. Use when synthetic.new is unavailable or when GLM models are superior for a specific task.

## Status

**FALLBACK** - Use when:

1. synthetic.new returns rate-limit or server errors
2. GLM models outperform the alternatives for the task
3. New models become available earlier on z.ai

## Configuration

```yaml
provider: z.ai
base_url: https://api.z.ai/v1
api_key_env: Z_AI_API_KEY
compatibility: openai
rate_limit: 60 requests/minute
```

**Note:** The API key must be configured; generate one from the z.ai dashboard.
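
Since the config declares `compatibility: openai`, any OpenAI-style chat-completions client should work. Below is a minimal connectivity sketch; it assumes the standard OpenAI request/response shape, and nothing in it is z.ai-specific beyond the URL and key:

```javascript
// Minimal sketch, assuming an OpenAI-compatible /chat/completions
// endpoint as `compatibility: openai` implies.
async function pingZAI() {
  const response = await fetch('https://api.z.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.Z_AI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'glm-4-flash',
      messages: [{ role: 'user', content: 'Hello' }],
      max_tokens: 50
    })
  });

  if (!response.ok) {
    throw new Error(`z.ai request failed: ${response.status}`);
  }

  const data = await response.json();
  return data.choices[0].message.content;
}
```
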
## Available Models

### GLM-4-Plus (Reasoning & Analysis)

```json
{
  "model_id": "glm-4-plus",
  "best_for": ["reasoning", "analysis", "chinese_language", "long_context"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.5,
  "strengths": ["Chinese content", "Logical reasoning", "Long documents"]
}
```

**Use when:**

- Complex logical reasoning
- Chinese language content
- Long document analysis
- Comparative analysis

### GLM-4-Flash (Fast Responses)

```json
{
  "model_id": "glm-4-flash",
  "best_for": ["quick_responses", "simple_tasks", "high_volume"],
  "context_window": 32000,
  "max_output": 2048,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["Speed", "Cost efficiency", "Simple tasks"]
}
```

**Use when:**

- Quick classification
- Simple transformations
- High-volume processing
- Cost-sensitive operations

### GLM-4-Long (Extended Context)

```json
{
  "model_id": "glm-4-long",
  "best_for": ["long_documents", "codebase_analysis", "summarization"],
  "context_window": 1000000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["1M token context", "Document processing", "Code analysis"]
}
```

**Use when:**

- Entire codebase analysis
- Long document summarization
- Multi-file code review (see the sketch below)
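
The last use case can often be a single request: concatenate the files and let the 1M-token window absorb them. A sketch (the file-gathering logic, prompts, and endpoint shape are illustrative assumptions, not a prescribed workflow):

```javascript
// Sketch: multi-file code review in one glm-4-long call.
import { readFile } from 'node:fs/promises';

async function reviewFiles(paths) {
  // Label each file so the model can attribute its findings.
  const sources = await Promise.all(
    paths.map(async (p) => `// FILE: ${p}\n${await readFile(p, 'utf8')}`)
  );

  const response = await fetch('https://api.z.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.Z_AI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'glm-4-long',
      messages: [
        { role: 'system', content: 'Review the following files for defects.' },
        { role: 'user', content: sources.join('\n\n') }
      ],
      max_tokens: 4096,
      temperature: 0.3
    })
  });

  return (await response.json()).choices[0].message.content;
}
```
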

## Model Selection Logic

```javascript
function selectZAIModel(taskType, contextLength) {
  // Context-based selection
  if (contextLength > 128000) {
    return 'glm-4-long';
  }

  const modelMap = {
    // Fast operations
    'classification': 'glm-4-flash',
    'extraction': 'glm-4-flash',
    'simple_qa': 'glm-4-flash',

    // Complex reasoning
    'reasoning': 'glm-4-plus',
    'analysis': 'glm-4-plus',
    'planning': 'glm-4-plus',

    // Long context
    'codebase': 'glm-4-long',
    'summarization': 'glm-4-long',

    // Default
    'default': 'glm-4-plus'
  };

  return modelMap[taskType] || modelMap.default;
}
```
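
Note that context length takes precedence over task type:

```javascript
// Illustrative calls:
selectZAIModel('classification', 400000); // 'glm-4-long' (context wins)
selectZAIModel('classification', 2000);   // 'glm-4-flash'
selectZAIModel('unknown_task', 2000);     // 'glm-4-plus' (default)
```
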

## Fallback Logic

### When to Fallback from synthetic.new

```javascript
async function callWithFallback(systemPrompt, userPrompt, options = {}) {
  const primaryResult = await callSyntheticAI(systemPrompt, userPrompt, options);

  // Check for fallback conditions
  if (primaryResult.error) {
    const errorCode = primaryResult.error.code;

    // Rate limit or server error - fall back to z.ai
    if ([429, 500, 502, 503, 504].includes(errorCode)) {
      console.log('Falling back to z.ai');
      return await callZAI(systemPrompt, userPrompt, options);
    }
  }

  return primaryResult;
}
```

### GLM Superiority Conditions

```javascript
function shouldPreferGLM(task) {
  const glmSuperiorTasks = [
    'chinese_translation',
    'chinese_content',
    'million_token_context',
    'cost_optimization',
    'new_model_testing'
  ];

  return glmSuperiorTasks.includes(task.type);
}
```
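
Taken together, a router can send GLM-favoured tasks straight to z.ai and everything else through the primary-with-fallback path. A sketch built from the two helpers above:

```javascript
// Sketch: combine shouldPreferGLM with callWithFallback.
async function routeTask(task, systemPrompt, userPrompt, options = {}) {
  if (shouldPreferGLM(task)) {
    return callZAI(systemPrompt, userPrompt, options);
  }
  return callWithFallback(systemPrompt, userPrompt, options);
}
```
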

## n8n Integration

### HTTP Request Node Configuration

```json
{
  "method": "POST",
  "url": "https://api.z.ai/v1/chat/completions",
  "headers": {
    "Authorization": "Bearer {{ $env.Z_AI_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "model": "glm-4-plus",
    "messages": [
      { "role": "system", "content": "{{ systemPrompt }}" },
      { "role": "user", "content": "{{ userPrompt }}" }
    ],
    "max_tokens": 4000,
    "temperature": 0.5
  },
  "timeout": 90000
}
```

### Code Node Helper

```javascript
// z.ai request helper for an n8n Code node.
// Defined as an arrow function so `this` stays bound to the Code node
// sandbox, which exposes `this.helpers.httpRequest` for HTTP calls.
const callZAI = async (systemPrompt, userPrompt, options = {}) => {
  const {
    model = 'glm-4-plus',
    maxTokens = 4000,
    temperature = 0.5
  } = options;

  const response = await this.helpers.httpRequest({
    method: 'POST',
    url: 'https://api.z.ai/v1/chat/completions',
    headers: {
      'Authorization': `Bearer ${$env.Z_AI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: {
      model,
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: userPrompt }
      ],
      max_tokens: maxTokens,
      temperature
    },
    json: true
  });

  return response.choices[0].message.content;
};
```
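
A typical Code node body might wrap each incoming item with the helper above (the `text` field and summarizer prompt are illustrative):

```javascript
// Example usage inside the same Code node: summarize every input item.
const results = [];
for (const item of $input.all()) {
  const summary = await callZAI(
    'You are a concise summarizer.',
    item.json.text, // assumed field on the incoming items
    { model: 'glm-4-flash', maxTokens: 500 }
  );
  results.push({ json: { ...item.json, summary } });
}
return results;
```
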

## Comparison: synthetic.new vs z.ai

| Feature | synthetic.new | z.ai |
|---------|---------------|------|
| Primary Use | All tasks | Fallback + GLM tasks |
| Best Model (Code) | DeepSeek-V3 | GLM-4-Flash |
| Best Model (Reasoning) | Kimi-K2-Thinking | GLM-4-Plus |
| Max Context | 200K | 1M (GLM-4-Long) |
| Chinese Support | Good | Excellent |
| Rate Limit | 100/min | 60/min |
| Cost | Standard | Lower (Flash) |

## Setup Instructions

### 1. Get API Key

1. Visit https://z.ai/dashboard
2. Create an account or log in
3. Navigate to API Keys
4. Generate a new key
5. Store it as the `Z_AI_API_KEY` environment variable

### 2. Configure in Coolify

```bash
# Add to the service's environment variables
Z_AI_API_KEY=your_key_here
```

### 3. Test Connection

```bash
curl -X POST https://api.z.ai/v1/chat/completions \
  -H "Authorization: Bearer $Z_AI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4-flash",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 50
  }'
```

## Error Handling

| Error Code | Cause | Action |
|------------|-------|--------|
| 401 | Invalid API key | Check `Z_AI_API_KEY` |
| 429 | Rate limit exceeded | Wait and retry (see sketch below) |
| 400 | Invalid model | Check the model name |
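
For the 429 case, "wait and retry" can be exponential backoff around the `callZAI` helper. A sketch (attempt count, delays, and the error-detection check are assumptions to adapt to your HTTP client):

```javascript
// Sketch: retry rate-limited calls with exponential backoff (1s, 2s, 4s).
async function callZAIWithRetry(systemPrompt, userPrompt, options = {}, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await callZAI(systemPrompt, userPrompt, options);
    } catch (err) {
      // Assumes rate-limit failures surface "429" in the error text;
      // adjust to however your HTTP client reports status codes.
      const rateLimited = String(err).includes('429');
      if (!rateLimited || i === attempts - 1) throw err;
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** i));
    }
  }
}
```
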

## Related Skills

- `ai-providers/synthetic-new.md` - Primary provider
- `code/implement.md` - Code generation
- `design-thinking/ideate.md` - Solution brainstorming

## Token Budget

- Max input: 500 tokens
- Max output: 800 tokens

## Model

- Recommended: haiku (configuration lookup)