# Skill: AI Provider - z.ai

## Description
Fallback AI provider with GLM (General Language Model) support. Use when synthetic.new is unavailable or when GLM models are superior for specific tasks.
## Status

**FALLBACK** - use when:
- synthetic.new returns rate-limit or server errors
- GLM models outperform the alternatives for the task
- A new model is available on z.ai before other providers
## Configuration

```yaml
provider: z.ai
base_url: https://api.z.ai/v1
api_key_env: Z_AI_API_KEY
compatibility: openai
rate_limit: 60 requests/minute
```
Note: the API key must be configured before use; generate one from the z.ai dashboard (see Setup Instructions below).
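For a quick sanity check outside n8n, the same settings can be materialized as a plain object; a minimal sketch, assuming a Node.js environment where `Z_AI_API_KEY` has been exported:

```javascript
// Minimal sketch: the configuration above as a plain Node.js object.
// Assumes Z_AI_API_KEY has been exported in the environment.
const zaiConfig = {
  provider: 'z.ai',
  baseUrl: 'https://api.z.ai/v1',
  apiKey: process.env.Z_AI_API_KEY,
  compatibility: 'openai',
  rateLimitPerMinute: 60,
};

if (!zaiConfig.apiKey) {
  throw new Error('Z_AI_API_KEY is not set; see Setup Instructions below');
}
```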
## Available Models

### GLM-4-Plus (Reasoning & Analysis)
```json
{
  "model_id": "glm-4-plus",
  "best_for": ["reasoning", "analysis", "chinese_language", "long_context"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.5,
  "strengths": ["Chinese content", "Logical reasoning", "Long documents"]
}
```
Use when:
- Complex logical reasoning
- Chinese language content
- Long document analysis
- Comparative analysis (example below)
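For example, a comparative-analysis call at the recommended temperature might look like this; a sketch using the `callZAI` helper defined under Code Node Helper below (the prompts and the `proposalA`/`proposalB` variables are hypothetical):

```javascript
// Hypothetical comparative-analysis call; inputs stand in for real data.
const analysis = await callZAI(
  'You are an analyst. Compare the two proposals and justify a ranking.',
  `Proposal A:\n${proposalA}\n\nProposal B:\n${proposalB}`,
  { model: 'glm-4-plus', temperature: 0.5 }
);
```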
### GLM-4-Flash (Fast Responses)
```json
{
  "model_id": "glm-4-flash",
  "best_for": ["quick_responses", "simple_tasks", "high_volume"],
  "context_window": 32000,
  "max_output": 2048,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["Speed", "Cost efficiency", "Simple tasks"]
}
```
Use when:
- Quick classification
- Simple transformations
- High-volume processing
- Cost-sensitive operations (example below)
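A high-volume classification loop is the typical shape; a sketch using the `callZAI` helper defined below (the label set and the `tickets` array are hypothetical):

```javascript
// Hypothetical high-volume classification at low temperature for consistency.
// Sequential awaits keep the sketch simple and respect the rate limit.
const labels = [];
for (const ticket of tickets) {
  const label = await callZAI(
    'Classify the support ticket as: billing, bug, or feature_request. Reply with the label only.',
    ticket.text,
    { model: 'glm-4-flash', maxTokens: 16, temperature: 0.3 }
  );
  labels.push(label.trim());
}
```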
### GLM-4-Long (Extended Context)
```json
{
  "model_id": "glm-4-long",
  "best_for": ["long_documents", "codebase_analysis", "summarization"],
  "context_window": 1000000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["1M token context", "Document processing", "Code analysis"]
}
```
Use when:
- Entire codebase analysis
- Long document summarization
- Multi-file code review (example below)
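Because the 1M-token window usually fits an entire repository, files can be concatenated into one prompt instead of chunked; a sketch with `callZAI` (the `files` array is hypothetical):

```javascript
// Hypothetical whole-codebase review in a single request.
const codebase = files
  .map(f => `--- ${f.path} ---\n${f.content}`)
  .join('\n\n');

const review = await callZAI(
  'Review this codebase for cross-file inconsistencies and dead code.',
  codebase,
  { model: 'glm-4-long', temperature: 0.3 }
);
```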
## Model Selection Logic
```javascript
// Pick a z.ai model from the task type and the prompt's context length.
function selectZAIModel(taskType, contextLength) {
  // Context-based selection: anything beyond GLM-4-Plus's 128K window
  // must go to the long-context model regardless of task type.
  if (contextLength > 128000) {
    return 'glm-4-long';
  }

  const modelMap = {
    // Fast operations
    'classification': 'glm-4-flash',
    'extraction': 'glm-4-flash',
    'simple_qa': 'glm-4-flash',
    // Complex reasoning
    'reasoning': 'glm-4-plus',
    'analysis': 'glm-4-plus',
    'planning': 'glm-4-plus',
    // Long context
    'codebase': 'glm-4-long',
    'summarization': 'glm-4-long',
    // Default
    'default': 'glm-4-plus'
  };

  return modelMap[taskType] || modelMap.default;
}
```
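A few sample calls make the mapping concrete:

```javascript
selectZAIModel('classification', 2000);   // 'glm-4-flash'
selectZAIModel('reasoning', 50000);       // 'glm-4-plus'
selectZAIModel('summarization', 10000);   // 'glm-4-long'
selectZAIModel('unknown_task', 8000);     // 'glm-4-plus' (default)
selectZAIModel('reasoning', 300000);      // 'glm-4-long' (context overrides task type)
```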
## Fallback Logic

### When to Fall Back from synthetic.new
```javascript
// Try the primary provider first; fall back to z.ai on rate limits or
// server errors. callSyntheticAI is defined in the synthetic.new skill.
async function callWithFallback(systemPrompt, userPrompt, options = {}) {
  const primaryResult = await callSyntheticAI(systemPrompt, userPrompt, options);

  // Check for fallback conditions
  if (primaryResult.error) {
    const errorCode = primaryResult.error.code;

    // Rate limit or server error - fall back to z.ai
    if ([429, 500, 502, 503, 504].includes(errorCode)) {
      console.log('Falling back to z.ai');
      return await callZAI(systemPrompt, userPrompt, options);
    }
  }

  return primaryResult;
}
```
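Call sites use the wrapper exactly as they would the primary helper; a hypothetical invocation (`sourceCode` stands in for real input):

```javascript
// Options pass through unchanged to whichever provider answers.
const review = await callWithFallback(
  'You are a code reviewer.',
  `Review this function for bugs:\n${sourceCode}`,
  { temperature: 0.3 }
);
```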
### GLM Superiority Conditions
```javascript
// Tasks where GLM models are preferred even when synthetic.new is healthy.
function shouldPreferGLM(task) {
  const glmSuperiorTasks = [
    'chinese_translation',
    'chinese_content',
    'million_token_context',
    'cost_optimization',
    'new_model_testing'
  ];
  return glmSuperiorTasks.includes(task.type);
}
```
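Combining the two checks gives one routing rule for all calls; a sketch built from the helpers in this document and the synthetic.new skill:

```javascript
// Route a task: GLM-superior categories go straight to z.ai; everything
// else goes to synthetic.new with automatic fallback (see above).
async function routeTask(task, systemPrompt, userPrompt, options = {}) {
  if (shouldPreferGLM(task)) {
    return callZAI(systemPrompt, userPrompt, options);
  }
  return callWithFallback(systemPrompt, userPrompt, options);
}
```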
## n8n Integration

### HTTP Request Node Configuration
```json
{
  "method": "POST",
  "url": "https://api.z.ai/v1/chat/completions",
  "headers": {
    "Authorization": "Bearer {{ $env.Z_AI_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "model": "glm-4-plus",
    "messages": [
      { "role": "system", "content": "{{ systemPrompt }}" },
      { "role": "user", "content": "{{ userPrompt }}" }
    ],
    "max_tokens": 4000,
    "temperature": 0.5
  },
  "timeout": 90000
}
```
### Code Node Helper
```javascript
// z.ai request helper for an n8n Code node.
// Declared as an arrow function so `this` stays bound to the Code-node
// context, where n8n exposes this.helpers.httpRequest for outbound calls.
const callZAI = async (systemPrompt, userPrompt, options = {}) => {
  const {
    model = 'glm-4-plus',
    maxTokens = 4000,
    temperature = 0.5
  } = options;

  const response = await this.helpers.httpRequest({
    method: 'POST',
    url: 'https://api.z.ai/v1/chat/completions',
    headers: {
      'Authorization': `Bearer ${$env.Z_AI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: {
      model,
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: userPrompt }
      ],
      max_tokens: maxTokens,
      temperature
    },
    json: true // serialize the body and parse the JSON response
  });

  return response.choices[0].message.content;
};
```
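A hypothetical Code-node body built on the helper; n8n Code nodes must return an array of items, and `$json.document` is an assumed input field:

```javascript
// Summarize the incoming item's document field and emit one output item.
const summary = await callZAI(
  'Summarize the document in three bullet points.',
  $json.document,
  { model: 'glm-4-flash', maxTokens: 500, temperature: 0.3 }
);

return [{ json: { summary } }];
```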
## Comparison: synthetic.new vs z.ai
| Feature | synthetic.new | z.ai |
|---|---|---|
| Primary Use | All tasks | Fallback + GLM tasks |
| Best Model (Code) | DeepSeek-V3 | GLM-4-Flash |
| Best Model (Reasoning) | Kimi-K2-Thinking | GLM-4-Plus |
| Max Context | 200K | 1M (GLM-4-Long) |
| Chinese Support | Good | Excellent |
| Rate Limit | 100/min | 60/min |
| Cost | Standard | Lower (Flash) |
## Setup Instructions

### 1. Get API Key
- Visit https://z.ai/dashboard
- Create an account or log in
- Navigate to API Keys
- Generate a new key
- Store it as the `Z_AI_API_KEY` environment variable
### 2. Configure in Coolify

```bash
# Add to service environment variables
Z_AI_API_KEY=your_key_here
```
### 3. Test Connection

```bash
curl -X POST https://api.z.ai/v1/chat/completions \
  -H "Authorization: Bearer $Z_AI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4-flash",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 50
  }'
```
## Error Handling
| Error Code | Cause | Action |
|---|---|---|
| 401 | Invalid API key | Check Z_AI_API_KEY |
| 429 | Rate limit | Wait and retry |
| 400 | Invalid model | Check model name |
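The 429 action can be automated with a small retry wrapper; a sketch, assuming errors surface as a numeric `code` the way the fallback logic above expects:

```javascript
// Retry only rate limits (429), with exponential backoff: 1s, 2s, 4s, ...
// 400 and 401 indicate configuration problems, so they return immediately.
async function withRetry(fn, { retries = 3, baseDelayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const result = await fn();
    if (!result.error || result.error.code !== 429 || attempt >= retries) {
      return result;
    }
    await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt));
  }
}
```

Usage: `await withRetry(() => callZAI(systemPrompt, userPrompt))`.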
## Related Skills

- `ai-providers/synthetic-new.md` - Primary provider
- `code/implement.md` - Code generation
- `design-thinking/ideate.md` - Solution brainstorming
## Token Budget
- Max input: 500 tokens
- Max output: 800 tokens
## Model
- Recommended: haiku (configuration lookup)