# Skill: AI Provider - z.ai

## Description

Fallback AI provider with GLM (General Language Model) support. Use when synthetic.new is unavailable or when a GLM model is better suited to the task.

## Status

**FALLBACK** - Use when:

1. synthetic.new returns rate-limit or server errors
2. A GLM model outperforms the alternatives for the task
3. A new model is available on z.ai before other providers

## Configuration

```yaml
provider: z.ai
base_url: https://api.z.ai/v1
api_key_env: Z_AI_API_KEY
compatibility: openai
rate_limit: 60 requests/minute
```

**Note:** The API key must be configured before use; generate one from the z.ai dashboard (see Setup Instructions below).

## Available Models

### GLM-4-Plus (Reasoning & Analysis)

```json
{
  "model_id": "glm-4-plus",
  "best_for": ["reasoning", "analysis", "chinese_language", "long_context"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.5,
  "strengths": ["Chinese content", "Logical reasoning", "Long documents"]
}
```

**Use when:**

- Complex logical reasoning
- Chinese language content
- Long document analysis
- Comparative analysis

### GLM-4-Flash (Fast Responses)

```json
{
  "model_id": "glm-4-flash",
  "best_for": ["quick_responses", "simple_tasks", "high_volume"],
  "context_window": 32000,
  "max_output": 2048,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["Speed", "Cost efficiency", "Simple tasks"]
}
```

**Use when:**

- Quick classification
- Simple transformations
- High-volume processing
- Cost-sensitive operations

### GLM-4-Long (Extended Context)

```json
{
  "model_id": "glm-4-long",
  "best_for": ["long_documents", "codebase_analysis", "summarization"],
  "context_window": 1000000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["1M token context", "Document processing", "Code analysis"]
}
```

**Use when:**

- Entire codebase analysis
- Long document summarization
- Multi-file code review

## Model Selection Logic

```javascript
function selectZAIModel(taskType, contextLength) {
  // Context-based selection: anything beyond GLM-4-Plus's 128K window
  // has to go to GLM-4-Long.
  if (contextLength > 128000) {
    return 'glm-4-long';
  }

  const modelMap = {
    // Fast operations
    'classification': 'glm-4-flash',
    'extraction': 'glm-4-flash',
    'simple_qa': 'glm-4-flash',

    // Complex reasoning
    'reasoning': 'glm-4-plus',
    'analysis': 'glm-4-plus',
    'planning': 'glm-4-plus',

    // Long context
    'codebase': 'glm-4-long',
    'summarization': 'glm-4-long',

    // Default
    'default': 'glm-4-plus'
  };

  return modelMap[taskType] || modelMap.default;
}
```

## Fallback Logic

### When to Fallback from synthetic.new

```javascript
async function callWithFallback(systemPrompt, userPrompt, options = {}) {
  const primaryResult = await callSyntheticAI(systemPrompt, userPrompt, options);

  // Check for fallback conditions
  if (primaryResult.error) {
    const errorCode = primaryResult.error.code;

    // Rate limit or server error - fallback to z.ai
    if ([429, 500, 502, 503, 504].includes(errorCode)) {
      console.log('Falling back to z.ai');
      return await callZAI(systemPrompt, userPrompt, options);
    }
  }

  return primaryResult;
}
```

### GLM Superiority Conditions

```javascript
function shouldPreferGLM(task) {
  const glmSuperiorTasks = [
    'chinese_translation',
    'chinese_content',
    'million_token_context',
    'cost_optimization',
    'new_model_testing'
  ];

  return glmSuperiorTasks.includes(task.type);
}
```
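### Putting It Together

For reference, the pieces above can be composed into a single routing entry point. The sketch below is illustrative only: `routeTask` and the shape of the `task` object are assumptions of this example, and it presumes `callSyntheticAI` (from the synthetic.new skill) and `callZAI` (defined under n8n Integration below) are in scope.

```javascript
// Sketch: one entry point combining the GLM-preference check,
// model selection, and fallback logic defined above.
// `routeTask` and the `task` shape are hypothetical, not part of this skill's API.
async function routeTask(task) {
  // GLM-superior task types go straight to z.ai with an explicit model.
  if (shouldPreferGLM(task)) {
    const model = selectZAIModel(task.type, task.contextLength || 0);
    return await callZAI(task.systemPrompt, task.userPrompt, { ...task.options, model });
  }

  // Everything else tries synthetic.new first and falls back to z.ai
  // on rate-limit or server errors.
  return await callWithFallback(task.systemPrompt, task.userPrompt, task.options || {});
}
```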
"messages": [ { "role": "system", "content": "{{ systemPrompt }}" }, { "role": "user", "content": "{{ userPrompt }}" } ], "max_tokens": 4000, "temperature": 0.5 }, "timeout": 90000 } ``` ### Code Node Helper ```javascript // z.ai Request Helper for n8n Code Node async function callZAI(systemPrompt, userPrompt, options = {}) { const { model = 'glm-4-plus', maxTokens = 4000, temperature = 0.5 } = options; const response = await $http.request({ method: 'POST', url: 'https://api.z.ai/v1/chat/completions', headers: { 'Authorization': `Bearer ${$env.Z_AI_API_KEY}`, 'Content-Type': 'application/json' }, body: { model, messages: [ { role: 'system', content: systemPrompt }, { role: 'user', content: userPrompt } ], max_tokens: maxTokens, temperature } }); return response.choices[0].message.content; } ``` ## Comparison: synthetic.new vs z.ai | Feature | synthetic.new | z.ai | |---------|---------------|------| | Primary Use | All tasks | Fallback + GLM tasks | | Best Model (Code) | DeepSeek-V3 | GLM-4-Flash | | Best Model (Reasoning) | Kimi-K2-Thinking | GLM-4-Plus | | Max Context | 200K | 1M (GLM-4-Long) | | Chinese Support | Good | Excellent | | Rate Limit | 100/min | 60/min | | Cost | Standard | Lower (Flash) | ## Setup Instructions ### 1. Get API Key 1. Visit https://z.ai/dashboard 2. Create account or login 3. Navigate to API Keys 4. Generate new key 5. Store as `Z_AI_API_KEY` environment variable ### 2. Configure in Coolify ```bash # Add to service environment variables Z_AI_API_KEY=your_key_here ``` ### 3. Test Connection ```bash curl -X POST https://api.z.ai/v1/chat/completions \ -H "Authorization: Bearer $Z_AI_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "glm-4-flash", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 50 }' ``` ## Error Handling | Error Code | Cause | Action | |------------|-------|--------| | 401 | Invalid API key | Check Z_AI_API_KEY | | 429 | Rate limit | Wait and retry | | 400 | Invalid model | Check model name | ## Related Skills - `ai-providers/synthetic-new.md` - Primary provider - `code/implement.md` - Code generation - `design-thinking/ideate.md` - Solution brainstorming ## Token Budget - Max input: 500 tokens - Max output: 800 tokens ## Model - Recommended: haiku (configuration lookup)