Compare commits

...

2 Commits

Author SHA1 Message Date
4c6ec6f10d Update z.ai skill with latest GLM models (Dec 2025)
- Add GLM-4.6 (flagship, 200K context, agentic)
- Add GLM-4.6V and GLM-4.6V-Flash (vision models)
- Add GLM-4.5, GLM-4.5-Air, GLM-4.5-Flash
- Add GLM-Z1-Rumination-32B (deep reasoning)
- Update model selection logic and pricing
- Add references to official documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 19:45:30 +01:00
d04a7439a8 Update z.ai skill with correct BigModel API URL
- Update base_url to https://open.bigmodel.cn/api/paas/v4
- Mark API key as configured

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 19:29:28 +01:00

View File

@@ -1,108 +1,208 @@
# Skill: AI Provider - z.ai
## Description
Fallback AI provider with GLM (General Language Model) support from Zhipu AI. Use when synthetic.new is unavailable or when GLM models are superior for specific tasks.
## Status
**FALLBACK** - Use when:
1. synthetic.new rate limits or errors
2. GLM models outperform alternatives for the task
3. New models available earlier on z.ai
4. Extended context (200K+) needed
5. Vision/multimodal tasks required
## Configuration
```yaml
provider: z.ai (Zhipu AI / BigModel)
base_url: https://open.bigmodel.cn/api/paas/v4
api_key_env: Z_AI_API_KEY
compatibility: openai
rate_limit: 60 requests/minute
```
**API Key configured:** `Z_AI_API_KEY` in environment variables.
## Available Models (Updated Dec 2025)
### GLM-4.6 (Flagship - Latest)
```json
{
"model_id": "glm-4.6",
"best_for": ["agentic", "reasoning", "coding", "frontend_dev"],
"context_window": 202752,
"max_output": 128000,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.5,
"strengths": ["200K context", "Tool use", "Agent workflows", "15% more token efficient"],
"pricing": { "input": "$0.40/M", "output": "$1.75/M" },
"released": "2025-09-30"
}
```
**Use when:**
- Complex agentic tasks
- Advanced reasoning
- Frontend/UI development
- Tool-calling workflows
- Extended context needs (200K)
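Given the pricing listed above ($0.40/M input, $1.75/M output), per-call cost is easy to estimate. A minimal sketch; `estimateGlm46CostUSD` is a hypothetical helper, not part of any SDK:

```javascript
// Estimated cost of one GLM-4.6 call, using the per-million-token prices
// from the pricing field above. Verify rates against the current price list.
function estimateGlm46CostUSD(inputTokens, outputTokens) {
  const INPUT_PER_M = 0.40;  // $ per 1M input tokens
  const OUTPUT_PER_M = 1.75; // $ per 1M output tokens
  return (inputTokens / 1e6) * INPUT_PER_M + (outputTokens / 1e6) * OUTPUT_PER_M;
}
```

For example, a 150K-token prompt with a 4K-token reply comes to roughly $0.067.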
### GLM-4.6V (Vision - Latest)
```json
{
"model_id": "glm-4.6v",
"best_for": ["image_analysis", "multimodal", "document_processing", "video_understanding"],
"context_window": 128000,
"max_output": 4096,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.3,
"strengths": ["Native tool calling", "150 pages/1hr video input", "SOTA vision understanding"],
"parameters": "106B",
"released": "2025-12-08"
}
```
**Use when:**
- Image analysis and understanding
- Document OCR and processing
- Video content analysis
- Multimodal reasoning
### GLM-4.6V-Flash (Vision - Lightweight)
```json
{
"model_id": "glm-4.6v-flash",
"best_for": ["fast_image_analysis", "local_deployment", "low_latency"],
"context_window": 128000,
"max_output": 4096,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.3,
"strengths": ["9B parameters", "Fast inference", "Local deployable"],
"parameters": "9B",
"released": "2025-12-08"
}
```
**Use when:**
- Quick image classification
- Edge/local deployment
- Low-latency vision tasks
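The split between the two vision models follows the same pattern as the selection logic later in this skill: Flash for latency-sensitive or small-context work, the flagship otherwise. A sketch (function name and the 64K threshold are illustrative):

```javascript
// Pick a GLM vision model: flagship for large inputs,
// Flash when latency matters or the context is small.
function pickVisionModel(contextLength, latencySensitive = false) {
  if (latencySensitive) return 'glm-4.6v-flash';
  return contextLength > 64000 ? 'glm-4.6v' : 'glm-4.6v-flash';
}
```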
### GLM-4.5 (Previous Flagship)
```json
{
"model_id": "glm-4.5",
"best_for": ["reasoning", "tool_use", "coding", "agents"],
"context_window": 128000,
"max_output": 4096,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.5,
"strengths": ["355B MoE", "32B active params", "Proven stability"],
"parameters": "355B (32B active)"
}
```
**Use when:**
- Need proven stable model
- Standard reasoning tasks
- Backward compatibility
### GLM-4.5-Air (Efficient)
```json
{
"model_id": "glm-4.5-air",
"best_for": ["cost_efficient", "standard_tasks", "high_volume"],
"context_window": 128000,
"max_output": 4096,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.5,
"strengths": ["106B MoE", "12B active params", "Cost effective"],
"parameters": "106B (12B active)"
}
```
**Use when:**
- Cost-sensitive operations
- High-volume processing
- Standard quality acceptable
### GLM-4.5-Flash (Fast)
```json
{
"model_id": "glm-4.5-flash",
"best_for": ["ultra_fast", "simple_tasks", "streaming"],
"context_window": 32000,
"max_output": 2048,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.3,
"strengths": ["Fastest inference", "Lowest cost", "Simple tasks"]
}
```
**Use when:**
- Real-time responses needed
- Simple classification/extraction
- Budget constraints
### GLM-Z1-Rumination-32B (Deep Reasoning)
```json
{
"model_id": "glm-z1-rumination-32b-0414",
"best_for": ["deep_reasoning", "complex_analysis", "deliberation"],
"context_window": 128000,
"max_output": 4096,
"temperature_range": [0.0, 1.0],
"recommended_temp": 0.7,
"strengths": ["Rumination capability", "Step-by-step reasoning", "Complex problems"],
"released": "2025-04-14"
}
```
**Use when:**
- Complex multi-step reasoning
- Problems requiring deliberation
- Chain-of-thought tasks
## Model Selection Logic
```javascript
function selectZAIModel(taskType, contextLength, needsVision = false) {
// Vision tasks
if (needsVision) {
return contextLength > 64000 ? 'glm-4.6v' : 'glm-4.6v-flash';
}
// Context-based selection
if (contextLength > 128000) {
return 'glm-4.6'; // 200K context
}
const modelMap = {
// Flagship tasks
'agentic': 'glm-4.6',
'frontend': 'glm-4.6',
'tool_use': 'glm-4.6',
// Deep reasoning
'deep_reasoning': 'glm-z1-rumination-32b-0414',
'deliberation': 'glm-z1-rumination-32b-0414',
// Standard reasoning
'reasoning': 'glm-4.5',
'analysis': 'glm-4.5',
'planning': 'glm-4.5',
'coding': 'glm-4.5',
// Cost-efficient
'cost_efficient': 'glm-4.5-air',
'high_volume': 'glm-4.5-air',
// Fast operations
'classification': 'glm-4.5-flash',
'extraction': 'glm-4.5-flash',
'simple_qa': 'glm-4.5-flash',
'streaming': 'glm-4.5-flash',
// Default to flagship
'default': 'glm-4.6'
};
return modelMap[taskType] || modelMap.default;
@@ -122,7 +222,7 @@ async function callWithFallback(systemPrompt, userPrompt, options = {}) {
// Rate limit or server error - fallback to z.ai
if ([429, 500, 502, 503, 504].includes(errorCode)) {
console.log('Falling back to z.ai GLM-4.6');
return await callZAI(systemPrompt, userPrompt, options);
}
}
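The diff above shows only the error branch of `callWithFallback`. A self-contained sketch of the whole wrapper, with the provider calls injected so the control flow can be exercised without real HTTP (the `err.statusCode` error shape is an assumption):

```javascript
// Try the primary provider first; reroute to z.ai only on
// rate-limit/server errors. Providers are passed in for testability.
async function callWithFallback(systemPrompt, userPrompt, options, providers) {
  try {
    return await providers.primary(systemPrompt, userPrompt, options);
  } catch (err) {
    const errorCode = err.statusCode; // assumed error shape
    if ([429, 500, 502, 503, 504].includes(errorCode)) {
      console.log('Falling back to z.ai GLM-4.6');
      return await providers.fallback(systemPrompt, userPrompt, options);
    }
    throw err; // 4xx client errors should surface, not silently reroute
  }
}
```

Re-throwing non-retryable errors matters: a 401 from the primary provider usually means a config problem that a fallback would only mask.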
@@ -137,9 +237,12 @@ function shouldPreferGLM(task) {
const glmSuperiorTasks = [
'chinese_translation',
'chinese_content',
'extended_context_200k',
'vision_analysis',
'multimodal',
'frontend_development',
'deep_rumination',
'cost_optimization'
];
return glmSuperiorTasks.includes(task.type);
@@ -152,13 +255,13 @@ function shouldPreferGLM(task) {
```json
{
"method": "POST",
"url": "https://open.bigmodel.cn/api/paas/v4/chat/completions",
"headers": {
"Authorization": "Bearer {{ $env.Z_AI_API_KEY }}",
"Content-Type": "application/json"
},
"body": {
"model": "glm-4.6",
"messages": [
{ "role": "system", "content": "{{ systemPrompt }}" },
{ "role": "user", "content": "{{ userPrompt }}" }
@@ -175,14 +278,14 @@ function shouldPreferGLM(task) {
// z.ai Request Helper for n8n Code Node
async function callZAI(systemPrompt, userPrompt, options = {}) {
const {
model = 'glm-4.6',
maxTokens = 4000,
temperature = 0.5
} = options;
const response = await $http.request({
method: 'POST',
url: 'https://open.bigmodel.cn/api/paas/v4/chat/completions',
headers: {
'Authorization': `Bearer ${$env.Z_AI_API_KEY}`,
'Content-Type': 'application/json'
@@ -207,17 +310,35 @@ async function callZAI(systemPrompt, userPrompt, options = {}) {
| Feature | synthetic.new | z.ai |
|---------|---------------|------|
| Primary Use | All tasks | Fallback + GLM tasks |
| Best Model (Code) | DeepSeek-V3 | GLM-4.6 |
| Best Model (Reasoning) | Kimi-K2-Thinking | GLM-Z1-Rumination |
| Best Model (Vision) | N/A | GLM-4.6V |
| Max Context | 200K | 200K (GLM-4.6) |
| Chinese Support | Good | Excellent |
| Rate Limit | 100/min | 60/min |
| Cost (Input) | ~$0.50/M | $0.40/M (GLM-4.6) |
| Open Source | No | Yes (MIT) |
## Model Hierarchy (Recommended)
```
Task Complexity:
HIGH → GLM-Z1-Rumination (deep reasoning)
→ GLM-4.6 (agentic, coding)
→ GLM-4.6V (vision tasks)
MEDIUM → GLM-4.5 (standard tasks)
→ GLM-4.5-Air (cost-efficient)
LOW → GLM-4.5-Flash (fast, simple)
→ GLM-4.6V-Flash (fast vision)
```
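The hierarchy above can be collapsed into a small helper. A sketch under the assumptions of that table (tier names and the default are illustrative, not an official API):

```javascript
// Map a coarse complexity tier to the recommended GLM model,
// following the Model Hierarchy table above.
function modelForTier(tier, needsVision = false) {
  const tiers = {
    high: needsVision ? 'glm-4.6v' : 'glm-4.6',
    medium: 'glm-4.5',
    low: needsVision ? 'glm-4.6v-flash' : 'glm-4.5-flash'
  };
  return tiers[tier] ?? 'glm-4.6'; // unknown tier: default to the flagship
}
```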
## Setup Instructions
### 1. Get API Key
1. Visit https://z.ai/dashboard or https://open.bigmodel.cn
2. Create account or login
3. Navigate to API Keys
4. Generate new key
@@ -226,16 +347,16 @@ async function callZAI(systemPrompt, userPrompt, options = {}) {
### 2. Configure in Coolify
```bash
# Add to service environment variables (use your own key; never commit a real key)
Z_AI_API_KEY=your_key_here
```
### 3. Test Connection
```bash
curl -X POST https://open.bigmodel.cn/api/paas/v4/chat/completions \
-H "Authorization: Bearer $Z_AI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "glm-4.6",
"messages": [{"role": "user", "content": "Hello"}],
"max_tokens": 50
}'
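The same request the curl command sends can be assembled programmatically. A sketch of a builder that only constructs the payload, so it can be checked without network access (`buildZAIRequest` is a hypothetical name):

```javascript
// Build the OpenAI-compatible chat-completions request for the BigModel
// endpoint. Returns a plain object; sending it (fetch, $http.request,
// curl) is left to the caller.
function buildZAIRequest(apiKey, userContent, model = 'glm-4.6', maxTokens = 50) {
  return {
    method: 'POST',
    url: 'https://open.bigmodel.cn/api/paas/v4/chat/completions',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: userContent }],
      max_tokens: maxTokens
    })
  };
}
```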
@@ -248,6 +369,12 @@ curl -X POST https://api.z.ai/v1/chat/completions \
| 429 | Rate limit | Wait and retry |
| 400 | Invalid model | Check model name |
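For the 429 case, "wait and retry" can be implemented with simple exponential backoff. A sketch; the `err.statusCode` error shape and the delay schedule are assumptions, not part of the z.ai API:

```javascript
// Retry an async request on 429 with exponential backoff (1s, 2s, 4s, ...).
// Non-429 errors and the final failed attempt are re-thrown to the caller.
async function withRetry(requestFn, maxAttempts = 3, baseDelayMs = 1000) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await requestFn();
    } catch (err) {
      const isLastAttempt = attempt === maxAttempts - 1;
      if (err.statusCode !== 429 || isLastAttempt) throw err;
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}
```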
## References
- [GLM-4.6 Announcement](https://z.ai/blog/glm-4.6)
- [GLM-4.6V Multimodal](https://z.ai/blog/glm-4.6v)
- [OpenRouter GLM-4.6](https://openrouter.ai/z-ai/glm-4.6)
- [Hugging Face Models](https://huggingface.co/zai-org)
## Related Skills
- `ai-providers/synthetic-new.md` - Primary provider
- `code/implement.md` - Code generation