# Skill: AI Provider - z.ai

## Description

Fallback AI provider with GLM (General Language Model) support from Zhipu AI. Use when synthetic.new is unavailable or when GLM models are superior for specific tasks.

## Status

**FALLBACK** - Use when:

1. synthetic.new rate limits or errors
2. GLM models outperform alternatives for the task
3. New models available earlier on z.ai
4. Extended context (200K+) needed
5. Vision/multimodal tasks required

## Configuration

```yaml
provider: z.ai  # Zhipu AI / BigModel
base_url: https://open.bigmodel.cn/api/paas/v4
api_key_env: Z_AI_API_KEY
compatibility: openai
rate_limit: 60 requests/minute
```

**API Key configured:** `Z_AI_API_KEY` in environment variables.

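Because the endpoint is OpenAI-compatible, any OpenAI-style client can target it by overriding the base URL. A minimal sketch of building such a request; `buildChatRequest` is an illustrative helper, not part of any official SDK:

```javascript
// Assemble an OpenAI-compatible chat-completions request for the
// z.ai endpoint. Hypothetical helper for illustration only.
function buildChatRequest(model, systemPrompt, userPrompt,
                          { maxTokens = 1024, temperature = 0.5 } = {}) {
  return {
    url: 'https://open.bigmodel.cn/api/paas/v4/chat/completions',
    headers: {
      'Authorization': `Bearer ${process.env.Z_AI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: {
      model,
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: userPrompt }
      ],
      max_tokens: maxTokens,
      temperature
    }
  };
}

const req = buildChatRequest('glm-4.6', 'You are helpful.', 'Hello');
```

The returned object can be passed to any HTTP client (`fetch`, n8n's `$http.request`, etc.).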
## Available Models (Updated Dec 2025)

### GLM-4.6 (Flagship - Latest)

```json
{
  "model_id": "glm-4.6",
  "best_for": ["agentic", "reasoning", "coding", "frontend_dev"],
  "context_window": 202752,
  "max_output": 128000,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.5,
  "strengths": ["200K context", "Tool use", "Agent workflows", "15% more token efficient"],
  "pricing": { "input": "$0.40/M", "output": "$1.75/M" },
  "released": "2025-09-30"
}
```

**Use when:**

- Complex agentic tasks
- Advanced reasoning
- Frontend/UI development
- Tool-calling workflows
- Extended context needs (200K)

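At the listed pricing, per-call cost is easy to estimate. A back-of-envelope sketch; `estimateCostUSD` is an illustrative helper, not a provider API:

```javascript
// Cost estimate for GLM-4.6 at the listed pricing:
// $0.40 per 1M input tokens, $1.75 per 1M output tokens.
// Illustrative helper, not part of the z.ai API.
function estimateCostUSD(inputTokens, outputTokens) {
  return (inputTokens / 1e6) * 0.40 + (outputTokens / 1e6) * 1.75;
}

// e.g. a 10K-token prompt with a 2K-token completion
const cost = estimateCostUSD(10000, 2000);
```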
### GLM-4.6V (Vision - Latest)

```json
{
  "model_id": "glm-4.6v",
  "best_for": ["image_analysis", "multimodal", "document_processing", "video_understanding"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["Native tool calling", "150 pages/1hr video input", "SOTA vision understanding"],
  "parameters": "106B",
  "released": "2025-12-08"
}
```

**Use when:**

- Image analysis and understanding
- Document OCR and processing
- Video content analysis
- Multimodal reasoning

### GLM-4.6V-Flash (Vision - Lightweight)

```json
{
  "model_id": "glm-4.6v-flash",
  "best_for": ["fast_image_analysis", "local_deployment", "low_latency"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["9B parameters", "Fast inference", "Local deployable"],
  "parameters": "9B",
  "released": "2025-12-08"
}
```

**Use when:**

- Quick image classification
- Edge/local deployment
- Low-latency vision tasks

### GLM-4.5 (Previous Flagship)

```json
{
  "model_id": "glm-4.5",
  "best_for": ["reasoning", "tool_use", "coding", "agents"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.5,
  "strengths": ["355B MoE", "32B active params", "Proven stability"],
  "parameters": "355B (32B active)"
}
```

**Use when:**

- A proven, stable model is needed
- Standard reasoning tasks
- Backward compatibility

### GLM-4.5-Air (Efficient)

```json
{
  "model_id": "glm-4.5-air",
  "best_for": ["cost_efficient", "standard_tasks", "high_volume"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.5,
  "strengths": ["106B MoE", "12B active params", "Cost effective"],
  "parameters": "106B (12B active)"
}
```

**Use when:**

- Cost-sensitive operations
- High-volume processing
- Standard quality is acceptable

### GLM-4.5-Flash (Fast)

```json
{
  "model_id": "glm-4.5-flash",
  "best_for": ["ultra_fast", "simple_tasks", "streaming"],
  "context_window": 32000,
  "max_output": 2048,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.3,
  "strengths": ["Fastest inference", "Lowest cost", "Simple tasks"]
}
```

**Use when:**

- Real-time responses needed
- Simple classification/extraction
- Budget constraints

### GLM-Z1-Rumination-32B (Deep Reasoning)

```json
{
  "model_id": "glm-z1-rumination-32b-0414",
  "best_for": ["deep_reasoning", "complex_analysis", "deliberation"],
  "context_window": 128000,
  "max_output": 4096,
  "temperature_range": [0.0, 1.0],
  "recommended_temp": 0.7,
  "strengths": ["Rumination capability", "Step-by-step reasoning", "Complex problems"],
  "released": "2025-04-14"
}
```

**Use when:**

- Complex multi-step reasoning
- Problems requiring deliberation
- Chain-of-thought tasks

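The per-model limits in the cards above can be kept in a small registry for validating requests before they are sent. An illustrative structure, not an official SDK:

```javascript
// Context-window registry distilled from the model cards above.
// Illustrative helpers; not part of any official z.ai SDK.
const MODEL_LIMITS = {
  'glm-4.6':        { context: 202752, maxOutput: 128000 },
  'glm-4.6v':       { context: 128000, maxOutput: 4096 },
  'glm-4.6v-flash': { context: 128000, maxOutput: 4096 },
  'glm-4.5':        { context: 128000, maxOutput: 4096 },
  'glm-4.5-air':    { context: 128000, maxOutput: 4096 },
  'glm-4.5-flash':  { context: 32000,  maxOutput: 2048 },
  'glm-z1-rumination-32b-0414': { context: 128000, maxOutput: 4096 }
};

// True only for a known model whose context window fits the prompt.
function fitsContext(modelId, promptTokens) {
  const limits = MODEL_LIMITS[modelId];
  return Boolean(limits) && promptTokens <= limits.context;
}
```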
## Model Selection Logic

```javascript
function selectZAIModel(taskType, contextLength, needsVision = false) {
  // Vision tasks
  if (needsVision) {
    return contextLength > 64000 ? 'glm-4.6v' : 'glm-4.6v-flash';
  }

  // Context-based selection
  if (contextLength > 128000) {
    return 'glm-4.6'; // 200K context
  }

  const modelMap = {
    // Flagship tasks
    'agentic': 'glm-4.6',
    'frontend': 'glm-4.6',
    'tool_use': 'glm-4.6',

    // Deep reasoning
    'deep_reasoning': 'glm-z1-rumination-32b-0414',
    'deliberation': 'glm-z1-rumination-32b-0414',

    // Standard reasoning
    'reasoning': 'glm-4.5',
    'analysis': 'glm-4.5',
    'planning': 'glm-4.5',
    'coding': 'glm-4.5',

    // Cost-efficient
    'cost_efficient': 'glm-4.5-air',
    'high_volume': 'glm-4.5-air',

    // Fast operations
    'classification': 'glm-4.5-flash',
    'extraction': 'glm-4.5-flash',
    'simple_qa': 'glm-4.5-flash',
    'streaming': 'glm-4.5-flash',

    // Default to flagship
    'default': 'glm-4.6'
  };

  return modelMap[taskType] || modelMap.default;
}
```

Within `callWithFallback(systemPrompt, userPrompt, options = {})`, rate-limit and server errors trigger the fallback to z.ai:

```javascript
// Rate limit or server error - fall back to z.ai
if ([429, 500, 502, 503, 504].includes(errorCode)) {
  console.log('Falling back to z.ai GLM-4.6');
  return await callZAI(systemPrompt, userPrompt, options);
}
```

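The status-code test above can be factored into small, self-contained helpers, with exponential backoff between retries before giving up and falling back. `shouldFallback` and `backoffMs` are illustrative names, not part of n8n or the z.ai API:

```javascript
// Decide whether an HTTP status warrants falling back to z.ai,
// and compute an exponential backoff delay for retryable errors.
// Illustrative helpers for this skill, not a library API.
const FALLBACK_CODES = new Set([429, 500, 502, 503, 504]);

function shouldFallback(statusCode) {
  return FALLBACK_CODES.has(statusCode);
}

function backoffMs(attempt, baseMs = 500, capMs = 8000) {
  // attempt 0, 1, 2, ... -> 500ms, 1s, 2s, ... capped at 8s
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```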
`shouldPreferGLM(task)` lists the task types where GLM is the better choice even when synthetic.new is available:

```javascript
function shouldPreferGLM(task) {
  const glmSuperiorTasks = [
    'chinese_translation',
    'chinese_content',
    'extended_context_200k',
    'vision_analysis',
    'multimodal',
    'frontend_development',
    'deep_rumination',
    'cost_optimization'
  ];

  return glmSuperiorTasks.includes(task.type);
}
```

HTTP request template (n8n-style variables):

```json
{
  "method": "POST",
  "url": "https://open.bigmodel.cn/api/paas/v4/chat/completions",
  "headers": {
    "Authorization": "Bearer {{ $env.Z_AI_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "model": "glm-4.6",
    "messages": [
      { "role": "system", "content": "{{ systemPrompt }}" },
      { "role": "user", "content": "{{ userPrompt }}" }
    ]
  }
}
```

```javascript
// z.ai Request Helper for n8n Code Node
async function callZAI(systemPrompt, userPrompt, options = {}) {
  const {
    model = 'glm-4.6',
    maxTokens = 4000,
    temperature = 0.5
  } = options;

  const response = await $http.request({
    method: 'POST',
    url: 'https://open.bigmodel.cn/api/paas/v4/chat/completions',
    headers: {
      'Authorization': `Bearer ${$env.Z_AI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    // ... (request body and response handling continue in the full helper)
```

Comparison with synthetic.new:

| Feature | synthetic.new | z.ai |
|---------|---------------|------|
| Primary Use | All tasks | Fallback + GLM tasks |
| Best Model (Code) | DeepSeek-V3 | GLM-4.6 |
| Best Model (Reasoning) | Kimi-K2-Thinking | GLM-Z1-Rumination |
| Best Model (Vision) | N/A | GLM-4.6V |
| Max Context | 200K | 200K (GLM-4.6) |
| Chinese Support | Good | Excellent |
| Rate Limit | 100/min | 60/min |
| Cost (Input) | ~$0.50/M | $0.40/M (GLM-4.6) |
| Open Source | No | Yes (MIT) |

## Model Hierarchy (Recommended)

```
Task Complexity:

HIGH   → GLM-Z1-Rumination (deep reasoning)
       → GLM-4.6 (agentic, coding)
       → GLM-4.6V (vision tasks)

MEDIUM → GLM-4.5 (standard tasks)
       → GLM-4.5-Air (cost-efficient)

LOW    → GLM-4.5-Flash (fast, simple)
       → GLM-4.6V-Flash (fast vision)
```

## Setup Instructions

### 1. Get API Key

1. Visit https://z.ai/dashboard or https://open.bigmodel.cn
2. Create an account or log in
3. Navigate to API Keys
4. Generate a new key

### 2. Configure in Coolify

```bash
# Add to service environment variables
Z_AI_API_KEY=your_key_here
```

### 3. Test Connection

```bash
curl -X POST https://open.bigmodel.cn/api/paas/v4/chat/completions \
  -H "Authorization: Bearer $Z_AI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.6",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 50
  }'
```

| Code | Meaning | Action |
|------|---------|--------|
| 429 | Rate limit | Wait and retry |
| 400 | Invalid model | Check model name |

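The "wait and retry" advice for 429 in the table above can be sketched as a small wrapper. `withRetry` and `doRequest` are illustrative names; nothing here is z.ai-specific:

```javascript
// Retry a request on 429 with doubling delays, up to maxAttempts.
// `doRequest` is a caller-supplied async function returning
// { status, body }. Illustrative sketch, not a library API.
async function withRetry(doRequest, maxAttempts = 3) {
  let last;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await doRequest();
    if (last.status !== 429) return last; // success or non-retryable error
    // wait and retry, doubling the delay each attempt
    await new Promise(resolve => setTimeout(resolve, 100 * 2 ** attempt));
  }
  return last; // still rate-limited after maxAttempts
}
```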
## References

- [GLM-4.6 Announcement](https://z.ai/blog/glm-4.6)
- [GLM-4.6V Multimodal](https://z.ai/blog/glm-4.6v)
- [OpenRouter GLM-4.6](https://openrouter.ai/z-ai/glm-4.6)
- [Hugging Face Models](https://huggingface.co/zai-org)

## Related Skills

- `ai-providers/synthetic-new.md` - Primary provider
- `code/implement.md` - Code generation