Supported Models

UniCraft supports a wide range of AI models from multiple providers, allowing you to choose the best model for your specific use case.

Models designed for conversational AI and chat applications.

Models for text completion and generation tasks.

Models for converting text into vector representations.

Models that can process both text and images.

  • Model ID: gpt-4
  • Context Length: 8,192 tokens
  • Best For: Complex reasoning, analysis, creative writing
  • Pricing: $0.03/1K input, $0.06/1K output
  • Availability: High
  • Model ID: gpt-4-turbo
  • Context Length: 128,000 tokens
  • Best For: Large context tasks, document analysis
  • Pricing: $0.01/1K input, $0.03/1K output
  • Availability: High
  • Model ID: gpt-4-vision-preview
  • Context Length: 128,000 tokens
  • Best For: Image analysis, multimodal tasks
  • Pricing: $0.01/1K input, $0.03/1K output
  • Availability: Medium
  • Model ID: gpt-3.5-turbo
  • Context Length: 4,096 tokens
  • Best For: General-purpose tasks, cost-effective
  • Pricing: $0.001/1K input, $0.002/1K output
  • Availability: High
  • Model ID: gpt-3.5-turbo-16k
  • Context Length: 16,384 tokens
  • Best For: Longer conversations, document processing
  • Pricing: $0.003/1K input, $0.004/1K output
  • Availability: High
  • Model ID: text-embedding-ada-002
  • Dimensions: 1,536
  • Best For: General-purpose embeddings
  • Pricing: $0.0001/1K tokens
  • Availability: High
  • Model ID: text-embedding-3-small
  • Dimensions: 1,536
  • Best For: Faster, cost-effective embeddings
  • Pricing: $0.00002/1K tokens
  • Availability: High
  • Model ID: text-embedding-3-large
  • Dimensions: 3,072
  • Best For: High-quality embeddings
  • Pricing: $0.00013/1K tokens
  • Availability: High
  • Model ID: claude-3-opus-20240229
  • Context Length: 200,000 tokens
  • Best For: Complex reasoning, analysis, creative tasks
  • Pricing: $0.015/1K input, $0.075/1K output
  • Availability: High
  • Model ID: claude-3-sonnet-20240229
  • Context Length: 200,000 tokens
  • Best For: Balanced performance and cost
  • Pricing: $0.003/1K input, $0.015/1K output
  • Availability: High
  • Model ID: claude-3-haiku-20240307
  • Context Length: 200,000 tokens
  • Best For: Fast, cost-effective responses
  • Pricing: $0.00025/1K input, $0.00125/1K output
  • Availability: High
  • Model ID: gemini-pro
  • Context Length: 32,000 tokens
  • Best For: General-purpose tasks, multilingual
  • Pricing: $0.0005/1K input, $0.0015/1K output
  • Availability: High
  • Model ID: gemini-pro-vision
  • Context Length: 16,000 tokens
  • Best For: Image analysis, multimodal tasks
  • Pricing: $0.0005/1K input, $0.0015/1K output
  • Availability: Medium
  • Model ID: palm-2-text-bison
  • Context Length: 8,192 tokens
  • Best For: Text generation, summarization
  • Pricing: $0.0005/1K input, $0.0015/1K output
  • Availability: High
  • Model ID: palm-2-chat-bison
  • Context Length: 8,192 tokens
  • Best For: Conversational AI, chat applications
  • Pricing: $0.0005/1K input, $0.0015/1K output
  • Availability: High
  • Model ID: command
  • Context Length: 4,096 tokens
  • Best For: General-purpose tasks, enterprise use
  • Pricing: $0.001/1K input, $0.002/1K output
  • Availability: High
  • Model ID: command-light
  • Context Length: 4,096 tokens
  • Best For: Fast, cost-effective responses
  • Pricing: $0.0005/1K input, $0.001/1K output
  • Availability: High
  • Model ID: embed-english-v2.0
  • Dimensions: 1,024
  • Best For: English text embeddings
  • Pricing: $0.0001/1K tokens
  • Availability: High
  • Model ID: embed-multilingual-v2.0
  • Dimensions: 1,024
  • Best For: Multilingual text embeddings
  • Pricing: $0.0001/1K tokens
  • Availability: High
  • Model ID: meta-llama/Llama-2-70b-chat-hf
  • Context Length: 4,096 tokens
  • Best For: Open-source alternative, research
  • Pricing: Varies
  • Availability: Medium
  • Model ID: meta-llama/Llama-2-13b-chat-hf
  • Context Length: 4,096 tokens
  • Best For: Balanced performance and cost
  • Pricing: Varies
  • Availability: High
  • Model ID: meta-llama/Llama-2-7b-chat-hf
  • Context Length: 4,096 tokens
  • Best For: Fast, cost-effective responses
  • Pricing: Varies
  • Availability: High
  • Model ID: mistralai/Mistral-7B-Instruct-v0.1
  • Context Length: 8,192 tokens
  • Best For: Fast, efficient responses
  • Pricing: Varies
  • Availability: High
  • Model ID: sentence-transformers/all-MiniLM-L6-v2
  • Dimensions: 384
  • Best For: General-purpose embeddings
  • Pricing: Varies
  • Availability: High
  • Model ID: sentence-transformers/all-mpnet-base-v2
  • Dimensions: 768
  • Best For: High-quality embeddings
  • Pricing: Varies
  • Availability: High
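To make the pricing figures concrete, the sketch below estimates the cost of one request from its token counts. The `PRICES` table copies a few per-1K-token rates from the list above; the `estimateCost` helper is illustrative only and is not part of UniCraft's SDK.

```javascript
// Per-1K-token prices in USD, copied from the model list above.
const PRICES = {
  "gpt-4": { input: 0.03, output: 0.06 },
  "gpt-3.5-turbo": { input: 0.001, output: 0.002 },
  "claude-3-haiku-20240307": { input: 0.00025, output: 0.00125 },
};

// Hypothetical helper: estimate the USD cost of a single request.
function estimateCost(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) {
    throw new Error(`No price listed for model: ${model}`);
  }
  return (
    (inputTokens / 1000) * price.input +
    (outputTokens / 1000) * price.output
  );
}

// Compare 1,000 input tokens and 500 output tokens across the listed models.
for (const model of Object.keys(PRICES)) {
  console.log(model, estimateCost(model, 1000, 500).toFixed(6));
}
```

For that request size, gpt-4 comes to roughly $0.06 and gpt-3.5-turbo to roughly $0.002, which illustrates how quickly the per-token difference adds up at volume.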
  • Best: GPT-4, Claude 3 Opus
  • Good: GPT-3.5 Turbo, Claude 3 Sonnet
  • Budget: GPT-3.5 Turbo, Claude 3 Haiku
  • Best: GPT-4, Claude 3 Opus
  • Good: GPT-3.5 Turbo, Claude 3 Sonnet
  • Budget: GPT-3.5 Turbo, Claude 3 Haiku
  • Best: GPT-4, Claude 3 Opus
  • Good: Claude 3 Sonnet, GPT-3.5 Turbo
  • Budget: Claude 3 Haiku, GPT-3.5 Turbo
  • Best: GPT-3.5 Turbo, Claude 3 Haiku
  • Good: Gemini Pro, Command Light
  • Budget: GPT-3.5 Turbo, Claude 3 Haiku
  • Best: Text Embedding 3 Large, All-mpnet-base-v2
  • Good: Text Embedding Ada 002, All-MiniLM-L6-v2
  • Budget: Text Embedding 3 Small, All-MiniLM-L6-v2
  1. GPT-4
  2. Claude 3 Opus
  3. GPT-4 Turbo
  1. Claude 3 Haiku
  2. GPT-3.5 Turbo
  3. Gemini Pro
  1. Claude 3 Haiku
  2. GPT-3.5 Turbo
  3. Command Light
  1. GPT-4
  2. Claude 3 Opus
  3. GPT-4 Turbo
const modelConfig = {
  model: "gpt-3.5-turbo",
  temperature: 0.7,
  max_tokens: 1000,
  top_p: 0.9,
};
const advancedConfig = {
  model: "gpt-4",
  temperature: 0.7,
  max_tokens: 2000,
  top_p: 0.9,
  frequency_penalty: 0.0,
  presence_penalty: 0.0,
  stop: ["\n\n", "Human:", "Assistant:"],
};
const smartRouting = {
  model: "auto",
  routing_strategy: "cost_optimized",
  max_cost_per_request: 0.01,
  quality_threshold: 0.8,
  // Model IDs as they appear in the model list above.
  preferred_models: ["gpt-3.5-turbo", "claude-3-haiku-20240307"],
};
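This page does not document how UniCraft's routing works internally, so the following is only a client-side sketch of what a cost-optimized strategy could look like: choose the cheapest preferred model whose estimated cost stays within `max_cost_per_request`. The `CANDIDATES` table, the `pickModel` function, and the way the cost limit is applied are assumptions made for illustration. `quality_threshold` is left out because the page does not define how quality is scored.

```javascript
// Per-1K-token prices in USD, copied from the model list on this page.
const CANDIDATES = [
  { model: "gpt-4", input: 0.03, output: 0.06 },
  { model: "gpt-3.5-turbo", input: 0.001, output: 0.002 },
  { model: "claude-3-haiku-20240307", input: 0.00025, output: 0.00125 },
];

// Hypothetical selector, NOT UniCraft's actual routing algorithm.
// Returns the cheapest preferred model within budget, or null if none fits.
function pickModel(config, inputTokens, maxOutputTokens) {
  const affordable = CANDIDATES
    .filter((c) => config.preferred_models.includes(c.model))
    .map((c) => ({
      model: c.model,
      // Worst-case cost: assume the reply uses the full output budget.
      cost:
        (inputTokens / 1000) * c.input +
        (maxOutputTokens / 1000) * c.output,
    }))
    .filter((c) => c.cost <= config.max_cost_per_request)
    .sort((a, b) => a.cost - b.cost);
  return affordable.length > 0 ? affordable[0].model : null;
}

const routing = {
  max_cost_per_request: 0.01,
  preferred_models: ["gpt-3.5-turbo", "claude-3-haiku-20240307"],
};

// 2,000 input tokens with up to 500 output tokens.
console.log(pickModel(routing, 2000, 500));
```

Estimating against the maximum output length is a deliberately conservative choice: the real cost can only be lower, so a request never exceeds the configured limit.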
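// Assumes `unicraft` is an initialized UniCraft client and that this code runs
// in an ES module, where top-level `await` is allowed.
const models = ["gpt-3.5-turbo", "claude-3-haiku-20240307", "gemini-pro"];
const testPrompt = "Explain quantum computing in simple terms";

// Send the same prompt to each model and print the replies for comparison.
for (const model of models) {
  const response = await unicraft.chat.completions.create({
    model,
    messages: [{ role: "user", content: testPrompt }],
    max_tokens: 200,
  });
  console.log(`${model}: ${response.choices[0].message.content}`);
}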
// Time a single request and collect the metadata returned with the response.
const performanceTest = async (model, prompt) => {
  const start = Date.now();
  const response = await unicraft.chat.completions.create({
    model,
    messages: [{ role: "user", content: prompt }],
    max_tokens: 100,
  });
  const end = Date.now();

  return {
    model,
    response_time: end - start, // milliseconds
    cost: response.unicraft.cost,
    quality: response.unicraft.quality_score,
  };
};
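One way to use a test function like the one above is to run it across several models and rank the results. The sketch below is a minimal example; `compareModels` is a hypothetical helper, and `fakeTest` is a stand-in with placeholder timings so the code runs without a UniCraft client. In practice you would pass `performanceTest` instead.

```javascript
// Run a test function against several models and return the results
// sorted from fastest to slowest.
async function compareModels(models, prompt, testFn) {
  const results = [];
  for (const model of models) {
    // Run sequentially so the requests do not compete with each other.
    results.push(await testFn(model, prompt));
  }
  return results.sort((a, b) => a.response_time - b.response_time);
}

// Stand-in for performanceTest. The timings are made-up placeholders,
// not measurements of any real model.
const fakeTest = async (model) => ({
  model,
  response_time: model === "gpt-4" ? 900 : 300,
});

compareModels(["gpt-4", "gpt-3.5-turbo"], "Hello", fakeTest).then((ranked) => {
  for (const result of ranked) {
    console.log(`${result.model}: ${result.response_time} ms`);
  }
});
```

A single request per model is a noisy measurement, so repeating each test several times and averaging gives a fairer comparison.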
  • Choose models based on your specific use case
  • Consider cost vs. quality trade-offs
  • Test multiple models before deciding
  • Use appropriate temperature settings
  • Set reasonable max_tokens limits
  • Configure stop sequences when needed
  • Use smart routing for automatic model selection
  • Implement caching for repeated requests
  • Monitor costs and performance
  • Track model performance metrics
  • Monitor costs and usage
  • Set up alerts for issues
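The caching recommendation above can be sketched as a small in-memory wrapper around any async request function. `withCache` and `fakeCreate` are hypothetical names used for illustration. With a real client you would wrap a function such as `(params) => unicraft.chat.completions.create(params)`.

```javascript
// Wrap an async request function with an in-memory cache keyed by the request.
function withCache(requestFn) {
  const cache = new Map();
  return async function cachedRequest(params) {
    // JSON.stringify is sensitive to key order, so build params consistently.
    const key = JSON.stringify(params);
    if (!cache.has(key)) {
      // Store the promise so concurrent identical requests share one call.
      const pending = requestFn(params);
      cache.set(key, pending);
      // Forget failed requests so that a later call can retry.
      pending.catch(() => cache.delete(key));
    }
    return cache.get(key);
  };
}

// Stand-in for a real client call; counts how often it is invoked.
let calls = 0;
const fakeCreate = async (params) => {
  calls += 1;
  return { content: `reply to: ${params.messages[0].content}` };
};

const cachedCreate = withCache(fakeCreate);
const request = {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: "Hi" }],
};

cachedCreate(request)
  .then(() => cachedCreate(request))
  .then(() => console.log(`underlying calls: ${calls}`)); // underlying calls: 1
```

Cached replies are reused verbatim, so this approach fits requests where returning the same answer twice is acceptable. A production cache would also need an eviction policy or expiry time, which this sketch omits.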
  1. Model Not Available

    • Check provider configuration
    • Verify model availability
    • Use alternative models
  2. Poor Quality Responses

    • Adjust temperature settings
    • Improve prompt quality
    • Try different models
  3. High Costs

    • Use cost-effective models
    • Optimize prompts
    • Implement caching
  4. Slow Responses

    • Use faster models
    • Optimize requests
    • Check provider status
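For the "Model Not Available" case, the "use alternative models" advice can be sketched as a fallback loop that tries each model in order. `createWithFallback` and `fakeCreate` are hypothetical names used for illustration. With a real client, `createFn` would be something like `(params) => unicraft.chat.completions.create(params)`.

```javascript
// Try each model in order and return the first successful response.
async function createWithFallback(createFn, models, request) {
  let lastError;
  for (const model of models) {
    try {
      return await createFn({ ...request, model });
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  // Every model failed (or the list was empty).
  throw lastError ?? new Error("No models were provided");
}

// Stand-in for a real client call: "gpt-4" is treated as unavailable here.
const fakeCreate = async (params) => {
  if (params.model === "gpt-4") {
    throw new Error("Model not available");
  }
  return { model: params.model, content: "ok" };
};

createWithFallback(fakeCreate, ["gpt-4", "gpt-3.5-turbo"], {
  messages: [{ role: "user", content: "Hello" }],
}).then((response) => console.log(`answered by ${response.model}`));
```

This sketch retries on any error. A more careful version would fall back only on availability errors and surface problems such as invalid requests immediately.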
  1. Test Models: Test different models with your use case
  2. Monitor Metrics: Track performance and cost metrics
  3. Optimize Prompts: Improve prompt quality and structure
  4. Use Smart Routing: Let UniCraft choose the best model

After selecting models:

  1. Test Models: Test different models with your use case
  2. Configure Routing: Set up smart routing rules
  3. Monitor Performance: Track model performance and costs
  4. Optimize Usage: Optimize based on performance data
  5. Scale as Needed: Add more models as needed