# LLMs

## Artefact Azure-hosted GPT4-turbo
```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: AzureChatOpenAI
  source_config:
    openai_api_type: azure
    openai_api_key: {{ OPENAI_API_KEY }}
    openai_api_base: https://genai-ds.openai.azure.com/
    openai_api_version: 2023-07-01-preview
    deployment_name: gpt4
    temperature: 0.1
```
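The `source` / `source_config` pattern above can be read as: `source` names a chat-model class, and the keys under `source_config` are forwarded to it as keyword arguments. A minimal sketch of that resolution logic, using a stand-in dataclass in place of the real `AzureChatOpenAI` class (class names and the dummy key are illustrative assumptions, not the actual backend implementation):

```python
from dataclasses import dataclass


@dataclass
class AzureChatOpenAI:  # stand-in for the real chat-model class
    openai_api_type: str
    openai_api_key: str
    openai_api_base: str
    openai_api_version: str
    deployment_name: str
    temperature: float = 0.0


# Registry mapping the `source` string to a class.
SOURCES = {"AzureChatOpenAI": AzureChatOpenAI}


def build_llm(config: dict):
    """Resolve `source` to a class and pass `source_config` as kwargs."""
    cls = SOURCES[config["source"]]
    return cls(**config["source_config"])


llm = build_llm({
    "source": "AzureChatOpenAI",
    "source_config": {
        "openai_api_type": "azure",
        "openai_api_key": "dummy-key",  # placeholder, not a real key
        "openai_api_base": "https://genai-ds.openai.azure.com/",
        "openai_api_version": "2023-07-01-preview",
        "deployment_name": "gpt4",
        "temperature": 0.1,
    },
})
```

Swapping `source` and `source_config` in the YAML then selects a different model class without touching backend code.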
## Local llama2

You will first need to install and run Ollama. Download the Ollama application here. Ollama will automatically utilize the GPU on Apple devices.

```shell
ollama run llama2
```
```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: ChatOllama
  source_config:
    model: llama2
```
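If Ollama is served on a non-default host or port, the `ChatOllama` client also accepts a `base_url` parameter (it defaults to `http://localhost:11434`). A sketch, assuming keys under `source_config` are forwarded to the client as keyword arguments:

```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: ChatOllama
  source_config:
    model: llama2
    base_url: http://localhost:11434  # adjust if Ollama runs elsewhere
```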
## Vertex AI gemini-pro

> **Warning**
>
> Right now Gemini models' safety settings are very sensitive, and it is not possible to disable them. That makes this model pretty much useless for the time being, as it blocks most requests and/or responses.
>
> Github issue to follow: https://github.com/langchain-ai/langchain/pull/15344#issuecomment-1888597151
You will first need to log in to GCP:

```shell
export PROJECT_ID=<gcp_project_id>
gcloud config set project $PROJECT_ID
gcloud auth login
gcloud auth application-default login
```
```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: ChatVertexAI
  source_config:
    model_name: gemini-pro
    temperature: 0.1
```