2024
ollama 官页 ollama github OpenAI Translator: https://github.com/openai-translator/openai-translator
Ollama
实例
安装
下载安装之 https://ollama.com/
配置
Ollama 下载的模型模型保存在 C 盘,如果想更改默认路径的话,可以通过设置 OLLAMA_MODELS 进行修改。
OLLAMA_MODELS: F:\OllamaModels
Ollama 默认提供 OpenAI 的兼容 API,默认端口是 11434,默认只可以通过 localhost 进行访问,如果想公开访问的话,可以通过设置 OLLAMA_HOST 进行修改。
OLLAMA_HOST:192.168.3.130
命令行
C:\Users\yang>ollama
Usage:
ollama [flags]
ollama [command]
Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
pull Pull a model from a registry
push Push a model to a registry
list List models
cp Copy a model
rm Remove a model
help Help about any command直接 ollama run 模型名称即可
支持模型列表
Ollama supports a list of models available on ollama.com/library
Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
|---|---|---|---|
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Dolphin Phi | 2.7B | 1.6GB | ollama run dolphin-phi |
| Phi-2 | 2.7B | 1.7GB | ollama run phi |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
| Orca Mini | 3B | 1.9GB | ollama run orca-mini |
| Vicuna | 7B | 3.8GB | ollama run vicuna |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Gemma | 2B | 1.4GB | ollama run gemma:2b |
| Gemma | 7B | 4.8GB | ollama run gemma:7b |
Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models. 应该至少有8 GB的RAM可用于运行7B型号,16 GB可用于运行13B型号,32 GB可用于执行33B型号。
Gemma
Gemma is a new open model developed by Google and its DeepMind team. It’s inspired by Gemini models at Google.
Gemma is available in both 2b and 7b parameter sizes:
ollama run gemma:2bollama run gemma:7b(default)
The models undergo training on a diverse dataset of web documents to expose them to a wide range of linguistic styles, topics, and vocabularies. This includes code to learn syntax and patterns of programming languages, as well as mathematical text to grasp logical reasoning.
ollama run gemma:7b ollama run llama2:13b
使用 open-webui
安装后只能使用命令行, 或者使用它的 API.. 装一个 open-webui
https://github.com/open-webui/open-webui https://docs.openwebui.com/getting-started/
git clone https://github.com/open-webui/open-webui.git
cd open-webui/
# 编译前端
npm i
npm run build
# 需要对接 ollama 接口, 所以还一个后端
# Serving Frontend with the Backend
cd ./backend
pip install -r requirements.txt -U
start_windows.sh # windows 使用这个脚本
运行 ollama serve
C:\Users\yang>ollama serve -h
Start ollama
Usage:
ollama serve [flags]
Aliases:
serve, start
Flags:
-h, --help help for serve
Environment Variables:
OLLAMA_HOST The host:port to bind to (default "127.0.0.1:11434")
OLLAMA_ORIGINS A comma separated list of allowed origins.
OLLAMA_MODELS The path to the models directory (default is "~/.ollama/models")
OLLAMA_KEEP_ALIVE The duration that models stay loaded in memory (default is "5m")
ollama serve
启动 webui docker
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.20.104:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
# ghcr 太鸡儿慢了, 使用镜像地址
# 将 gcr.io 替换为 gcr.nju.edu.cn
# 将 k8s.gcr.io 替换为 gcr.nju.edu.cn/google-containers
# 将 ghcr.io 替换为 ghcr.nju.edu.cn
# 将 nvcr.io 替换为 ngc.nju.edu.cn
# 将 quay.io 替换为 quay.nju.edu.cn
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.128.1:11434 -v /home/yang/open-webui:/app/backend/data --name open-webui --restart always ghcr.nju.edu.cn/open-webui/open-webui:maindocker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://172.16.21.155:11434 -v /home/yang/open-webui:/app/backend/data —name open-webui —restart always ghcr.nju.edu.cn/open-webui/open-webui:main