2024

ollama 官页 ollama github OpenAI Translator: https://github.com/openai-translator/openai-translator

Ollama

实例

安装

下载安装之 https://ollama.com/

配置

Ollama 下载的模型模型保存在 C 盘,如果想更改默认路径的话,可以通过设置 OLLAMA_MODELS 进行修改。 OLLAMA_MODELS: F:\OllamaModels

Ollama 默认提供 OpenAI 的兼容 API,默认端口是 11434,默认只可以通过 localhost 进行访问,如果想公开访问的话,可以通过设置 OLLAMA_HOST 进行修改。 OLLAMA_HOST:192.168.3.130

命令行

C:\Users\yang>ollama
Usage:
  ollama [flags]
  ollama [command]
 
Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

直接 ollama run 模型名称即可

支持模型列表

Ollama supports a list of models available on ollama.com/library

Here are some example models that can be downloaded:

ModelParametersSizeDownload
Llama 27B3.8GBollama run llama2
Mistral7B4.1GBollama run mistral
Dolphin Phi2.7B1.6GBollama run dolphin-phi
Phi-22.7B1.7GBollama run phi
Neural Chat7B4.1GBollama run neural-chat
Starling7B4.1GBollama run starling-lm
Code Llama7B3.8GBollama run codellama
Llama 2 Uncensored7B3.8GBollama run llama2-uncensored
Llama 2 13B13B7.3GBollama run llama2:13b
Llama 2 70B70B39GBollama run llama2:70b
Orca Mini3B1.9GBollama run orca-mini
Vicuna7B3.8GBollama run vicuna
LLaVA7B4.5GBollama run llava
Gemma2B1.4GBollama run gemma:2b
Gemma7B4.8GBollama run gemma:7b

Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models. 应该至少有8 GB的RAM可用于运行7B型号,16 GB可用于运行13B型号,32 GB可用于执行33B型号。

Gemma

Gemma is a new open model developed by Google and its DeepMind team. It’s inspired by Gemini models at Google.

Gemma is available in both 2b and 7b parameter sizes:

  • ollama run gemma:2b
  • ollama run gemma:7b (default)

The models undergo training on a diverse dataset of web documents to expose them to a wide range of linguistic styles, topics, and vocabularies. This includes code to learn syntax and patterns of programming languages, as well as mathematical text to grasp logical reasoning.

ollama run gemma:7b ollama run llama2:13b

使用 open-webui

安装后只能使用命令行, 或者使用它的 API.. 装一个 open-webui

https://github.com/open-webui/open-webui https://docs.openwebui.com/getting-started/

git clone https://github.com/open-webui/open-webui.git
cd open-webui/

# 编译前端
npm i
npm run build


# 需要对接 ollama 接口, 所以还一个后端
# Serving Frontend with the Backend
cd ./backend
pip install -r requirements.txt -U
start_windows.sh # windows 使用这个脚本

运行 ollama serve

C:\Users\yang>ollama serve -h
Start ollama

Usage:
  ollama serve [flags]

Aliases:
  serve, start

Flags:
  -h, --help   help for serve

Environment Variables:
OLLAMA_HOST         The host:port to bind to (default "127.0.0.1:11434")
OLLAMA_ORIGINS      A comma separated list of allowed origins.
OLLAMA_MODELS       The path to the models directory (default is "~/.ollama/models")
OLLAMA_KEEP_ALIVE   The duration that models stay loaded in memory (default is "5m")

ollama serve 

启动 webui docker

docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.20.104:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
 
 
#  ghcr 太鸡儿慢了, 使用镜像地址
# 将 gcr.io 替换为 gcr.nju.edu.cn
# 将 k8s.gcr.io 替换为 gcr.nju.edu.cn/google-containers
# 将 ghcr.io 替换为 ghcr.nju.edu.cn
# 将 nvcr.io 替换为 ngc.nju.edu.cn
# 将 quay.io 替换为 quay.nju.edu.cn
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.128.1:11434 -v /home/yang/open-webui:/app/backend/data --name open-webui --restart always ghcr.nju.edu.cn/open-webui/open-webui:main

docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://172.16.21.155:11434 -v /home/yang/open-webui:/app/backend/data —name open-webui —restart always ghcr.nju.edu.cn/open-webui/open-webui:main