Ollama

实例

安装

配置

Ollama 下载的模型模型保存在 C 盘，如果想更改默认路径的话，可以通过设置 OLLAMA_MODELS 进行修改。 OLLAMA_MODELS: F:\OllamaModels

Ollama 默认提供 OpenAI 的兼容 API，默认端口是 11434，默认只可以通过 localhost 进行访问，如果想公开访问的话，可以通过设置 OLLAMA_HOST 进行修改。 OLLAMA_HOST：192.168.3.130

命令行

C:\Users\yang>ollama
Usage:
  ollama [flags]
  ollama [command]
 
Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

直接 ollama run 模型名称即可

支持模型列表

Ollama supports a list of models available on ollama.com/library

Here are some example models that can be downloaded:

Model	Parameters	Size	Download
Llama 2	7B	3.8GB	`ollama run llama2`
Mistral	7B	4.1GB	`ollama run mistral`
Dolphin Phi	2.7B	1.6GB	`ollama run dolphin-phi`
Phi-2	2.7B	1.7GB	`ollama run phi`
Neural Chat	7B	4.1GB	`ollama run neural-chat`
Starling	7B	4.1GB	`ollama run starling-lm`
Code Llama	7B	3.8GB	`ollama run codellama`
Llama 2 Uncensored	7B	3.8GB	`ollama run llama2-uncensored`
Llama 2 13B	13B	7.3GB	`ollama run llama2:13b`
Llama 2 70B	70B	39GB	`ollama run llama2:70b`
Orca Mini	3B	1.9GB	`ollama run orca-mini`
Vicuna	7B	3.8GB	`ollama run vicuna`
LLaVA	7B	4.5GB	`ollama run llava`
Gemma	2B	1.4GB	`ollama run gemma:2b`
Gemma	7B	4.8GB	`ollama run gemma:7b`

Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models. 应该至少有8 GB的RAM可用于运行7B型号，16 GB可用于运行13B型号，32 GB可用于执行33B型号。

Gemma

Gemma is a new open model developed by Google and its DeepMind team. It’s inspired by Gemini models at Google.

Gemma is available in both 2b and 7b parameter sizes:

ollama run gemma:2b
ollama run gemma:7b (default)

The models undergo training on a diverse dataset of web documents to expose them to a wide range of linguistic styles, topics, and vocabularies. This includes code to learn syntax and patterns of programming languages, as well as mathematical text to grasp logical reasoning.

ollama run gemma:7b ollama run llama2:13b

使用 open-webui

安装后只能使用命令行, 或者使用它的 API.. 装一个 open-webui

https://github.com/open-webui/open-webui https://docs.openwebui.com/getting-started/

git clone https://github.com/open-webui/open-webui.git
cd open-webui/

# 编译前端
npm i
npm run build


# 需要对接 ollama 接口, 所以还一个后端
# Serving Frontend with the Backend
cd ./backend
pip install -r requirements.txt -U
start_windows.sh # windows 使用这个脚本

运行 ollama serve

C:\Users\yang>ollama serve -h
Start ollama

Usage:
  ollama serve [flags]

Aliases:
  serve, start

Flags:
  -h, --help   help for serve

Environment Variables:
OLLAMA_HOST         The host:port to bind to (default "127.0.0.1:11434")
OLLAMA_ORIGINS      A comma separated list of allowed origins.
OLLAMA_MODELS       The path to the models directory (default is "~/.ollama/models")
OLLAMA_KEEP_ALIVE   The duration that models stay loaded in memory (default is "5m")

ollama serve

启动 webui docker

docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.20.104:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
 
 
#  ghcr 太鸡儿慢了, 使用镜像地址
# 将 gcr.io 替换为 gcr.nju.edu.cn
# 将 k8s.gcr.io 替换为 gcr.nju.edu.cn/google-containers
# 将 ghcr.io 替换为 ghcr.nju.edu.cn
# 将 nvcr.io 替换为 ngc.nju.edu.cn
# 将 quay.io 替换为 quay.nju.edu.cn
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.128.1:11434 -v /home/yang/open-webui:/app/backend/data --name open-webui --restart always ghcr.nju.edu.cn/open-webui/open-webui:main

docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://172.16.21.155:11434 -v /home/yang/open-webui:/app/backend/data —name open-webui —restart always ghcr.nju.edu.cn/open-webui/open-webui:main

Quartz 4

Explorer

Ollama 实例一则

Ollama

实例

安装

配置

命令行

支持模型列表

Gemma

使用 open-webui

运行 ollama serve

启动 webui docker

Graph View

Table of Contents

Backlinks