LocalAI: A drop-in replacement for OpenAI
Self-hosted, community-driven, simple local OpenAI-compatible API written in Go. It can be used as a drop-in replacement for OpenAI, running on CPU with consumer-grade hardware. Supports ggml-compatible models: LLaMA, alpaca, gpt4all, vicuna, koala, gpt4all-j.
Using LocalAI is straightforward. You can simply install LocalAI on your local machine or server via Docker and start performing inference tasks immediately. No more talking, let's start:
- Install Docker on your PC or server (installation depends on the OS type; see the official Docker docs at https://docs.docker.com/get-docker/).
- Open a terminal or cmd and clone the LocalAI repo from GitHub:
git clone https://github.com/go-skynet/LocalAI
- Go to the LocalAI/models folder in the terminal:
cd LocalAI/models
- Download the model (here I use the gpt4all-j model; it comes with an Apache 2.0 license, so it can be used for commercial purposes):
wget https://gpt4all.io/models/ggml-gpt4all-j.bin
Here I use wget to download; you can also download the bin file manually and copy-paste it into the LocalAI/models folder.
- Come back to the LocalAI root:
cd ..
- Start it with docker compose (a quick sanity check is sketched right after this list):
docker compose up -d --build
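Before calling the API, you can sanity-check the setup from the same terminal. A minimal sketch (it assumes the compose file exposes port 8080, which is LocalAI's default; adjust if yours differs):
docker --version                 # confirm Docker itself is installed
docker compose ps                # the LocalAI container should be listed as running
# poll until the API starts answering (the first startup can take a while)
until curl -s http://localhost:8080/v1/models > /dev/null; do sleep 2; done
echo "LocalAI is up"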
After the above process finishes, let's call our LocalAI via the terminal or cmd. Here I use curl; you can also use any other tool that can perform HTTP requests (Postman, etc.):
curl http://localhost:8080/v1/models
This request shows which models we have added to the models directory.
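For reference, with only the gpt4all-j file in the models folder, the response should look roughly like this OpenAI-style model list (a sketch; exact fields can vary between LocalAI versions):
{"object":"list","data":[{"id":"ggml-gpt4all-j","object":"model"}]}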
Let's call the AI with an actual prompt:
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{ "model": "ggml-gpt4all-j", "prompt": "Explain AI to me like A five-year-old", "temperature": 0.7 }'
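The response comes back in the OpenAI completion format, roughly like this (a sketch; the generated text and extra fields will vary):
{"object":"text_completion","model":"ggml-gpt4all-j","choices":[{"text":"..."}]}
LocalAI also mirrors the OpenAI chat endpoint, so the same model can be called with a chat-style payload:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "ggml-gpt4all-j", "messages": [{"role": "user", "content": "Explain AI to me like a five-year-old"}], "temperature": 0.7 }'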
Windows compatibility
It should work; however, you need to make sure you give enough resources to the container.
Kubernetes
You can run the API in Kubernetes; see the example deployment in the kubernetes folder of the repo.
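A minimal way to try it from the command line (a sketch; the manifest path and service name are assumptions, check the repo's kubernetes folder for the actual names):
kubectl apply -f kubernetes/deployment.yaml       # manifest path is an assumption
kubectl port-forward service/local-ai 8080:8080   # service name is an assumption
curl http://localhost:8080/v1/models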
API Support
LocalAI provides an API for running text generation as a service that follows the OpenAI reference and can be used as a drop-in. Models, once loaded the first time, are kept in memory. Example of starting the API with docker:
docker run -p 8080:8080 -ti --rm quay.io/go-skynet/local-ai:latest --models-path /path/to/models --context-size 700 --threads 4
Then you'll see:
┌───────────────────────────────────────────────────┐
│                   Fiber v2.42.0                   │
│               http://127.0.0.1:8080               │
│       (bound on host 0.0.0.0 and port 8080)       │
│                                                   │
│ Handlers ............. 1  Processes ........... 1 │
│ Prefork ....... Disabled  PID ................. 1 │
└───────────────────────────────────────────────────┘
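Note that --models-path refers to a path inside the container, so in practice you mount your local models folder as a volume; something like this (the host path is illustrative):
docker run -p 8080:8080 -v $PWD/models:/models -ti --rm quay.io/go-skynet/local-ai:latest --models-path /models --context-size 700 --threads 4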
If you want more info about the API, go to the GitHub page:
https://github.com/go-skynet/LocalAI#api