LocalAI: A drop-in replacement for OpenAI
Self-hosted, community-driven, simple local OpenAI-compatible API written in Go. It can be used as a drop-in replacement for OpenAI, running on CPU with consumer-grade hardware. Supports ggml-compatible models: LLaMA, alpaca, gpt4all, vicuna, koala, gpt4all-j.
Using LocalAI is straightforward. You can simply install LocalAI on your local machine or server via Docker and start performing inference tasks immediately. No more talking, let's start:
- Install Docker on your PC or server (installation depends on the OS type; see the official Docker docs at https://docs.docker.com/get-docker/).
- Open a terminal or cmd and clone the LocalAI repo from GitHub:
git clone https://github.com/go-skynet/LocalAI
- Go to the LocalAI/models folder in the terminal:
cd LocalAI/models
- Download the model (here I use the gpt4all-j model; it comes with an Apache 2.0 license, so it can be used for commercial purposes):
wget https://gpt4all.io/models/ggml-gpt4all-j.bin
Here I use wget to download; you can also download the bin file manually and copy-paste it into the LocalAI/models folder.
- Come back to the LocalAI root:
cd ..
- Start it with docker compose (a quick sanity check is sketched right after this list):
docker compose up -d --build
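Before calling the API, you can sanity-check the setup from the same terminal. A minimal sketch (it assumes the compose file exposes port 8080, which is LocalAI's default; adjust if yours differs):
docker --version                 # confirm Docker itself is installed
docker compose ps                # the LocalAI container should be listed as running
# poll until the API starts answering (the first startup can take a while)
until curl -s http://localhost:8080/v1/models > /dev/null; do sleep 2; done
echo "LocalAI is up"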
After the above process finishes, let's call our LocalAI via the terminal or cmd. Here I use curl; you can also use any other tool that can perform HTTP requests (Postman, etc.):
curl http://localhost:8080/v1/models
This request shows which models we have added to the models directory.
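For reference, with only the gpt4all-j file in the models folder, the response should look roughly like this OpenAI-style model list (a sketch; exact fields can vary between LocalAI versions):
{"object":"list","data":[{"id":"ggml-gpt4all-j","object":"model"}]}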
Let's call the AI with an actual prompt:
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{ "model": "ggml-gpt4all-j", "prompt": "Explain AI to me like A five-year-old", "temperature": 0.7 }'
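The response comes back in the OpenAI completion format, roughly like this (a sketch; the generated text and extra fields will vary):
{"object":"text_completion","model":"ggml-gpt4all-j","choices":[{"text":"..."}]}
LocalAI also mirrors the OpenAI chat endpoint, so the same model can be called with a chat-style payload:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "ggml-gpt4all-j", "messages": [{"role": "user", "content": "Explain AI to me like a five-year-old"}], "temperature": 0.7 }'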
Windows compatibility
It should work; however, you need to make sure you give enough resources to the container.
Kubernetes
You can run the API in Kubernetes; see the example deployment in the kubernetes folder of the repo.
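A minimal way to try it from the command line (a sketch; the manifest path and service name are assumptions, check the repo's kubernetes folder for the actual names):
kubectl apply -f kubernetes/deployment.yaml       # manifest path is an assumption
kubectl port-forward service/local-ai 8080:8080   # service name is an assumption
curl http://localhost:8080/v1/models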
API Support
LocalAI provides an API for running text generation as a service that follows the OpenAI reference and can be used as a drop-in. Models, once loaded the first time, are kept in memory. Example of starting the API with docker:
docker run -p 8080:8080 -ti --rm quay.io/go-skynet/local-ai:latest --models-path /path/to/models --context-size 700 --threads 4
Then you'll see:
┌───────────────────────────────────────────────────┐
│                   Fiber v2.42.0                   │
│               http://127.0.0.1:8080               │
│       (bound on host 0.0.0.0 and port 8080)       │
│                                                   │
│ Handlers ............. 1  Processes ........... 1 │
│ Prefork ....... Disabled  PID ................. 1 │
└───────────────────────────────────────────────────┘
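Note that --models-path refers to a path inside the container, so in practice you mount your local models folder as a volume; something like this (the host path is illustrative):
docker run -p 8080:8080 -v $PWD/models:/models -ti --rm quay.io/go-skynet/local-ai:latest --models-path /models --context-size 700 --threads 4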
If you want more info about the API, go to the GitHub page:
https://github.com/go-skynet/LocalAI#api