Turn your Mac into a local ChatGPT with OpenAI's new free model

This week, OpenAI released its long-awaited open-weight model, called gpt-oss. Part of the appeal of gpt-oss is that it can run locally on your own hardware, including Macs with Apple Silicon. Here's how to get started and what to expect:

Model and Mac

First of all, gpt-oss comes in two flavors: gpt-oss-20b and gpt-oss-120b. The former is described as a medium-sized open-weight model, while the latter is a large open-weight model.

The medium-sized model is the one you can expect to run locally on an Apple Silicon Mac with ample resources. The difference? Due to the difference in scale, expect the smaller model to hallucinate more than the larger one. That's the trade-off for a faster model that can actually run on a high-end Mac.

Still, the smaller model is a neat tool that's freely available if you have a Mac with ample resources and a curiosity about running large language models locally.

You also need to be aware of the differences between running a local model and using ChatGPT, for example. By default, the open-weight local model lacks a lot of the modern chatbot features that make ChatGPT useful. For example, its answers don't factor in web search results, which often limit hallucinations.

OpenAI recommends at least 16GB of RAM for gpt-oss-20b, but Macs with more RAM will clearly see better performance. Based on early user feedback, 16GB of RAM is the floor for actually experimenting. (AI is a big reason why Apple stopped selling Macs with 8GB of RAM.)
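Not sure how much RAM your Mac has? macOS reports installed memory in bytes through `sysctl -n hw.memsize`, which you can run in Terminal directly. A small Python sketch that wraps it (the helper names are my own):

```python
import subprocess


def bytes_to_gb(num_bytes: int) -> float:
    """Convert a byte count to gigabytes (1 GB = 1024**3 bytes)."""
    return num_bytes / (1024 ** 3)


def mac_ram_gb() -> float:
    """Return installed RAM in GB on macOS, via `sysctl -n hw.memsize`."""
    out = subprocess.run(
        ["sysctl", "-n", "hw.memsize"],
        capture_output=True, text=True, check=True,
    )
    return bytes_to_gb(int(out.stdout.strip()))


# Example (on a Mac):
#   mac_ram_gb()  # 16.0 on a 16GB machine — the floor for gpt-oss-20b
```

Anything at or above 16.0 meets OpenAI's stated minimum for gpt-oss-20b.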

Setup and use

Preamble aside, it’s really easy to get started.

First, install Ollama on your Mac. This is essentially a window for interfacing with gpt-oss-20b. You can find the app at ollama.com/download.

Next, open Terminal on your Mac and enter the following commands:

ollama pull gpt-oss:20b
ollama run gpt-oss:20b

This prompts your Mac to download gpt-oss-20b, which uses approximately 15GB of disk space.
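Once the download finishes, `ollama list` in Terminal confirms the model is installed. You can also check programmatically: Ollama serves a local HTTP API on port 11434, and its documented `GET /api/tags` endpoint lists installed models. A sketch (the parsing helper is my own naming):

```python
import json
import urllib.request


def installed_models(tags_json: dict) -> list:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in tags_json.get("models", [])]


def list_local_models(host: str = "http://localhost:11434") -> list:
    """Ask the local Ollama server which models are installed."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return installed_models(json.load(resp))


# Example (with the Ollama app running):
#   "gpt-oss:20b" in list_local_models()  # True once the pull completes
```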

Finally, launch Ollama and select gpt-oss-20b as the model. You can also put Ollama in airplane mode from the app's settings panel to make sure everything happens locally. There's no need to sign in.

To test gpt-oss-20b, simply enter a prompt in the text field and watch the model work. Again, hardware resources determine the model's performance here. Ollama uses as many resources as it can while running the model, so your Mac may slow to a crawl while the model thinks.
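You aren't limited to the chat window, either: the same local server answers Ollama's documented `POST /api/generate` endpoint. A minimal sketch, assuming Ollama is running on its default port (with `"stream": false`, the server returns a single JSON object whose `response` field holds the reply):

```python
import json
import urllib.request


def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "gpt-oss:20b",
             host: str = "http://localhost:11434") -> str:
    """Send a prompt to the local model and return its reply text."""
    body = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]


# Example (with Ollama running; can take minutes on a 16GB machine):
#   print(generate("Hello"))
```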

My best Mac is a 15-inch M4 MacBook Air with 16GB of RAM. The model works, but it's a tall order for my machine. It took over 5 minutes to respond to "Hello." Responding to "Who was the 13th president?" took a little longer, about 43 minutes. If you plan to experiment for more than a few minutes, you really want more RAM.

Want to delete the local model and reclaim that disk space? Enter this Terminal command:

ollama rm gpt-oss:20b

For more information on using Ollama with gpt-oss-20b on a Mac, see Ollama's official documentation. Alternatively, you can use LM Studio, another Mac app for working with AI models.
