Psychonaut

Local LLM models

3 posts in this topic

Has anyone played around with local LLM models?

I need a local model for use in medical products. I have played around with some 7B models and have had some good results, but my hardware is not strong enough to run 70B models which I would really like to do.
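For reference, here's a rough way to estimate whether a model's weights fit in VRAM (weights only; the KV cache and activations add overhead on top). The numbers are standard back-of-envelope figures, not measurements from any specific model:

```python
# Approximate VRAM needed just for the weights at common precisions.
def weight_memory_gib(n_params_billions: float, bytes_per_param: float) -> float:
    """Weight memory in GiB for a model of the given parameter count."""
    return n_params_billions * 1e9 * bytes_per_param / (1024 ** 3)

for size in (7, 70):
    for name, bpp in (("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)):
        print(f"{size}B @ {name}: ~{weight_memory_gib(size, bpp):.0f} GiB")
```

So a 70B model needs roughly 33 GiB even at 4-bit quantization, which is why it won't fit on a single consumer GPU, while a 7B model at 4-bit fits in under 4 GiB.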

API-hosted models from American companies are not trustworthy for this, as the provider can cut off access at any time.


Posted (edited)

Medical? You need to be careful. I hope you already know the drill about safety: hallucinations, fabricated information that sounds plausible, etc. If even a model like GPT-4 hallucinates, smaller models are hopeless from a safety standpoint.

You can use Kaggle's free GPUs in addition to Google Colab, for example.

Beyond that, you can fine-tune your models on a powerful GPU (by paying for some form of cloud compute) and deploy them as smaller fine-tuned models; that's the only real way around the quality-vs-performance trade-off.

Also, there are lots of deployment options that let you chain model calls together so the output is (in theory) better:

https://langchain.readthedocs.io/en/latest/

Also, perhaps for your needs you just need a really good local search tool that uses LLM tech to return information from a trusted body of work. Search for that. The simplest approach is a vector database.
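The core idea behind vector search can be shown with a toy example. This sketch uses bag-of-words vectors and cosine similarity purely for illustration; a real system would use a proper sentence-embedding model and a vector database, and the medical sentences here are made up:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A tiny "trusted body of work" (illustrative sentences only).
corpus = [
    "Aspirin is a nonsteroidal anti-inflammatory drug.",
    "Insulin regulates blood glucose levels.",
]
vectors = [(doc, embed(doc)) for doc in corpus]

def search(query: str, k: int = 1) -> list[str]:
    """Return the k corpus documents most similar to the query."""
    q = embed(query)
    ranked = sorted(vectors, key=lambda dv: cosine(q, dv[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(search("what regulates glucose levels")[0])
```

The query is matched against the trusted documents, and the best-matching passage is what you show the user (or feed to the LLM), rather than letting the model generate facts on its own.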

I'd side with tech that returns text from a trusted body of work, using the LLM only to interpret the user's input, rather than relying on it to generate answers.
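In practice that means assembling a prompt where the model is told to answer only from the retrieved passages. A minimal sketch of that assembly step (the instruction wording and passage text are illustrative, not a recommended production prompt):

```python
def build_prompt(passages: list[str], question: str) -> str:
    """Assemble a grounded prompt: the LLM interprets the question but is
    instructed to answer only from the retrieved, trusted passages."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using ONLY the numbered passages below. "
        "If the answer is not present, say you don't know.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(
    ["Insulin regulates blood glucose levels."],
    "What regulates blood glucose?",
)
print(prompt)
```

Every claim the model can make is then traceable to a numbered source passage, which matters a lot more in a medical product than raw generation quality.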

The entire English Wikipedia raw text is only about 14 GB, for example.

Edited by Lucasxp64

