TheBloke's LLM work is generously supported by a grant from andreessen horowitz (a16z).

This repo contains AWQ model files for Devon M's Mistral Pygmalion 7B.

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with equivalent or better quality compared to the most commonly used GPTQ settings.

It is supported by:

- Text Generation Webui - using Loader: AutoAWQ
- Hugging Face Text Generation Inference (TGI)

Repositories available:

- GPTQ models for GPU inference, with multiple quantisation parameter options.
- 2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference.
- Devon M's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions.

Prompt template: Instruction-Assistant-Hashes

Licensing: The creator of the source model has listed its license as cc-by-nc-nd-4.0, and this quantization has therefore used that same license. As this model is based on Llama 2, it is also subject to the Meta Llama 2 license terms, and the license files for that are additionally included. I contacted Hugging Face for clarification on dual licensing, but they do not yet have an official position. It should therefore be considered as being claimed to be licensed under both licenses.
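As a rough illustration of the Instruction-Assistant-Hashes prompt template named above, the sketch below wraps a user message in `### Instruction:` / `### Assistant:` markers. This is a hypothetical helper, and the exact whitespace and newline placement are assumptions inferred from the template name, not confirmed by this card:

```python
def format_prompt(user_message: str) -> str:
    """Minimal sketch: wrap a user message in the assumed
    '### Instruction:' / '### Assistant:' hash-marker layout."""
    return (
        "### Instruction:\n"
        f"{user_message}\n"
        "\n"
        "### Assistant:\n"
    )

if __name__ == "__main__":
    # The formatted string would be passed as the prompt to the loaded AWQ model.
    print(format_prompt("Write a haiku about quantization."))
```

Check the model card's prompt template section on Hugging Face for the authoritative layout before relying on this spacing.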