🖼️ Available 32 models from 1 repository

meta-llama-3.1-8b-instruct
The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in/text out). The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many available open-source and closed chat models on common industry benchmarks. Model developer: Meta. Model architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

Repository: localai · License: llama3.1

meta-llama-3.1-8b-instruct:grammar-functioncall
This is the standard Llama 3.1 8B Instruct model with grammar-based function calling enabled. When grammars are enabled in LocalAI, the LLM is forced to output valid tool calls constrained by BNF grammars. This is useful for ensuring that model outputs are valid and can be used in a production environment. For more information on how to use grammars in LocalAI, see https://localai.io/features/openai-functions/#advanced and https://localai.io/features/constrained_grammars/.
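The tool-calling workflow described above can be sketched as an OpenAI-style request against LocalAI's `/v1/chat/completions` endpoint. The endpoint URL, the `get_weather` tool, and the message content are illustrative assumptions, not part of the model card:

```python
import json

# Sketch of a tool-call request for LocalAI's OpenAI-compatible API.
# The get_weather tool and the localhost endpoint below are hypothetical;
# adjust both to your deployment.
payload = {
    "model": "meta-llama-3.1-8b-instruct:grammar-functioncall",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",
}

# POST this body to http://localhost:8080/v1/chat/completions (LocalAI's
# default address). With constrained grammars enabled, the tool call returned
# in choices[0].message.tool_calls should parse against the declared schema.
body = json.dumps(payload)
```

The point of the grammar constraint is that the arguments string in the returned tool call is forced to be valid JSON matching the declared parameter schema, rather than free-form text you have to re-validate.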

Repository: localai · License: llama3.1

meta-llama-3.1-8b-instruct:Q8_grammar-functioncall
This is the standard Llama 3.1 8B Instruct model with grammar-based function calling enabled. When grammars are enabled in LocalAI, the LLM is forced to output valid tool calls constrained by BNF grammars. This is useful for ensuring that model outputs are valid and can be used in a production environment. For more information on how to use grammars in LocalAI, see https://localai.io/features/openai-functions/#advanced and https://localai.io/features/constrained_grammars/.

Repository: localai · License: llama3.1

meta-llama-3.1-8b-claude-imat
Meta-Llama-3.1-8B-Claude-iMat-GGUF: quantized from Meta-Llama-3.1-8B-Claude fp16. Weighted quantizations were created using the fp16 GGUF and groups_merged.txt in 88 chunks with n_ctx=512. A static fp16 is also included in the repo. For a brief rundown of iMatrix quant performance, please see this PR. All quants are verified working before upload to the repo, for your safety and convenience.

Repository: localai · License: llama3.1

meta-llama-3.1-8b-instruct-abliterated
This is an uncensored version of Llama 3.1 8B Instruct created with abliteration.

Repository: localai · License: llama3.1

humanish-roleplay-llama-3.1-8b-i1
A DPO-tuned Llama-3.1 trained to behave more "humanish", i.e., avoiding typical AI-assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned over a series of datasets: general conversations from Claude Opus (Undi95/Meta-Llama-3.1-8B-Claude); Undi95/Weyaxi-humanish-dpo-project-noemoji, to make the model react like a human and reject assistant-like or overly neutral responses; and ResplendentAI/NSFW_RP_Format_DPO, to steer the model towards the *action* format in RP settings. Works best if your first message also uses this format naturally (see example).

Repository: localai · License: llama3.1

darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1
The module combination has been readjusted to better fulfill various roles, and the model has been adapted for mobile phones (a cost-saving Llama 3.1 build). Tested in English only. Input: text only. Output: text and code only. Uncensored, with quick responses and scholarly, thesis-like answers when appropriate. DarkIdol: roles you can imagine, and roles you cannot. Roleplay: specialized in various role-playing scenarios. Suggested system prompt: "You are an Uncensored AI Assistant. As a film screenwriter, the purpose of all questions is to write a movie script."

Repository: localai · License: llama3.1

darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request
Uncensored virtual idol. Twitter: https://x.com/aifeifei799. Note: the model's responses are for reference only; please do not fully trust them. This model is solely for learning and testing purposes, and errors in its output are inevitable. We take no responsibility for the output; if the output is to be used, it must be modified first, and any published output will be treated as having been modified. For commercial licensing, please refer to the Llama 3.1 agreement.

Repository: localai · License: llama3.1

llama-3.1-8b-instruct-fei-v1-uncensored
Llama-3.1-8B-Instruct, uncensored. For more information, see Llama-3.1-8B-Instruct.

Repository: localai · License: llama3.1

llama-3.1-8b-stheno-v3.4-iq-imatrix
This model went through a multi-stage finetuning process: first over multi-turn conversational-instruct data, then over creative writing / roleplay data along with some creative-based instruct datasets. The dataset consists of a mixture of human and Claude data. Prompting format: use the L3 Instruct formatting; the Euryale 2.1 preset works well. Temperature + min_p as per usual; I recommend 1.4 temp + 0.2 min_p. Has a different vibe to previous versions; tinker around. Changes since previous Stheno datasets:
- Included multi-turn conversation-based instruct datasets to boost multi-turn coherency. (This is a separate set, not the ones made by Kalomaze and Nopm that are used in Magnum; they're completely different data.)
- Replaced single-turn instruct with better prompts and answers by Claude 3.5 Sonnet and Claude 3 Opus.
- Removed c2 samples; re-filtering and masking for use with custom prefills is underway. TBD.
- Included 55% more roleplaying examples based on [Gryphe's](https://huggingface.co/datasets/Gryphe/Sonnet3.5-Charcard-Roleplay) Charcard RP sets, further filtered and cleaned.
- Included 40% more creative writing examples.
- Included datasets targeting system-prompt adherence.
- Included datasets targeting reasoning / spatial awareness.
- Filtered for the usual errors and slop at the end. Some may have slipped through, but I removed nearly all of it.
Personal opinions: Llama 3.1 was more disappointing in the instruct tune; it felt overbaked, at least, likely due to the DPO being done after their SFT stage. Tuning on the L3.1 base did not give good results, unlike when I tested with the Nemo base. Unfortunate. Still, I think I did an okay job; it does feel a bit more distinctive. It took a lot of tinkering, a LOT, to wrangle this.
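The recommended min_p = 0.2 setting above can be illustrated with a tiny sketch of how min_p sampling prunes candidates: any token whose probability falls below 0.2 times the most likely token's probability is discarded before sampling. The token probabilities here are made up for demonstration:

```python
# Minimal illustration of min_p filtering (a llama.cpp-family sampler
# setting). Probabilities below are invented for the example.
def min_p_filter(probs, min_p=0.2):
    """Keep tokens with probability >= min_p * p(top token)."""
    cutoff = min_p * max(probs.values())
    return {tok: p for tok, p in probs.items() if p >= cutoff}

probs = {"the": 0.50, "a": 0.30, "xyzzy": 0.05}
kept = min_p_filter(probs)  # cutoff = 0.2 * 0.50 = 0.10, so "xyzzy" is dropped
```

Unlike top_p, the cutoff scales with the model's confidence, which is why min_p pairs well with the high temperature (1.4) the card recommends.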

Repository: localai · License: llama3.1

llama-3.1-8b-arliai-rpmax-v1.1
RPMax is a series of models trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive: no two entries in the dataset repeat characters or situations, which ensures the model does not latch on to a single personality and remains capable of understanding and responding appropriately to any character or situation.

Repository: localai · License: llama3.1

llama-3.1-8b-arliai-formax-v1.0-iq-arm-imatrix
Quants for ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0. "Formax is a model that specializes in following response-format instructions. Tell it the format of its response and it will follow it perfectly. Great for data processing and dataset creation tasks." "It is also a highly uncensored model that will follow your instructions very well."

Repository: localai · License: llama3.1

hermes-3-llama-3.1-8b-lorablated
This is an uncensored version of NousResearch/Hermes-3-Llama-3.1-8B created using lorablation. The recipe is simple. Extraction: we extract a LoRA adapter by comparing two models, a censored Llama 3.1 (meta-llama/Meta-Llama-3-8B-Instruct) and an abliterated Llama 3.1 (mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated). Merge: we merge this new LoRA adapter, using task arithmetic, into the censored NousResearch/Hermes-3-Llama-3.1-8B to abliterate it.

Repository: localai · License: llama3.1

theia-llama-3.1-8b-v1
Theia-Llama-3.1-8B-v1 is an open-source large language model (LLM) trained specifically in the cryptocurrency domain. It was fine-tuned from the Llama-3.1-8B base model using a dataset curated from the top 2,000 cryptocurrency projects and comprehensive research reports, to specialize in crypto-related tasks. Theia-Llama-3.1-8B-v1 has been quantized to optimize it for efficient deployment and a reduced memory footprint. It benchmarks strongly on crypto knowledge comprehension and generation, knowledge coverage, and reasoning. The system prompt used for its training is "You are a helpful assistant who will answer crypto related questions." The recommended parameters for performance are a sequence length of 256, temperature of 0, top-k sampling of -1, top-p of 1, and a context window of 39680.
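The recommended parameters above can be collected into an OpenAI-style request sketch. Parameter names follow common llama.cpp/OpenAI conventions and are an assumption; verify them against your serving stack (in many runtimes, top_k = -1 means "disabled"):

```python
# Theia's recommended inference settings, as stated in the card, mapped to
# common request-parameter names (a sketch, not a verified config).
theia_params = {
    "max_tokens": 256,   # "sequence length of 256"
    "temperature": 0,    # greedy decoding
    "top_k": -1,         # top-k sampling disabled
    "top_p": 1,          # nucleus sampling disabled
    # Context window: 39680 tokens. This is a server/loader setting
    # (e.g. n_ctx), not a per-request parameter.
}
```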

Repository: localai · License: llama3.1

llama-3.1-8b-arliai-rpmax-v1.3
RPMax is a series of models trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive: no two entries in the dataset repeat characters or situations, which ensures the model does not latch on to a single personality and remains capable of understanding and responding appropriately to any character or situation. Many RPMax users have mentioned that these models do not feel like other RP models, having a different writing style and generally not feeling "in-bred".

Repository: localai · License: llama3.1

llama-3.1-8b-instruct-ortho-v3
A few different attempts at orthogonalization/abliteration of llama-3.1-8b-instruct using variations of the method from "Mechanistically Eliciting Latent Behaviors in Language Models". Each of these use different vectors and have some variations in where the new refusal boundaries lie. None of them seem totally jailbroken.

Repository: localai · License: llama3.1

skywork-o1-open-llama-3.1-8b
We are excited to announce the release of the Skywork o1 Open model series, developed by the Skywork team at Kunlun Inc. This groundbreaking release introduces a series of models that incorporate o1-like slow thinking and reasoning capabilities. The Skywork o1 Open model series includes three advanced models:
- Skywork o1 Open-Llama-3.1-8B: a robust chat model trained on Llama-3.1-8B, enhanced significantly with "o1-style" data to improve reasoning skills.
- Skywork o1 Open-PRM-Qwen-2.5-1.5B: a specialized model designed to enhance reasoning capability through incremental process rewards, ideal for complex problem solving at a smaller scale.
- Skywork o1 Open-PRM-Qwen-2.5-7B: extends the capabilities of the 1.5B model by scaling up to handle more demanding reasoning tasks, pushing the boundaries of AI reasoning.
Different from mere reproductions of the OpenAI o1 model, the Skywork o1 Open model series not only exhibits innate thinking, planning, and reflecting capabilities in its outputs, but also shows significant improvements in reasoning skills on standard benchmarks. This series represents a strategic advancement in AI capabilities, moving a previously weaker base model towards the state-of-the-art (SOTA) in reasoning tasks.

Repository: localai · License: llama3.1

sparse-llama-3.1-8b-2of4
This is the 2:4 sparse version of Llama-3.1-8B. On the OpenLLM benchmark (version 1), it achieves an average score of 62.16, compared to 63.19 for the dense model—demonstrating a 98.37% accuracy recovery. On the Mosaic Eval Gauntlet benchmark (version v0.3), it achieves an average score of 53.85, versus 55.34 for the dense model—representing a 97.3% accuracy recovery.
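The quoted recovery percentages follow directly from the raw scores, which is easy to check:

```python
# Reproduce the accuracy-recovery figures from the sparse and dense scores.
openllm_recovery = 62.16 / 63.19 * 100    # sparse vs. dense, OpenLLM v1
gauntlet_recovery = 53.85 / 55.34 * 100   # sparse vs. dense, Gauntlet v0.3
# openllm_recovery  ≈ 98.37
# gauntlet_recovery ≈ 97.3
```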

Repository: localai · License: llama3.1

loki-v2.6-8b-1024k
The following models were included in the merge: MrRobotoAI/Epic_Fiction-8b MrRobotoAI/Unaligned-RP-Base-8b-1024k MrRobotoAI/Loki-.Epic_Fiction.-8b Casual-Autopsy/L3-Luna-8B Casual-Autopsy/L3-Super-Nova-RP-8B Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B Casual-Autopsy/Halu-L3-Stheno-BlackOasis-8B Undi95/Llama-3-LewdPlay-8B Undi95/Llama-3-LewdPlay-8B-evo Undi95/Llama-3-Unholy-8B ChaoticNeutrals/Hathor_Tahsin-L3-8B-v0.9 ChaoticNeutrals/Hathor_RP-v.01-L3-8B ChaoticNeutrals/Domain-Fusion-L3-8B ChaoticNeutrals/T-900-8B ChaoticNeutrals/Poppy_Porpoise-1.4-L3-8B ChaoticNeutrals/Templar_v1_8B ChaoticNeutrals/Hathor_Respawn-L3-8B-v0.8 ChaoticNeutrals/Sekhmet_Gimmel-L3.1-8B-v0.3 zeroblu3/LewdPoppy-8B-RP tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b jeiku/Chaos_RP_l3_8B tannedbum/L3-Nymeria-Maid-8B Nekochu/Luminia-8B-RP vicgalle/Humanish-Roleplay-Llama-3.1-8B saishf/SOVLish-Maid-L3-8B Dogge/llama-3-8B-instruct-Bluemoon-Freedom-RP MrRobotoAI/Epic_Fiction-8b-v4 maldv/badger-lambda-0-llama-3-8b maldv/llama-3-fantasy-writer-8b maldv/badger-kappa-llama-3-8b maldv/badger-mu-llama-3-8b maldv/badger-lambda-llama-3-8b maldv/badger-iota-llama-3-8b maldv/badger-writer-llama-3-8b Magpie-Align/MagpieLM-8B-Chat-v0.1 nbeerbower/llama-3-gutenberg-8B nothingiisreal/L3-8B-Stheno-Horny-v3.3-32K nbeerbower/llama-3-spicy-abliterated-stella-8B Magpie-Align/MagpieLM-8B-SFT-v0.1 NeverSleep/Llama-3-Lumimaid-8B-v0.1 mlabonne/NeuralDaredevil-8B-abliterated mlabonne/Daredevil-8B-abliterated NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS nothingiisreal/L3-8B-Instruct-Abliterated-DWP openchat/openchat-3.6-8b-20240522 turboderp/llama3-turbcat-instruct-8b UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3 Undi95/Llama-3-LewdPlay-8B TIGER-Lab/MAmmoTH2-8B-Plus OwenArli/Awanllm-Llama-3-8B-Cumulus-v1.0 refuelai/Llama-3-Refueled SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha NousResearch/Hermes-2-Theta-Llama-3-8B ResplendentAI/Nymph_8B grimjim/Llama-3-Oasis-v1-OAS-8B flammenai/Mahou-1.3b-llama3-8B lemon07r/Llama-3-RedMagic4-8B 
grimjim/Llama-3.1-SuperNova-Lite-lorabilterated-8B grimjim/Llama-Nephilim-Metamorphosis-v2-8B lemon07r/Lllama-3-RedElixir-8B grimjim/Llama-3-Perky-Pat-Instruct-8B ChaoticNeutrals/Hathor_RP-v.01-L3-8B grimjim/llama-3-Nephilim-v2.1-8B ChaoticNeutrals/Hathor_Respawn-L3-8B-v0.8 migtissera/Llama-3-8B-Synthia-v3.5 Locutusque/Llama-3-Hercules-5.0-8B WhiteRabbitNeo/Llama-3-WhiteRabbitNeo-8B-v2.0 VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct iRyanBell/ARC1-II HPAI-BSC/Llama3-Aloe-8B-Alpha HaitameLaf/Llama-3-8B-StoryGenerator failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 Undi95/Llama-3-Unholy-8B ajibawa-2023/Uncensored-Frank-Llama-3-8B ajibawa-2023/SlimOrca-Llama-3-8B ChaoticNeutrals/Templar_v1_8B aifeifei798/llama3-8B-DarkIdol-2.2-Uncensored-1048K ChaoticNeutrals/Hathor_Tahsin-L3-8B-v0.9 Blackroot/Llama-3-Gamma-Twist FPHam/L3-8B-Everything-COT Blackroot/Llama-3-LongStory ChaoticNeutrals/Sekhmet_Gimmel-L3.1-8B-v0.3 abacusai/Llama-3-Smaug-8B Khetterman/CursedMatrix-8B-v9 ajibawa-2023/Scarlett-Llama-3-8B-v1.0 MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/physics_non_masked MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/electrical_engineering MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/college_chemistry MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/philosophy_non_masked MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/college_physics MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/philosophy MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/formal_logic MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/philosophy_100 MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/conceptual_physics MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/college_computer_science MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/psychology_non_masked MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/psychology MrRobotoAI/Unaligned-RP-Base-8b-1024k + Blackroot/Llama3-RP-Lora 
MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-LimaRP-Instruct-LoRA-8B MrRobotoAI/Unaligned-RP-Base-8b-1024k + nothingiisreal/llama3-8B-DWP-lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/world_religions MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/high_school_european_history MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/electrical_engineering MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-8B-Abomination-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-LongStory-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/human_sexuality MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/sociology MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/Theory_of_Mind_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Smarts_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-LongStory-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Nimue-8B MrRobotoAI/Unaligned-RP-Base-8b-1024k + vincentyandex/lora_llama3_chunked_novel_bs128 MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/Aura_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/L3-Daybreak-8b-lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/Luna_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + nicce/story-mixtral-8x7b-lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + Blackroot/Llama-3-LongStory-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/NoWarning_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/BlueMoon_Llama3

Repository: localai · License: llama3.1

fusechat-llama-3.1-8b-instruct
We present FuseChat-3.0, a series of models crafted to enhance performance by integrating the strengths of multiple source LLMs into more compact target LLMs. To achieve this fusion, we utilized four powerful source LLMs: Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct. For the target LLMs, we employed three widely used smaller models, Llama-3.1-8B-Instruct, Gemma-2-9B-It, and Qwen-2.5-7B-Instruct, along with two even more compact models, Llama-3.2-3B-Instruct and Llama-3.2-1B-Instruct. The implicit model fusion process involves a two-stage training pipeline comprising Supervised Fine-Tuning (SFT) to mitigate distribution discrepancies between target and source LLMs, and Direct Preference Optimization (DPO) for learning preferences from multiple source LLMs. The resulting FuseChat-3.0 models demonstrated substantial improvements in tasks related to general conversation, instruction following, mathematics, and coding. Notably, when Llama-3.1-8B-Instruct served as the target LLM, our fusion approach achieved an average improvement of 6.8 points across 14 benchmarks. Moreover, it showed significant improvements of 37.1 and 30.1 points on the instruction-following test sets AlpacaEval-2 and Arena-Hard, respectively. We have released the FuseChat-3.0 models on Hugging Face; stay tuned for the forthcoming dataset and code.

Repository: localai · License: llama3.1

llama-3.1-8b-open-sft
The Llama-3.1-8B-Open-SFT model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct, designed for advanced text generation tasks, including conversational interactions, question answering, and chain-of-thought reasoning. This model leverages Supervised Fine-Tuning (SFT) using the O1-OPEN/OpenO1-SFT dataset to provide enhanced performance in context-sensitive and instruction-following tasks.

Repository: localai · License: llama3.1

selene-1-mini-llama-3.1-8b
Atla Selene Mini is a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini achieves comparable performance to models 10x its size, outperforming GPT-4o on RewardBench, EvalBiasBench, and AutoJ. Post-trained from Llama-3.1-8B across a wide range of evaluation tasks and scoring criteria, Selene Mini outperforms prior small models overall across 11 benchmarks covering three types of tasks: absolute scoring, e.g. "Evaluate the harmlessness of this response on a scale of 1-5"; classification, e.g. "Does this response address the user query? Answer Yes or No."; and pairwise preference, e.g. "Which of the following responses is more logically consistent - A or B?" It is also the #1 8B generative model on RewardBench.
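The three task framings can be written out as judge-prompt templates. The wording below is illustrative only, not Atla's official prompt templates:

```python
# Sketches of the three judge-task framings listed in the card.
# Placeholder fields ({query}, {response}, {a}, {b}) are filled per evaluation.
absolute = ('Evaluate the harmlessness of this response on a scale of 1-5.\n'
            'Response: {response}')
classification = ('Does this response address the user query? Answer Yes or No.\n'
                  'Query: {query}\nResponse: {response}')
pairwise = ('Which of the following responses is more logically consistent '
            '- A or B?\nA: {a}\nB: {b}')

prompt = classification.format(query="What is 2+2?", response="4.")
```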

Repository: localai · License: llama3.1

locutusque_thespis-llama-3.1-8b
The Thespis family of language models is designed to enhance roleplaying performance through reasoning inspired by the Theory of Mind. Thespis-Llama-3.1-8B is a fine-tuned version of an abliterated Llama-3.1-8B model, optimized using Group Relative Policy Optimization (GRPO). The model is specifically rewarded for minimizing "slop" and repetition in its outputs, aiming to produce coherent and engaging text that maintains character consistency and avoids low-quality responses. This version represents an initial release; future iterations will incorporate a more rigorous fine-tuning process.

Repository: localai · License: llama3.1

llama-3.1-8b-instruct-uncensored-delmat-i1
Decensored using a custom training script guided by activations, similar to ablation/"abliteration" scripts but not exactly the same approach. I've found this effect to be stronger than most abliteration scripts, so please use it responsibly. The training script is released under the MIT license: https://github.com/nkpz/DeLMAT

Repository: localai · License: llama3.1

lolzinventor_meta-llama-3.1-8b-survivev3
Primary intended uses:
- Providing survival tips and information
- Answering questions related to outdoor skills and wilderness survival
- Offering guidance on shelter building
Out-of-scope uses:
- Medical advice or emergency response (users should always seek professional help in emergencies)
- Legal advice related to wilderness regulations or land use

Repository: localai · License: llama3.1

llmevollama-3.1-8b-v0.1-i1
This project aims to optimize model merging by integrating LLMs into evolutionary strategies in a novel way. Instead of using the CMA-ES approach, the goal is to improve model optimization by leveraging the search capabilities of LLMs to explore the parameter space more efficiently and adjust the search scope based on high-performing solutions. Currently, the project supports optimization only within the Parameter Space, but I plan to extend its functionality to enable merging and optimization in the Data Flow Space as well. This will further enhance model merging by optimizing the interaction between data flow and parameters.

Repository: localai · License: llama3.1

jdineen_llama-3.1-8b-think
This model is a fine-tuned version of Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2 on the jdineen/grpo-with-thinking-500-tagged dataset. It has been trained using TRL.

Repository: localai · License: llama3.1

nvidia_llama-3.1-8b-ultralong-1m-instruct
We introduce UltraLong-8B, a series of ultra-long-context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.

Repository: localai · License: llama3.1

nvidia_llama-3.1-8b-ultralong-4m-instruct
We introduce UltraLong-8B, a series of ultra-long-context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.

Repository: localai · License: llama3.1

hermes-3-llama-3.1-8b
Hermes 3 is a generalist language model developed by Nous Research. It is an advanced agentic model with improved roleplaying, reasoning, multi-turn conversation, long context coherence, and generalist assistant capabilities. The model is built on top of the Llama-3 architecture and has been fine-tuned to achieve superior performance in various tasks. It is designed to be a powerful and reliable tool for solving complex problems and assisting users in achieving their goals. Hermes 3 can be used for a wide range of applications, including research, education, and personal assistant tasks. It is available on the Hugging Face model hub for easy access and integration into existing workflows.

Repository: localai · License: apache-2.0

hermes-3-llama-3.1-8b:Q8
Hermes 3 is a generalist language model developed by Nous Research. It is an advanced agentic model with improved roleplaying, reasoning, multi-turn conversation, long context coherence, and generalist assistant capabilities. The model is built on top of the Llama-3 architecture and has been fine-tuned to achieve superior performance in various tasks. It is designed to be a powerful and reliable tool for solving complex problems and assisting users in achieving their goals. Hermes 3 can be used for a wide range of applications, including research, education, and personal assistant tasks. It is available on the Hugging Face model hub for easy access and integration into existing workflows.

Repository: localai · License: apache-2.0

hermes-3-llama-3.1-8b:vllm
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board. It is designed to focus on aligning LLMs to the user, with powerful steering capabilities and control given to the end user. The model uses ChatML as the prompt format, opening up a much more structured system for engaging the LLM in multi-turn chat dialogue. It also supports function calling and structured output capabilities, generalist assistant capabilities, and improved code generation skills.
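Since the card states that Hermes 3 uses ChatML as its prompt format, a minimal renderer for ChatML turns looks like this (the system/user content is illustrative):

```python
# Render a list of {role, content} messages into a ChatML prompt string,
# the format Hermes 3 expects per its model card.
def to_chatml(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model's reply
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are Hermes 3, a helpful assistant."},
    {"role": "user", "content": "Name one improvement over Hermes 2."},
])
```

In practice most serving stacks (including OpenAI-compatible endpoints) apply this template for you from the model's chat template, so manual rendering is only needed for raw completion APIs.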

Repository: localai · License: llama-3