metadata
base_model:
  - huihui-ai/Llama-3.1-Tulu-3-70B-abliterated
  - migtissera/Tess-3-Llama-3.1-70B
  - Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
  - huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
  - mlabonne/Hermes-3-Llama-3.1-70B-lorablated
  - nbeerbower/Llama-3.1-Nemotron-lorablated-70B
library_name: transformers
tags:
  - mergekit
  - merge

about

Original name: Llama_3.x_70b_Dolnemhertulimtess_v1.0

Also known as: Llama_3.x_70b_Dolmen_v1.0 (v1.1 coming soon)

This model is essentially a Llama 3.1 "smart brick" built on a 3.0->3.3 "port", meant to be used in second-level merges.

I might abandon the three-stage "smart merges" (like Smarteaz) because, once I add more models on top of them, they dilute the source models merged with the Model Stock technique too much. Even if the benchmarks and perplexity (PPL) are good, and the prose as well, the result gets diluted even further in the level 4/5 merges I do afterwards.
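A rough back-of-the-envelope sketch of the dilution concern above, under a strong simplifying assumption: every merge level is treated as a plain uniform average of its inputs. Model Stock is not a uniform average (it also interpolates toward the base), so this is only an order-of-magnitude illustration. The function name and the example level sizes are hypothetical.

```python
def diluted_share(models_per_level):
    """Approximate weight share of one source model after stacked merges.

    ASSUMPTION: each level is a uniform average over `k` models, so a
    model's share is divided by `k` at every level it passes through.
    """
    share = 1.0
    for k in models_per_level:
        if k < 1:
            raise ValueError("each merge level needs at least one model")
        share /= k
    return share


# Hypothetical example: a 5-model merge, reused in a 4-model merge,
# reused again in another 4-model merge.
print(f"{diluted_share([5, 4, 4]):.4f}")
```

Under these assumptions, a source model from the first level ends up with roughly 1.25% of the final weights, which is the kind of dilution that motivates using fewer merge stages.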

So this time, for the base, I used a merge of Llama 3.0 Dolphin 2.9.1 and Llama 3.3 Instruct abliterated, in order to get the capabilities of both models, notably Dolphin, which CognitiveComputations has not ported to Llama 3.1 or 3.3 70B.

Then I added the best instruction-oriented finetunes I know, simple as that.

The model is highly uncensored, quite intelligent, and can be used as a standalone model.


merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the Model Stock merge method, with Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02 as the base.
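For readers unfamiliar with Model Stock, here is a minimal sketch of the per-layer interpolation as I understand it from the Model Stock paper: the fine-tuned weights are averaged, then pulled back toward the base by a ratio `t` derived from the angle between the fine-tuned deltas. This is an illustration on plain Python lists, not mergekit's actual code; the function names are hypothetical.

```python
import math
from itertools import combinations


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def model_stock_layer(base, finetuned):
    """Merge one flattened layer; returns (merged_weights, t).

    base:      list of floats (base model weights for this layer)
    finetuned: list of k >= 2 weight lists with the same length as base
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("need at least two fine-tuned models")

    # Task vectors: how far each fine-tuned model moved away from the base.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Average pairwise cosine between task vectors.
    cosines = []
    for d1, d2 in combinations(deltas, 2):
        norm = math.sqrt(_dot(d1, d1)) * math.sqrt(_dot(d2, d2))
        cosines.append(_dot(d1, d2) / norm if norm > 0 else 0.0)
    cos_theta = sum(cosines) / len(cosines)

    # Interpolation ratio: t = k*cos / (1 + (k-1)*cos).
    # No guard here for the degenerate case where the denominator is zero.
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    # Uniform average of the fine-tuned weights, then interpolate to base.
    avg = [sum(col) / k for col in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t


# Two fine-tunes that moved in the same direction: t is 1, the average is kept.
print(model_stock_layer([0.0, 0.0], [[1.0, 0.0], [2.0, 0.0]]))
# Two fine-tunes that moved in orthogonal directions: t is 0, the base is kept.
print(model_stock_layer([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

The two calls show the extremes: when the fine-tunes agree on a direction the merge trusts their average, and when they disagree it falls back to the base, which is why the choice of base matters so much here.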

Models Merged

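The following models were included in the merge:

- nbeerbower/Llama-3.1-Nemotron-lorablated-70B
- mlabonne/Hermes-3-Llama-3.1-70B-lorablated
- huihui-ai/Llama-3.1-Tulu-3-70B-abliterated
- huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
- migtissera/Tess-3-Llama-3.1-70B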

Configuration

The following YAML configuration was used to produce this model:

merge_method: model_stock
models:
  - model: Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
    parameters:
      weight: 1.0
  - model: nbeerbower/Llama-3.1-Nemotron-lorablated-70B
    parameters:
      weight: 1.0
  - model: mlabonne/Hermes-3-Llama-3.1-70B-lorablated
    parameters:
      weight: 1.0
  - model: huihui-ai/Llama-3.1-Tulu-3-70B-abliterated
    parameters:
      weight: 1.0
  - model: huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
    parameters:
      weight: 1.0
  - model: migtissera/Tess-3-Llama-3.1-70B
    parameters:
      weight: 1.0
base_model: Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
  filter_wise: false
  smooth: false
  allow_negative_weights: false
chat_template: auto
tokenizer:
  source: union