Add fine-tuned DistilBERT model and card
This view is limited to 50 files because it contains too many changes.
- README.md +128 -0
- all_results.json +16 -0
- checkpoint-1010/config.json +73 -0
- checkpoint-1010/model.safetensors +3 -0
- checkpoint-1010/optimizer.pt +3 -0
- checkpoint-1010/rng_state.pth +3 -0
- checkpoint-1010/scaler.pt +3 -0
- checkpoint-1010/scheduler.pt +3 -0
- checkpoint-1010/special_tokens_map.json +7 -0
- checkpoint-1010/tokenizer.json +0 -0
- checkpoint-1010/tokenizer_config.json +56 -0
- checkpoint-1010/trainer_state.json +224 -0
- checkpoint-1010/training_args.bin +3 -0
- checkpoint-1010/vocab.txt +0 -0
- checkpoint-808/config.json +73 -0
- checkpoint-808/model.safetensors +3 -0
- checkpoint-808/optimizer.pt +3 -0
- checkpoint-808/rng_state.pth +3 -0
- checkpoint-808/scaler.pt +3 -0
- checkpoint-808/scheduler.pt +3 -0
- checkpoint-808/special_tokens_map.json +7 -0
- checkpoint-808/tokenizer.json +0 -0
- checkpoint-808/tokenizer_config.json +56 -0
- checkpoint-808/trainer_state.json +186 -0
- checkpoint-808/training_args.bin +3 -0
- checkpoint-808/vocab.txt +0 -0
- config.json +73 -0
- model.safetensors +3 -0
- runs/Apr15_15-12-43_aigodmode/events.out.tfevents.1744755163.aigodmode.996209.0 +3 -0
- runs/Apr15_15-13-03_aigodmode/events.out.tfevents.1744755184.aigodmode.996626.0 +3 -0
- runs/Apr15_15-13-03_aigodmode/events.out.tfevents.1744755187.aigodmode.996626.1 +3 -0
- runs/Apr15_17-19-20_aigodmode/events.out.tfevents.1744762761.aigodmode.1115972.0 +3 -0
- runs/Apr15_17-19-20_aigodmode/events.out.tfevents.1744762765.aigodmode.1115972.1 +3 -0
- runs/Apr15_18-20-00_aigodmode/events.out.tfevents.1744766401.aigodmode.1175888.0 +3 -0
- runs/Apr15_18-20-00_aigodmode/events.out.tfevents.1744766405.aigodmode.1175888.1 +3 -0
- runs/Apr15_18-41-07_aigodmode/events.out.tfevents.1744767667.aigodmode.1200966.0 +3 -0
- runs/Apr15_18-41-07_aigodmode/events.out.tfevents.1744767675.aigodmode.1200966.1 +3 -0
- runs/Apr15_19-39-18_aigodmode/events.out.tfevents.1744771158.aigodmode.1261440.0 +3 -0
- runs/Apr15_19-39-18_aigodmode/events.out.tfevents.1744771170.aigodmode.1261440.1 +3 -0
- runs/Apr15_19-41-09_aigodmode/events.out.tfevents.1744771270.aigodmode.1263282.0 +3 -0
- runs/Apr15_19-41-09_aigodmode/events.out.tfevents.1744771274.aigodmode.1263282.1 +3 -0
- runs/Apr15_20-08-54_aigodmode/events.out.tfevents.1744772934.aigodmode.1292055.0 +3 -0
- runs/Apr15_20-08-54_aigodmode/events.out.tfevents.1744772939.aigodmode.1292055.1 +3 -0
- runs/Apr15_22-03-10_aigodmode/events.out.tfevents.1744779790.aigodmode.1411307.0 +3 -0
- runs/Apr15_22-03-10_aigodmode/events.out.tfevents.1744779797.aigodmode.1411307.1 +3 -0
- runs/Apr15_23-20-35_aigodmode/events.out.tfevents.1744784435.aigodmode.1497045.0 +3 -0
- runs/Apr15_23-20-35_aigodmode/events.out.tfevents.1744784442.aigodmode.1497045.1 +3 -0
- runs/Apr15_23-21-41_aigodmode/events.out.tfevents.1744784502.aigodmode.1498147.0 +3 -0
- runs/Apr15_23-21-41_aigodmode/events.out.tfevents.1744784513.aigodmode.1498147.1 +3 -0
- runs/Apr15_23-22-43_aigodmode/events.out.tfevents.1744784564.aigodmode.1499184.0 +3 -0
README.md
ADDED
@@ -0,0 +1,128 @@
---
language: en
license: apache-2.0
library_name: transformers
tags:
- distilbert
- text-classification
- token-classification
- intent-classification
- slot-filling
- joint-intent-slot
- smart-home
- generated:Enfuse.io
pipeline_tag: token-classification # Or text-classification, often token is used for joint models
model-index:
- name: distilbert-joint-intent-slot-smarthome
  results:
  - task:
      type: token-classification # Represents the slot filling part
      name: Slot Filling
    dataset:
      name: enfuse/joint-intent-slot-smarthome
      type: enfuse/joint-intent-slot-smarthome
      config: default
      split: test
    metrics:
    - type: micro_f1
      value: 0.8800
      name: Slot F1 (Micro)
    - type: precision
      value: 0.8800
      name: Slot Precision (Micro)
    - type: recall
      value: 0.8800
      name: Slot Recall (Micro)
  - task:
      type: text-classification # Represents the intent part
      name: Intent Classification
    dataset:
      name: enfuse/joint-intent-slot-smarthome
      type: enfuse/joint-intent-slot-smarthome
      config: default
      split: test
    metrics:
    - type: accuracy
      value: 0.9208
      name: Intent Accuracy
---

# DistilBERT for Smart Home Joint Intent Classification and Slot Filling

## Model Description

**Produced By:** [Enfuse.io](https://enfuse.io/)

This model is a fine-tuned version of `distilbert-base-uncased` specifically adapted for **joint intent classification and slot filling** in the **smart home domain**. Given a user command related to controlling smart home devices (like lights or thermostats), the model simultaneously predicts:

1. The user's **intent** (e.g., `set_device_state`, `get_device_state`).
2. The relevant **slots** (entities like `device_name`, `location`, `state`, `attribute_value`) within the command, using BIO tagging.
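
For illustration, a hypothetical command annotated in this style is shown below; the utterance, tokens, and tags are invented for this example and are not taken from the dataset.

```python
# Hypothetical example (not from the dataset) showing the two joint prediction targets.
example = {
    "text": "turn off the kitchen lights",
    "intent": "set_device_state",                    # sequence-level intent label
    "tokens": ["turn", "off", "the", "kitchen", "lights"],
    "slot_tags": ["O", "B-state", "O", "B-location", "B-device_name"],  # word-level BIO tags
}
```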

## Intended Use and Limitations

**Primary Intended Use:** This model is intended for **hobbyist experimentation and educational purposes** related to Natural Language Understanding (NLU) for smart home applications. It can be used as a baseline or starting point for understanding how to build NLU components for simple device control.

**Disclaimer:** **This model is NOT intended for use in production environments.** Enfuse.io takes **no responsibility** for the performance, reliability, security, or any consequences arising from the use of this model in production systems or safety-critical applications. Use in such contexts is entirely at the user's own risk.

**Out-of-Scope Use:**
* The model is not designed for general conversation or tasks outside the specific smart home intents and slots it was trained on.
* It has **no built-in mechanism for handling out-of-domain requests** (e.g., asking about weather, playing music). It will likely attempt to classify such requests into one of the known smart home intents, potentially leading to incorrect behavior.
* It has not been evaluated for fairness, bias, or robustness against adversarial inputs.

## Training Data

The model was fine-tuned on the `enfuse/joint-intent-slot-smarthome` dataset, specifically the `generated_smarthome_2016_unique.jsonl` version containing 2016 unique synthetic examples.

This dataset was generated by Enfuse.io using a combination of `mistralai/Mistral-7B-Instruct-v0.1` and `openai/gpt-4o`, followed by validation and de-duplication. Please refer to the [dataset card](https://huggingface.co/datasets/enfuse/joint-intent-slot-smarthome) for more details on the data generation process and limitations.

## Training Procedure

### Preprocessing

The text was tokenized using the `distilbert-base-uncased` tokenizer. Slot labels were converted to a BIO tagging scheme. Input sequences were padded and truncated to a maximum length of 128 tokens.
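
The exact preprocessing script is not part of this commit, so the snippet below is only a sketch of the usual sub-token alignment step: word-level BIO tags are attached to the first sub-token of each word, and special/padding tokens receive the ignore index `-100`. Whether the original run used this first-sub-token convention or labeled every sub-token is an assumption.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Hypothetical word-level example; slot_label2id is a subset of the map in config.json.
words = ["turn", "off", "the", "kitchen", "lights"]
word_tags = ["O", "B-state", "O", "B-location", "B-device_name"]
slot_label2id = {"O": 0, "B-device_name": 5, "B-location": 7, "B-state": 9}

enc = tokenizer(words, is_split_into_words=True,
                truncation=True, max_length=128, padding="max_length")

labels, prev = [], None
for word_idx in enc.word_ids():
    if word_idx is None:
        labels.append(-100)                                # [CLS]/[SEP]/[PAD]: ignored by the loss
    elif word_idx != prev:
        labels.append(slot_label2id[word_tags[word_idx]])  # first sub-token carries the word's tag
    else:
        labels.append(-100)                                # continuation sub-tokens ignored here
    prev = word_idx
enc["labels"] = labels
```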

### Fine-tuning

The model was fine-tuned using the Hugging Face `transformers` library `Trainer` on a single NVIDIA RTX 5090.

* **Epochs:** 10
* **Batch Size:** 16 (per device)
* **Learning Rate:** 5e-5 (with linear decay)
* **Optimizer:** AdamW
* **Precision:** FP16
* **Dataset Split:** 80% train (1612), 10% validation (202), 10% test (202)
* **Best Model Selection:** The checkpoint with the highest `eval_intent_accuracy` on the validation set was selected as the final model (epoch 8, step 808, in this 10-epoch run; see `trainer_state.json`).
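
The training script itself is not included in this commit; the following `TrainingArguments` sketch only mirrors the settings listed above. The output directory and logging interval match `trainer_state.json`, while the evaluation batch size and any unlisted arguments are unknown and omitted.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the listed hyperparameters; treat this as a sketch,
# not the exact arguments used to produce this checkpoint.
training_args = TrainingArguments(
    output_dir="./results_distilbert_custom",     # matches best_model_checkpoint in trainer_state.json
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_intent_accuracy",
    logging_steps=100,                            # matches trainer_state.json
)
```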

## Evaluation Results

The following results were achieved on the **test set** (202 examples) using the best checkpoint saved during training:

* **Intent Accuracy:** 92.08%
* **Slot F1 Score (Micro):** 88.00%
* **Slot Precision (Micro):** 88.00%
* **Slot Recall (Micro):** 88.00%

*(Note: These results are specific to this particular training setup and may vary with different hyperparameters or training runs.)*

## How to Use

A hedged usage sketch follows; the full inference logic (including mapping slot predictions back to words) follows the same approach as `infer.py` from the training code.
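
`config.json` names a custom architecture, `DistilBertForJointIntentSlotFilling`, which is not a class in the `transformers` library, so `pipeline()` or `AutoModel` will not reproduce the joint heads out of the box; the class definition from the training code has to be importable. In the sketch below, the module name `modeling_joint`, the repo id, and the output attribute names `intent_logits`/`slot_logits` are all assumptions.

```python
import torch
from transformers import AutoConfig, AutoTokenizer

# DistilBertForJointIntentSlotFilling is the custom class named in config.json;
# it must come from the training code (hypothetical module name below).
from modeling_joint import DistilBertForJointIntentSlotFilling

model_id = "enfuse/distilbert-joint-intent-slot-smarthome"   # assumed repo id (or a local path)
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)
model = DistilBertForJointIntentSlotFilling.from_pretrained(model_id).eval()

text = "turn off the kitchen lights"                         # hypothetical command
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

intent_id = outputs.intent_logits.argmax(-1).item()          # assumed output attribute
print("intent:", config.id2intent_label[str(intent_id)])

slot_ids = outputs.slot_logits.argmax(-1)[0].tolist()        # assumed output attribute
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, slot_id in zip(tokens, slot_ids):
    if token not in tokenizer.all_special_tokens:
        print(token, config.id2slot_label[str(slot_id)])
```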

## Model Card Contact

[Enfuse.io](https://enfuse.io/)

## Citation

If you use this model, please cite the dataset:

```bibtex
@misc{enfuse_smarthome_intent_slot_2024,
  author       = {Enfuse.io},
  title        = {Enfuse Smart Home Joint Intent and Slot Filling Dataset},
  year         = {2024},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/datasets/enfuse/joint-intent-slot-smarthome}}
}
```
all_results.json
ADDED
@@ -0,0 +1,16 @@
{
  "epoch": 10.0,
  "eval_intent_accuracy": 0.9207920792079208,
  "eval_loss": 0.6127998232841492,
  "eval_runtime": 0.0518,
  "eval_samples_per_second": 3900.241,
  "eval_slot_f1": 0.880019120458891,
  "eval_slot_precision": 0.880019120458891,
  "eval_slot_recall": 0.880019120458891,
  "eval_steps_per_second": 77.233,
  "total_flos": 526646237245440.0,
  "train_loss": 0.5116730279261523,
  "train_runtime": 22.7338,
  "train_samples_per_second": 709.076,
  "train_steps_per_second": 44.427
}
checkpoint-1010/config.json
ADDED
@@ -0,0 +1,73 @@
{
  "activation": "gelu",
  "architectures": [
    "DistilBertForJointIntentSlotFilling"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "id2intent_label": {
    "0": "get_device_state",
    "1": "set_device_attribute",
    "2": "set_device_state"
  },
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "id2slot_label": {
    "0": "O",
    "1": "B-attribute_type",
    "2": "I-attribute_type",
    "3": "B-attribute_value",
    "4": "I-attribute_value",
    "5": "B-device_name",
    "6": "I-device_name",
    "7": "B-location",
    "8": "I-location",
    "9": "B-state",
    "10": "I-state"
  },
  "initializer_range": 0.02,
  "intent_label2id": {
    "get_device_state": 0,
    "set_device_attribute": 1,
    "set_device_state": 2
  },
  "intent_loss_coef": 1.0,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "num_intent_labels": 3,
  "num_slot_labels": 11,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "slot_label2id": {
    "B-attribute_type": 1,
    "B-attribute_value": 3,
    "B-device_name": 5,
    "B-location": 7,
    "B-state": 9,
    "I-attribute_type": 2,
    "I-attribute_value": 4,
    "I-device_name": 6,
    "I-location": 8,
    "I-state": 10,
    "O": 0
  },
  "slot_loss_coef": 1.0,
  "tie_weights_": true,
  "torch_dtype": "float32",
  "transformers_version": "4.51.3",
  "vocab_size": 30522
}
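
The configuration above carries the task-specific pieces of this model as extra keys: the intent and slot label maps (`id2intent_label`, `id2slot_label` and their inverses) plus the loss coefficients. A small sketch of inspecting them without loading any weights is shown below; the repo id is an assumption, and a local path to this checkpoint works the same way.

```python
from transformers import AutoConfig

# Extra keys in config.json surface as plain attributes on the loaded config object.
config = AutoConfig.from_pretrained("enfuse/distilbert-joint-intent-slot-smarthome")  # assumed id

print(config.num_intent_labels)   # 3
print(config.id2intent_label)     # {"0": "get_device_state", ...}
print(config.num_slot_labels)     # 11
print(config.id2slot_label)       # {"0": "O", "1": "B-attribute_type", ...}
print(config.intent_loss_coef, config.slot_loss_coef)
```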
checkpoint-1010/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:93648248aa5f5b2bba475dd45e39523e87d8d476b2b0963e788330c821658952
size 265507144
checkpoint-1010/optimizer.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:21b31a3bf879b8944c1933858eed98b7d0be7bacac83c83f16e99621ad542aa8
size 531076747
checkpoint-1010/rng_state.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2d82205c82ca0e59cbd17794777700714eabf1c99241e9d6f60a8374a27c5982
size 14645
checkpoint-1010/scaler.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff4ab2e42d3da0b44d79c51f9ee188d20a00cf98ef55caa853236c82352c6032
size 1383
checkpoint-1010/scheduler.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c670f9ce3c4ad30a56141825dd15a2fa3016bb890950ac61a4a9a65bdbf58199
size 1465
checkpoint-1010/special_tokens_map.json
ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
checkpoint-1010/tokenizer.json
ADDED
The diff for this file is too large to render.
checkpoint-1010/tokenizer_config.json
ADDED
@@ -0,0 +1,56 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "DistilBertTokenizer",
  "unk_token": "[UNK]"
}
checkpoint-1010/trainer_state.json
ADDED
@@ -0,0 +1,224 @@
{
  "best_global_step": 808,
  "best_metric": 0.9108910891089109,
  "best_model_checkpoint": "./results_distilbert_custom/checkpoint-808",
  "epoch": 10.0,
  "eval_steps": 500,
  "global_step": 1010,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.9900990099009901,
      "grad_norm": 2.6412200927734375,
      "learning_rate": 4.51980198019802e-05,
      "loss": 1.0179,
      "step": 100
    },
    {
      "epoch": 1.0,
      "eval_intent_accuracy": 0.900990099009901,
      "eval_loss": 0.6627025008201599,
      "eval_runtime": 0.0532,
      "eval_samples_per_second": 3796.158,
      "eval_slot_f1": 0.8583372039015328,
      "eval_slot_precision": 0.8583372039015328,
      "eval_slot_recall": 0.8583372039015328,
      "eval_steps_per_second": 75.171,
      "step": 101
    },
    {
      "epoch": 1.9801980198019802,
      "grad_norm": 1.697471022605896,
      "learning_rate": 4.0247524752475254e-05,
      "loss": 0.5933,
      "step": 200
    },
    {
      "epoch": 2.0,
      "eval_intent_accuracy": 0.8910891089108911,
      "eval_loss": 0.5899206399917603,
      "eval_runtime": 0.0516,
      "eval_samples_per_second": 3912.995,
      "eval_slot_f1": 0.8471899674872271,
      "eval_slot_precision": 0.8471899674872271,
      "eval_slot_recall": 0.8471899674872271,
      "eval_steps_per_second": 77.485,
      "step": 202
    },
    {
      "epoch": 2.9702970297029703,
      "grad_norm": 3.684925079345703,
      "learning_rate": 3.52970297029703e-05,
      "loss": 0.5248,
      "step": 300
    },
    {
      "epoch": 3.0,
      "eval_intent_accuracy": 0.900990099009901,
      "eval_loss": 0.5877912640571594,
      "eval_runtime": 0.0512,
      "eval_samples_per_second": 3945.962,
      "eval_slot_f1": 0.864375290292615,
      "eval_slot_precision": 0.864375290292615,
      "eval_slot_recall": 0.864375290292615,
      "eval_steps_per_second": 78.138,
      "step": 303
    },
    {
      "epoch": 3.9603960396039604,
      "grad_norm": 7.954131126403809,
      "learning_rate": 3.0346534653465347e-05,
      "loss": 0.489,
      "step": 400
    },
    {
      "epoch": 4.0,
      "eval_intent_accuracy": 0.905940594059406,
      "eval_loss": 0.6087009906768799,
      "eval_runtime": 0.0514,
      "eval_samples_per_second": 3928.963,
      "eval_slot_f1": 0.864375290292615,
      "eval_slot_precision": 0.864375290292615,
      "eval_slot_recall": 0.864375290292615,
      "eval_steps_per_second": 77.801,
      "step": 404
    },
    {
      "epoch": 4.9504950495049505,
      "grad_norm": 5.173593044281006,
      "learning_rate": 2.53960396039604e-05,
      "loss": 0.468,
      "step": 500
    },
    {
      "epoch": 5.0,
      "eval_intent_accuracy": 0.8762376237623762,
      "eval_loss": 0.6051058173179626,
      "eval_runtime": 0.052,
      "eval_samples_per_second": 3887.446,
      "eval_slot_f1": 0.8732001857872735,
      "eval_slot_precision": 0.8732001857872735,
      "eval_slot_recall": 0.8732001857872735,
      "eval_steps_per_second": 76.979,
      "step": 505
    },
    {
      "epoch": 5.9405940594059405,
      "grad_norm": 2.602719306945801,
      "learning_rate": 2.0445544554455444e-05,
      "loss": 0.4491,
      "step": 600
    },
    {
      "epoch": 6.0,
      "eval_intent_accuracy": 0.905940594059406,
      "eval_loss": 0.6407473683357239,
      "eval_runtime": 0.0554,
      "eval_samples_per_second": 3647.88,
      "eval_slot_f1": 0.87598699489085,
      "eval_slot_precision": 0.87598699489085,
      "eval_slot_recall": 0.87598699489085,
      "eval_steps_per_second": 72.235,
      "step": 606
    },
    {
      "epoch": 6.930693069306931,
      "grad_norm": 2.991025924682617,
      "learning_rate": 1.5495049504950496e-05,
      "loss": 0.4204,
      "step": 700
    },
    {
      "epoch": 7.0,
      "eval_intent_accuracy": 0.8910891089108911,
      "eval_loss": 0.6234941482543945,
      "eval_runtime": 0.0568,
      "eval_samples_per_second": 3558.795,
      "eval_slot_f1": 0.8773803994426381,
      "eval_slot_precision": 0.8773803994426381,
      "eval_slot_recall": 0.8773803994426381,
      "eval_steps_per_second": 70.471,
      "step": 707
    },
    {
      "epoch": 7.920792079207921,
      "grad_norm": 2.211575746536255,
      "learning_rate": 1.0544554455445545e-05,
      "loss": 0.4028,
      "step": 800
    },
    {
      "epoch": 8.0,
      "eval_intent_accuracy": 0.9108910891089109,
      "eval_loss": 0.65041583776474,
      "eval_runtime": 0.0519,
      "eval_samples_per_second": 3893.86,
      "eval_slot_f1": 0.878309335810497,
      "eval_slot_precision": 0.878309335810497,
      "eval_slot_recall": 0.878309335810497,
      "eval_steps_per_second": 77.106,
      "step": 808
    },
    {
      "epoch": 8.910891089108912,
      "grad_norm": 1.5847089290618896,
      "learning_rate": 5.594059405940594e-06,
      "loss": 0.3929,
      "step": 900
    },
    {
      "epoch": 9.0,
      "eval_intent_accuracy": 0.900990099009901,
      "eval_loss": 0.6379386782646179,
      "eval_runtime": 0.051,
      "eval_samples_per_second": 3959.979,
      "eval_slot_f1": 0.8769159312587088,
      "eval_slot_precision": 0.8769159312587088,
      "eval_slot_recall": 0.8769159312587088,
      "eval_steps_per_second": 78.415,
      "step": 909
    },
    {
      "epoch": 9.900990099009901,
      "grad_norm": 5.904131889343262,
      "learning_rate": 6.435643564356436e-07,
      "loss": 0.3679,
      "step": 1000
    },
    {
      "epoch": 10.0,
      "eval_intent_accuracy": 0.9108910891089109,
      "eval_loss": 0.6317320466041565,
      "eval_runtime": 0.0518,
      "eval_samples_per_second": 3899.99,
      "eval_slot_f1": 0.8797027403622851,
      "eval_slot_precision": 0.8797027403622851,
      "eval_slot_recall": 0.8797027403622851,
      "eval_steps_per_second": 77.228,
      "step": 1010
    }
  ],
  "logging_steps": 100,
  "max_steps": 1010,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 10,
  "save_steps": 500,
  "stateful_callbacks": {
    "TrainerControl": {
      "args": {
        "should_epoch_stop": false,
        "should_evaluate": false,
        "should_log": false,
        "should_save": true,
        "should_training_stop": true
      },
      "attributes": {}
    }
  },
  "total_flos": 526646237245440.0,
  "train_batch_size": 16,
  "trial_name": null,
  "trial_params": null
}
checkpoint-1010/training_args.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5460ae7fb569e4afe080919a3302f4475b1c884a06713458cae27af7d9e6a9cf
size 5777
checkpoint-1010/vocab.txt
ADDED
The diff for this file is too large to render.
checkpoint-808/config.json
ADDED
@@ -0,0 +1,73 @@
{
  "activation": "gelu",
  "architectures": [
    "DistilBertForJointIntentSlotFilling"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "id2intent_label": {
    "0": "get_device_state",
    "1": "set_device_attribute",
    "2": "set_device_state"
  },
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "id2slot_label": {
    "0": "O",
    "1": "B-attribute_type",
    "2": "I-attribute_type",
    "3": "B-attribute_value",
    "4": "I-attribute_value",
    "5": "B-device_name",
    "6": "I-device_name",
    "7": "B-location",
    "8": "I-location",
    "9": "B-state",
    "10": "I-state"
  },
  "initializer_range": 0.02,
  "intent_label2id": {
    "get_device_state": 0,
    "set_device_attribute": 1,
    "set_device_state": 2
  },
  "intent_loss_coef": 1.0,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "num_intent_labels": 3,
  "num_slot_labels": 11,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "slot_label2id": {
    "B-attribute_type": 1,
    "B-attribute_value": 3,
    "B-device_name": 5,
    "B-location": 7,
    "B-state": 9,
    "I-attribute_type": 2,
    "I-attribute_value": 4,
    "I-device_name": 6,
    "I-location": 8,
    "I-state": 10,
    "O": 0
  },
  "slot_loss_coef": 1.0,
  "tie_weights_": true,
  "torch_dtype": "float32",
  "transformers_version": "4.51.3",
  "vocab_size": 30522
}
checkpoint-808/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0c31f1badc5c6e7b3bc137d8f9e31d51672a8f47fe10c383ffc5a60cff77e961
size 265507144
checkpoint-808/optimizer.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6d01715dea30e8e3316263b4112055a23d620f3992a717339486141fe5802689
size 531076747
checkpoint-808/rng_state.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1d3c6c5729df1c06893524eebca8d445a73ee949ab0f5b8968bfde1fff2dd5e
size 14645
checkpoint-808/scaler.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c2309eb914966d75cf8e4e0e7271c74191a61b3d57d0e33e440d8ca2990543d8
size 1383
checkpoint-808/scheduler.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e624932d18ec9c281888584c8ffdcc866ed10ee6365627cd55003a9eeb6e9fc7
size 1465
checkpoint-808/special_tokens_map.json
ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
checkpoint-808/tokenizer.json
ADDED
The diff for this file is too large to render.
checkpoint-808/tokenizer_config.json
ADDED
@@ -0,0 +1,56 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "DistilBertTokenizer",
  "unk_token": "[UNK]"
}
checkpoint-808/trainer_state.json
ADDED
@@ -0,0 +1,186 @@
{
  "best_global_step": 808,
  "best_metric": 0.9108910891089109,
  "best_model_checkpoint": "./results_distilbert_custom/checkpoint-808",
  "epoch": 8.0,
  "eval_steps": 500,
  "global_step": 808,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.9900990099009901,
      "grad_norm": 2.6412200927734375,
      "learning_rate": 4.51980198019802e-05,
      "loss": 1.0179,
      "step": 100
    },
    {
      "epoch": 1.0,
      "eval_intent_accuracy": 0.900990099009901,
      "eval_loss": 0.6627025008201599,
      "eval_runtime": 0.0532,
      "eval_samples_per_second": 3796.158,
      "eval_slot_f1": 0.8583372039015328,
      "eval_slot_precision": 0.8583372039015328,
      "eval_slot_recall": 0.8583372039015328,
      "eval_steps_per_second": 75.171,
      "step": 101
    },
    {
      "epoch": 1.9801980198019802,
      "grad_norm": 1.697471022605896,
      "learning_rate": 4.0247524752475254e-05,
      "loss": 0.5933,
      "step": 200
    },
    {
      "epoch": 2.0,
      "eval_intent_accuracy": 0.8910891089108911,
      "eval_loss": 0.5899206399917603,
      "eval_runtime": 0.0516,
      "eval_samples_per_second": 3912.995,
      "eval_slot_f1": 0.8471899674872271,
      "eval_slot_precision": 0.8471899674872271,
      "eval_slot_recall": 0.8471899674872271,
      "eval_steps_per_second": 77.485,
      "step": 202
    },
    {
      "epoch": 2.9702970297029703,
      "grad_norm": 3.684925079345703,
      "learning_rate": 3.52970297029703e-05,
      "loss": 0.5248,
      "step": 300
    },
    {
      "epoch": 3.0,
      "eval_intent_accuracy": 0.900990099009901,
      "eval_loss": 0.5877912640571594,
      "eval_runtime": 0.0512,
      "eval_samples_per_second": 3945.962,
      "eval_slot_f1": 0.864375290292615,
      "eval_slot_precision": 0.864375290292615,
      "eval_slot_recall": 0.864375290292615,
      "eval_steps_per_second": 78.138,
      "step": 303
    },
    {
      "epoch": 3.9603960396039604,
      "grad_norm": 7.954131126403809,
      "learning_rate": 3.0346534653465347e-05,
      "loss": 0.489,
      "step": 400
    },
    {
      "epoch": 4.0,
      "eval_intent_accuracy": 0.905940594059406,
      "eval_loss": 0.6087009906768799,
      "eval_runtime": 0.0514,
      "eval_samples_per_second": 3928.963,
      "eval_slot_f1": 0.864375290292615,
      "eval_slot_precision": 0.864375290292615,
      "eval_slot_recall": 0.864375290292615,
      "eval_steps_per_second": 77.801,
      "step": 404
    },
    {
      "epoch": 4.9504950495049505,
      "grad_norm": 5.173593044281006,
      "learning_rate": 2.53960396039604e-05,
      "loss": 0.468,
      "step": 500
    },
    {
      "epoch": 5.0,
      "eval_intent_accuracy": 0.8762376237623762,
      "eval_loss": 0.6051058173179626,
      "eval_runtime": 0.052,
      "eval_samples_per_second": 3887.446,
      "eval_slot_f1": 0.8732001857872735,
      "eval_slot_precision": 0.8732001857872735,
      "eval_slot_recall": 0.8732001857872735,
      "eval_steps_per_second": 76.979,
      "step": 505
    },
    {
      "epoch": 5.9405940594059405,
      "grad_norm": 2.602719306945801,
      "learning_rate": 2.0445544554455444e-05,
      "loss": 0.4491,
      "step": 600
    },
    {
      "epoch": 6.0,
      "eval_intent_accuracy": 0.905940594059406,
      "eval_loss": 0.6407473683357239,
      "eval_runtime": 0.0554,
      "eval_samples_per_second": 3647.88,
      "eval_slot_f1": 0.87598699489085,
      "eval_slot_precision": 0.87598699489085,
      "eval_slot_recall": 0.87598699489085,
      "eval_steps_per_second": 72.235,
      "step": 606
    },
    {
      "epoch": 6.930693069306931,
      "grad_norm": 2.991025924682617,
      "learning_rate": 1.5495049504950496e-05,
      "loss": 0.4204,
      "step": 700
    },
    {
      "epoch": 7.0,
      "eval_intent_accuracy": 0.8910891089108911,
      "eval_loss": 0.6234941482543945,
      "eval_runtime": 0.0568,
      "eval_samples_per_second": 3558.795,
      "eval_slot_f1": 0.8773803994426381,
      "eval_slot_precision": 0.8773803994426381,
      "eval_slot_recall": 0.8773803994426381,
      "eval_steps_per_second": 70.471,
      "step": 707
    },
    {
      "epoch": 7.920792079207921,
      "grad_norm": 2.211575746536255,
      "learning_rate": 1.0544554455445545e-05,
      "loss": 0.4028,
      "step": 800
    },
    {
      "epoch": 8.0,
      "eval_intent_accuracy": 0.9108910891089109,
      "eval_loss": 0.65041583776474,
      "eval_runtime": 0.0519,
      "eval_samples_per_second": 3893.86,
      "eval_slot_f1": 0.878309335810497,
      "eval_slot_precision": 0.878309335810497,
      "eval_slot_recall": 0.878309335810497,
      "eval_steps_per_second": 77.106,
      "step": 808
    }
  ],
  "logging_steps": 100,
  "max_steps": 1010,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 10,
  "save_steps": 500,
  "stateful_callbacks": {
    "TrainerControl": {
      "args": {
        "should_epoch_stop": false,
        "should_evaluate": false,
        "should_log": false,
        "should_save": true,
        "should_training_stop": false
      },
      "attributes": {}
    }
  },
  "total_flos": 421316989796352.0,
  "train_batch_size": 16,
  "trial_name": null,
  "trial_params": null
}
checkpoint-808/training_args.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5460ae7fb569e4afe080919a3302f4475b1c884a06713458cae27af7d9e6a9cf
size 5777
checkpoint-808/vocab.txt
ADDED
The diff for this file is too large to render.
config.json
ADDED
@@ -0,0 +1,73 @@
{
  "activation": "gelu",
  "architectures": [
    "DistilBertForJointIntentSlotFilling"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "id2intent_label": {
    "0": "get_device_state",
    "1": "set_device_attribute",
    "2": "set_device_state"
  },
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "id2slot_label": {
    "0": "O",
    "1": "B-attribute_type",
    "2": "I-attribute_type",
    "3": "B-attribute_value",
    "4": "I-attribute_value",
    "5": "B-device_name",
    "6": "I-device_name",
    "7": "B-location",
    "8": "I-location",
    "9": "B-state",
    "10": "I-state"
  },
  "initializer_range": 0.02,
  "intent_label2id": {
    "get_device_state": 0,
    "set_device_attribute": 1,
    "set_device_state": 2
  },
  "intent_loss_coef": 1.0,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "num_intent_labels": 3,
  "num_slot_labels": 11,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "slot_label2id": {
    "B-attribute_type": 1,
    "B-attribute_value": 3,
    "B-device_name": 5,
    "B-location": 7,
    "B-state": 9,
    "I-attribute_type": 2,
    "I-attribute_value": 4,
    "I-device_name": 6,
    "I-location": 8,
    "I-state": 10,
    "O": 0
  },
  "slot_loss_coef": 1.0,
  "tie_weights_": true,
  "torch_dtype": "float32",
  "transformers_version": "4.51.3",
  "vocab_size": 30522
}
model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0c31f1badc5c6e7b3bc137d8f9e31d51672a8f47fe10c383ffc5a60cff77e961
size 265507144
runs/Apr15_15-12-43_aigodmode/events.out.tfevents.1744755163.aigodmode.996209.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f35b0c3307c8edce30c0e71cb9c3ead35b3fe25a0332f9781be0a689438b6182
size 5938
runs/Apr15_15-13-03_aigodmode/events.out.tfevents.1744755184.aigodmode.996626.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a7c22901362b8d9f274dce0f236ea66c22cac503e44622e32bb092e7b73bb8ca
size 7740
runs/Apr15_15-13-03_aigodmode/events.out.tfevents.1744755187.aigodmode.996626.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff2e00c56c39064e4852c9e87175f4bbd0bc89dc24ed3606106eb3b0d4c21ae1
size 573
runs/Apr15_17-19-20_aigodmode/events.out.tfevents.1744762761.aigodmode.1115972.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a50df85af1b5d0a3f06e567d52910b8be7c9f801e431725a3b5d01710f55082e
size 7962
runs/Apr15_17-19-20_aigodmode/events.out.tfevents.1744762765.aigodmode.1115972.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:399ecf3dc37daf2dd0923162f5c0e3147dd7bc437b82162a14643109544776f3
size 582
runs/Apr15_18-20-00_aigodmode/events.out.tfevents.1744766401.aigodmode.1175888.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2ce186316ae6eb1fc9acdb21cbdf157926cb80e74093de0f1194d7745f855c07
size 7962
runs/Apr15_18-20-00_aigodmode/events.out.tfevents.1744766405.aigodmode.1175888.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8503cfe44841a0fb1e6e276c964e55d9584a406d86d89e646bbc9026d4eec916
size 582
runs/Apr15_18-41-07_aigodmode/events.out.tfevents.1744767667.aigodmode.1200966.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b3f5508d3cac29164527cf1bfda59632f225f685b58a503c84e2c384ba15143c
size 8393
runs/Apr15_18-41-07_aigodmode/events.out.tfevents.1744767675.aigodmode.1200966.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c4af9c5c60777f79dac5ec4bdf01168be4fa51925c7e771776a460d2fa17397f
size 582
runs/Apr15_19-39-18_aigodmode/events.out.tfevents.1744771158.aigodmode.1261440.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:caf00a06717da11a4159c820fe56c91fe07a278da445f5cfe8e173ea9aac7aea
size 9035
runs/Apr15_19-39-18_aigodmode/events.out.tfevents.1744771170.aigodmode.1261440.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2a131f7bf860686603c5a621154634d8b823293703fb9c4d5e9a300de23e67e9
size 582
runs/Apr15_19-41-09_aigodmode/events.out.tfevents.1744771270.aigodmode.1263282.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:43492f9f7312018fa2d91feda6fdfc71d54df79c61ba9c71722b4ccb0a786cc0
size 7962
runs/Apr15_19-41-09_aigodmode/events.out.tfevents.1744771274.aigodmode.1263282.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bf0328a70b832431662157df678c2669601f975d5c7640d0f41ceedd753470e9
size 582
runs/Apr15_20-08-54_aigodmode/events.out.tfevents.1744772934.aigodmode.1292055.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:87b67a926bafbf632bccd8d5b8f9fa3decfcd16dd13b0b0c1cc19bc9182f3272
size 7962
runs/Apr15_20-08-54_aigodmode/events.out.tfevents.1744772939.aigodmode.1292055.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28519c868715aca438a32fa2a9607390d9c12ed964f671b2e2818f1e3a3cc7b1
size 582
runs/Apr15_22-03-10_aigodmode/events.out.tfevents.1744779790.aigodmode.1411307.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc6baddd8e8f1fe4a3177de09a0de4b5d7c051eda63b0d572d0ecc14b73fc3c6
size 8182
runs/Apr15_22-03-10_aigodmode/events.out.tfevents.1744779797.aigodmode.1411307.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:af3f3f15bd5aa8ffe18a4f4a9fe299ce211a8efcd994ad6578dc0a90a7cf296c
size 582
runs/Apr15_23-20-35_aigodmode/events.out.tfevents.1744784435.aigodmode.1497045.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:18aff3b782064c46c9073ee0c584e2d8c78c7d74ab1290a28f77322853a2828a
size 8393
runs/Apr15_23-20-35_aigodmode/events.out.tfevents.1744784442.aigodmode.1497045.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a5056c63e5751b899ff0eb8db8ea3c6cc113dc812cd2033a7f1922a6449f4b8c
size 582
runs/Apr15_23-21-41_aigodmode/events.out.tfevents.1744784502.aigodmode.1498147.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:44198eda7e8f8942f947fe63aaeedc680e6100dccb2caa9c5b53ea2f144e52aa
size 9803
runs/Apr15_23-21-41_aigodmode/events.out.tfevents.1744784513.aigodmode.1498147.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a30fbfd74417bed3ed6c0462bc78c7604635f922db61eb06b80cde1fdf063463
size 582
runs/Apr15_23-22-43_aigodmode/events.out.tfevents.1744784564.aigodmode.1499184.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:630efb1b5574425843d6f147f63b66edd4ac7d9c654093d43667c97f0c0640ca
size 13329