2025-04-12 20:48:03,485 - INFO - Starting GLiNER fine-tuning at 20250412_204803
2025-04-12 20:48:03,485 - INFO - Output directory: gliner_finetuned_v3
2025-04-12 20:48:03,485 - INFO - Loading data from ./gliner_dataset_fixed.json...
2025-04-12 20:48:07,737 - INFO - Successfully loaded 32439 samples from ./gliner_dataset_fixed.json
2025-04-12 20:48:07,756 - INFO - Sequence Length Analysis:
2025-04-12 20:48:07,756 - INFO - Minimum length: 246
2025-04-12 20:48:07,756 - INFO - Maximum length: 1286
2025-04-12 20:48:07,756 - INFO - Mean length: 450.5
2025-04-12 20:48:07,756 - INFO - Median length: 447
2025-04-12 20:48:07,756 - INFO - 90th percentile: 540
2025-04-12 20:48:07,756 - INFO - 95th percentile: 567
2025-04-12 20:48:07,756 - INFO - 99th percentile: 636
2025-04-12 20:48:07,756 - INFO - Sequences exceeding 1024 tokens: 19
2025-04-12 20:48:07,756 - INFO - Sequences exceeding 2048 tokens: 0
2025-04-12 20:48:07,756 - INFO - Recommended max_len setting: 896
2025-04-12 20:48:07,756 - INFO - Using maximum sequence length: 896
2025-04-12 20:48:07,820 - INFO - Extracted entity types: ['AGE', 'AGE_INFO', 'ANIMAL_INFO', 'BEHAVIORAL_PATTERN', 'CONTEXT_SENSITIVE', 'CRIMINAL_RECORD', 'DATE_TIME', 'ECONOMIC_STATUS', 'EMAIL_ADDRESS', 'EMPLOYMENT_INFO', 'FAMILY_RELATION', 'FINANCIAL_INFO', 'GOV_ID', 'HEALTH_INFO', 'IDENTIFIABLE_IMAGE', 'NO_ADDRESS', 'NO_PHONE_NUMBER', 'PERSON', 'POLITICAL_CASE', 'POSTAL_CODE', 'SEXUAL_ORIENTATION']
2025-04-12 20:48:07,924 - INFO - Dataset Statistics:
2025-04-12 20:48:07,924 - INFO - Total samples: 32439
2025-04-12 20:48:07,924 - INFO - Samples with entities: 32439 (100.0%)
2025-04-12 20:48:07,924 - INFO - Total entities: 583306
2025-04-12 20:48:07,924 - INFO - Average entities per sample: 17.98
2025-04-12 20:48:07,924 - INFO - Entity type distribution:
2025-04-12 20:48:07,925 - INFO - PERSON: 80730 (13.8%)
2025-04-12 20:48:07,925 - INFO - DATE_TIME: 76795 (13.2%)
2025-04-12 20:48:07,925 - INFO - HEALTH_INFO: 52559 (9.0%)
2025-04-12 20:48:07,925 - INFO - GOV_ID: 47823 (8.2%)
2025-04-12 20:48:07,925 - INFO - NO_ADDRESS: 45076 (7.7%)
2025-04-12 20:48:07,925 - INFO - CRIMINAL_RECORD: 37081 (6.4%)
2025-04-12 20:48:07,925 - INFO - NO_PHONE_NUMBER: 32198 (5.5%)
2025-04-12 20:48:07,925 - INFO - EMAIL_ADDRESS: 30921 (5.3%)
2025-04-12 20:48:07,925 - INFO - FAMILY_RELATION: 29615 (5.1%)
2025-04-12 20:48:07,925 - INFO - CONTEXT_SENSITIVE: 23812 (4.1%)
2025-04-12 20:48:07,925 - INFO - EMPLOYMENT_INFO: 22452 (3.8%)
2025-04-12 20:48:07,925 - INFO - FINANCIAL_INFO: 20784 (3.6%)
2025-04-12 20:48:07,925 - INFO - POLITICAL_CASE: 18895 (3.2%)
2025-04-12 20:48:07,925 - INFO - BEHAVIORAL_PATTERN: 15996 (2.7%)
2025-04-12 20:48:07,925 - INFO - ECONOMIC_STATUS: 14990 (2.6%)
2025-04-12 20:48:07,925 - INFO - IDENTIFIABLE_IMAGE: 14825 (2.5%)
2025-04-12 20:48:07,925 - INFO - SEXUAL_ORIENTATION: 11758 (2.0%)
2025-04-12 20:48:07,925 - INFO - POSTAL_CODE: 6985 (1.2%)
2025-04-12 20:48:07,925 - INFO - ANIMAL_INFO: 6 (0.0%)
2025-04-12 20:48:07,925 - INFO - AGE: 3 (0.0%)
2025-04-12 20:48:07,925 - INFO - AGE_INFO: 2 (0.0%)
2025-04-12 20:48:07,943 - INFO - Dataset split: 29195 training samples, 3244 validation samples
2025-04-12 20:48:07,943 - INFO - Starting Norwegian data augmentation with factor 0.5...
2025-04-12 20:48:12,310 - INFO - Collected entity examples for 19 entity types
2025-04-12 20:48:12,311 - INFO - PERSON: 968 examples
2025-04-12 20:48:12,311 - INFO - DATE_TIME: 1321 examples
2025-04-12 20:48:12,311 - INFO - NO_ADDRESS: 5973 examples
2025-04-12 20:48:12,311 - INFO - ECONOMIC_STATUS: 2722 examples
2025-04-12 20:48:12,311 - INFO - EMAIL_ADDRESS: 5939 examples
2025-04-12 20:48:12,311 - INFO - IDENTIFIABLE_IMAGE: 1800 examples
2025-04-12 20:48:12,311 - INFO - CONTEXT_SENSITIVE: 7677 examples
2025-04-12 20:48:12,311 - INFO - BEHAVIORAL_PATTERN: 8494 examples
2025-04-12 20:48:12,311 - INFO - EMPLOYMENT_INFO: 2963 examples
2025-04-12 20:48:12,311 - INFO - POLITICAL_CASE: 1856 examples
2025-04-12 20:48:12,311 - INFO - NO_PHONE_NUMBER: 554 examples
2025-04-12 20:48:12,311 - INFO - FINANCIAL_INFO: 1451 examples
2025-04-12 20:48:12,311 - INFO - HEALTH_INFO: 4910 examples
2025-04-12 20:48:12,311 - INFO - CRIMINAL_RECORD: 4180 examples
2025-04-12 20:48:12,311 - INFO - FAMILY_RELATION: 1328 examples
2025-04-12 20:48:12,311 - INFO - POSTAL_CODE: 155 examples
2025-04-12 20:48:12,311 - INFO - SEXUAL_ORIENTATION: 159 examples
2025-04-12 20:48:12,311 - INFO - GOV_ID: 1171 examples
2025-04-12 20:48:12,311 - INFO - AGE_INFO: 2 examples
2025-04-12 20:48:27,418 - INFO - Created 13717 augmented samples
2025-04-12 20:48:27,430 - INFO - Total training samples after augmentation: 42912
2025-04-12 20:48:27,441 - INFO - Training dataset after augmentation: 42912 samples
2025-04-12 20:48:28,996 - INFO - Successfully imported GLiNER components
2025-04-12 20:48:29,029 - INFO - Using device: cuda
2025-04-12 20:48:29,029 - INFO - Estimated GPU memory requirement: 1.1 GB with batch size 4
2025-04-12 20:48:29,029 - INFO - Loading base model: urchade/gliner_multi_pii-v1 with max_length=896
2025-04-12 20:48:34,795 - INFO - Updating model configuration with max_len=896
2025-04-12 20:48:35,166 - INFO - Starting enhanced training for 1 epochs...
2025-04-12 20:48:35,166 - INFO - Training with batch size 4 × 4 gradient accumulation steps = effective batch size of 16
2025-04-12 20:48:35,166 - INFO - Learning rate: 2e-05, Label smoothing: 0.1
2025-04-12 20:48:35,167 - INFO - Mixed precision training: Enabled
2025-04-12 20:48:35,633 - INFO - Initialized adversarial training with epsilon=0.5
2025-04-12 20:48:48,917 - INFO - Epoch: 0.00 | loss: 392.654000 | grad_norm: inf | learning_rate: 0.000000
2025-04-12 20:49:00,192 - INFO - Epoch: 0.01 | loss: 379.110400 | grad_norm: 13870.203125 | learning_rate: 0.000002
2025-04-12 20:49:11,720 - INFO - Epoch: 0.01 | loss: 317.083600 | grad_norm: 11069.067383 | learning_rate: 0.000004
2025-04-12 20:49:23,468 - INFO - Epoch: 0.01 | loss: 210.984300 | grad_norm: 6763.325195 | learning_rate: 0.000005
2025-04-12 20:49:35,026 - INFO - Epoch: 0.02 | loss: 147.771300 | grad_norm: 2895.516846 | learning_rate: 0.000007
2025-04-12 20:49:47,481 - INFO - Epoch: 0.02 | loss: 132.513400 | grad_norm: 2268.081055 | learning_rate: 0.000009
2025-04-12 20:49:58,638 - INFO - Epoch: 0.03 | loss: 113.679100 | grad_norm: 1176.167358 | learning_rate: 0.000011
2025-04-12 20:50:10,040 - INFO - Epoch: 0.03 | loss: 111.937900 | grad_norm: 1103.858765 | learning_rate: 0.000013
2025-04-12 20:50:21,999 - INFO - Epoch: 0.03 | loss: 112.758300 | grad_norm: 1099.602783 | learning_rate: 0.000015
2025-04-12 20:50:46,960 - INFO - Starting GLiNER fine-tuning at 20250412_205046
2025-04-12 20:50:46,960 - INFO - Output directory: gliner_finetuned_v3
2025-04-12 20:50:46,960 - INFO - Loading data from ./gliner_dataset_fixed.json...
2025-04-12 20:50:51,196 - INFO - Successfully loaded 32439 samples from ./gliner_dataset_fixed.json
2025-04-12 20:50:51,208 - INFO - Sequence Length Analysis:
2025-04-12 20:50:51,208 - INFO - Minimum length: 246
2025-04-12 20:50:51,209 - INFO - Maximum length: 1286
2025-04-12 20:50:51,209 - INFO - Mean length: 450.5
2025-04-12 20:50:51,209 - INFO - Median length: 447
2025-04-12 20:50:51,209 - INFO - 90th percentile: 540
2025-04-12 20:50:51,209 - INFO - 95th percentile: 567
2025-04-12 20:50:51,209 - INFO - 99th percentile: 636
2025-04-12 20:50:51,209 - INFO - Sequences exceeding 1024 tokens: 19
2025-04-12 20:50:51,209 - INFO - Sequences exceeding 2048 tokens: 0
2025-04-12 20:50:51,209 - INFO - Recommended max_len setting: 896
2025-04-12 20:50:51,209 - INFO - Using maximum sequence length: 896
2025-04-12 20:50:51,265 - INFO - Extracted entity types: ['AGE', 'AGE_INFO', 'ANIMAL_INFO', 'BEHAVIORAL_PATTERN', 'CONTEXT_SENSITIVE', 'CRIMINAL_RECORD', 'DATE_TIME', 'ECONOMIC_STATUS', 'EMAIL_ADDRESS', 'EMPLOYMENT_INFO', 'FAMILY_RELATION', 'FINANCIAL_INFO', 'GOV_ID', 'HEALTH_INFO', 'IDENTIFIABLE_IMAGE', 'NO_ADDRESS', 'NO_PHONE_NUMBER', 'PERSON', 'POLITICAL_CASE', 'POSTAL_CODE', 'SEXUAL_ORIENTATION']
2025-04-12 20:50:51,403 - INFO - Dataset Statistics:
2025-04-12 20:50:51,403 - INFO - Total samples: 32439
2025-04-12 20:50:51,403 - INFO - Samples with entities: 32439 (100.0%)
2025-04-12 20:50:51,403 - INFO - Total entities: 583306
2025-04-12 20:50:51,403 - INFO - Average entities per sample: 17.98
2025-04-12 20:50:51,403 - INFO - Entity type distribution:
2025-04-12 20:50:51,403 - INFO - PERSON: 80730 (13.8%)
2025-04-12 20:50:51,403 - INFO - DATE_TIME: 76795 (13.2%)
2025-04-12 20:50:51,403 - INFO - HEALTH_INFO: 52559 (9.0%)
2025-04-12 20:50:51,403 - INFO - GOV_ID: 47823 (8.2%)
2025-04-12 20:50:51,403 - INFO - NO_ADDRESS: 45076 (7.7%)
2025-04-12 20:50:51,403 - INFO - CRIMINAL_RECORD: 37081 (6.4%)
2025-04-12 20:50:51,403 - INFO - NO_PHONE_NUMBER: 32198 (5.5%)
2025-04-12 20:50:51,403 - INFO - EMAIL_ADDRESS: 30921 (5.3%)
2025-04-12 20:50:51,403 - INFO - FAMILY_RELATION: 29615 (5.1%)
2025-04-12 20:50:51,403 - INFO - CONTEXT_SENSITIVE: 23812 (4.1%)
2025-04-12 20:50:51,403 - INFO - EMPLOYMENT_INFO: 22452 (3.8%)
2025-04-12 20:50:51,403 - INFO - FINANCIAL_INFO: 20784 (3.6%)
2025-04-12 20:50:51,403 - INFO - POLITICAL_CASE: 18895 (3.2%)
2025-04-12 20:50:51,403 - INFO - BEHAVIORAL_PATTERN: 15996 (2.7%)
2025-04-12 20:50:51,403 - INFO - ECONOMIC_STATUS: 14990 (2.6%)
2025-04-12 20:50:51,403 - INFO - IDENTIFIABLE_IMAGE: 14825 (2.5%)
2025-04-12 20:50:51,403 - INFO - SEXUAL_ORIENTATION: 11758 (2.0%)
2025-04-12 20:50:51,403 - INFO - POSTAL_CODE: 6985 (1.2%)
2025-04-12 20:50:51,403 - INFO - ANIMAL_INFO: 6 (0.0%)
2025-04-12 20:50:51,403 - INFO - AGE: 3 (0.0%)
2025-04-12 20:50:51,403 - INFO - AGE_INFO: 2 (0.0%)
2025-04-12 20:50:51,452 - INFO - Dataset split: 29195 training samples, 3244 validation samples
2025-04-12 20:50:51,452 - INFO - Starting Norwegian data augmentation with factor 0.5...
2025-04-12 20:50:55,897 - INFO - Collected entity examples for 20 entity types
2025-04-12 20:50:55,898 - INFO - PERSON: 970 examples
2025-04-12 20:50:55,898 - INFO - DATE_TIME: 1329 examples
2025-04-12 20:50:55,898 - INFO - NO_ADDRESS: 5954 examples
2025-04-12 20:50:55,898 - INFO - NO_PHONE_NUMBER: 545 examples
2025-04-12 20:50:55,898 - INFO - EMAIL_ADDRESS: 5940 examples
2025-04-12 20:50:55,898 - INFO - IDENTIFIABLE_IMAGE: 1823 examples
2025-04-12 20:50:55,898 - INFO - BEHAVIORAL_PATTERN: 8557 examples
2025-04-12 20:50:55,898 - INFO - GOV_ID: 1140 examples
2025-04-12 20:50:55,898 - INFO - EMPLOYMENT_INFO: 2943 examples
2025-04-12 20:50:55,898 - INFO - CRIMINAL_RECORD: 4165 examples
2025-04-12 20:50:55,898 - INFO - FINANCIAL_INFO: 1452 examples
2025-04-12 20:50:55,898 - INFO - CONTEXT_SENSITIVE: 7723 examples
2025-04-12 20:50:55,898 - INFO - HEALTH_INFO: 4908 examples
2025-04-12 20:50:55,898 - INFO - FAMILY_RELATION: 1330 examples
2025-04-12 20:50:55,898 - INFO - POLITICAL_CASE: 1854 examples
2025-04-12 20:50:55,898 - INFO - ECONOMIC_STATUS: 2717 examples
2025-04-12 20:50:55,898 - INFO - SEXUAL_ORIENTATION: 147 examples
2025-04-12 20:50:55,898 - INFO - POSTAL_CODE: 147 examples
2025-04-12 20:50:55,898 - INFO - AGE_INFO: 2 examples
2025-04-12 20:50:55,898 - INFO - ANIMAL_INFO: 1 examples
2025-04-12 20:51:11,070 - INFO - Created 13695 augmented samples
2025-04-12 20:51:11,087 - INFO - Total training samples after augmentation: 42890
2025-04-12 20:51:11,103 - INFO - Training dataset after augmentation: 42890 samples
2025-04-12 20:51:12,679 - INFO - Successfully imported GLiNER components
2025-04-12 20:51:12,714 - INFO - Using device: cuda
2025-04-12 20:51:12,715 - INFO - Estimated GPU memory requirement: 1.1 GB with batch size 4
2025-04-12 20:51:12,715 - INFO - Loading base model: urchade/gliner_multi_pii-v1 with max_length=896
2025-04-12 20:51:18,308 - INFO - Updating model configuration with max_len=896
2025-04-12 20:51:18,652 - INFO - Starting enhanced training for 3 epochs...
2025-04-12 20:51:18,652 - INFO - Training with batch size 4 × 4 gradient accumulation steps = effective batch size of 16
2025-04-12 20:51:18,652 - INFO - Learning rate: 2e-05, Label smoothing: 0.1
2025-04-12 20:51:18,652 - INFO - Mixed precision training: Enabled
2025-04-12 20:51:19,225 - INFO - Initialized adversarial training with epsilon=0.5
2025-04-12 20:51:32,676 - INFO - Epoch: 0.00 | loss: 405.603000 | grad_norm: 21545.423828 | learning_rate: 0.000000
2025-04-12 20:51:44,627 - INFO - Epoch: 0.01 | loss: 415.944400 | grad_norm: 21178.679688 | learning_rate: 0.000001
2025-04-12 20:51:55,972 - INFO - Epoch: 0.01 | loss: 353.536300 | grad_norm: 16303.899414 | learning_rate: 0.000001
2025-04-12 20:52:07,371 - INFO - Epoch: 0.01 | loss: 329.995500 | grad_norm: 14056.744141 | learning_rate: 0.000002
2025-04-12 20:52:19,430 - INFO - Epoch: 0.02 | loss: 257.883200 | grad_norm: 9467.385742 | learning_rate: 0.000002
2025-04-12 20:52:31,269 - INFO - Epoch: 0.02 | loss: 201.636100 | grad_norm: 4764.615723 | learning_rate: 0.000003
2025-04-12 20:52:42,592 - INFO - Epoch: 0.03 | loss: 155.169300 | grad_norm: 3065.785400 | learning_rate: 0.000004
2025-04-12 20:52:54,292 - INFO - Epoch: 0.03 | loss: 143.750700 | grad_norm: 1665.037842 | learning_rate: 0.000004
2025-04-12 20:53:05,962 - INFO - Epoch: 0.03 | loss: 131.419400 | grad_norm: 1096.521606 | learning_rate: 0.000005
2025-04-12 20:53:17,624 - INFO - Epoch: 0.04 | loss: 128.127700 | grad_norm: 1112.906250 | learning_rate: 0.000006
2025-04-12 20:53:30,218 - INFO - Epoch: 0.04 | loss: 120.158500 | grad_norm: 1602.173462 | learning_rate: 0.000006
2025-04-12 20:53:42,000 - INFO - Epoch: 0.04 | loss: 111.603100 | grad_norm: 825.809204 | learning_rate: 0.000007
2025-04-12 20:53:54,096 - INFO - Epoch: 0.05 | loss: 110.229300 | grad_norm: 1255.956909 | learning_rate: 0.000007
2025-04-12 20:54:05,995 - INFO - Epoch: 0.05 | loss: 110.914200 | grad_norm: 1099.850220 | learning_rate: 0.000008
2025-04-12 20:54:18,328 - INFO - Epoch: 0.06 | loss: 103.508600 | grad_norm: 859.139038 | learning_rate: 0.000009
2025-04-12 20:54:29,390 - INFO - Epoch: 0.06 | loss: 105.911900 | grad_norm: 1666.134277 | learning_rate: 0.000009
2025-04-12 20:54:40,845 - INFO - Epoch: 0.06 | loss: 101.510900 | grad_norm: 795.450012 | learning_rate: 0.000010
2025-04-12 20:54:52,448 - INFO - Epoch: 0.07 | loss: 102.609600 | grad_norm: 891.243103 | learning_rate: 0.000011
2025-04-12 20:55:03,661 - INFO - Epoch: 0.07 | loss: 98.540200 | grad_norm: 1045.907349 | learning_rate: 0.000011
2025-04-12 20:55:15,606 - INFO - Epoch: 0.07 | loss: 100.119900 | grad_norm: 732.970520 | learning_rate: 0.000012
2025-04-12 20:55:27,047 - INFO - Epoch: 0.08 | loss: 95.121000 | grad_norm: 1241.019165 | learning_rate: 0.000012
2025-04-12 20:55:38,781 - INFO - Epoch: 0.08 | loss: 96.133600 | grad_norm: 760.192627 | learning_rate: 0.000013
2025-04-12 20:55:51,146 - INFO - Epoch: 0.09 | loss: 99.402400 | grad_norm: 609.955688 | learning_rate: 0.000014
2025-04-12 20:56:03,081 - INFO - Epoch: 0.09 | loss: 94.356600 | grad_norm: 1220.211670 | learning_rate: 0.000014
2025-04-12 20:56:14,487 - INFO - Epoch: 0.09 | loss: 97.165900 | grad_norm: 887.024902 | learning_rate: 0.000015
2025-04-12 20:56:26,041 - INFO - Epoch: 0.10 | loss: 94.404300 | grad_norm: 630.761658 | learning_rate: 0.000015
2025-04-12 20:56:37,329 - INFO - Epoch: 0.10 | loss: 93.991400 | grad_norm: 881.988831 | learning_rate: 0.000016
2025-04-12 20:56:49,243 - INFO - Epoch: 0.10 | loss: 94.250000 | grad_norm: 650.803467 | learning_rate: 0.000017
2025-04-12 20:57:00,442 - INFO - Epoch: 0.11 | loss: 93.483500 | grad_norm: 587.986145 | learning_rate: 0.000017
2025-04-12 20:57:11,985 - INFO - Epoch: 0.11 | loss: 95.563700 | grad_norm: 605.347351 | learning_rate: 0.000018
2025-04-12 20:57:23,916 - INFO - Epoch: 0.12 | loss: 96.856800 | grad_norm: 973.164673 | learning_rate: 0.000019
2025-04-12 20:57:34,966 - INFO - Epoch: 0.12 | loss: 91.164500 | grad_norm: 585.658875 | learning_rate: 0.000019
2025-04-12 20:57:46,826 - INFO - Epoch: 0.12 | loss: 95.141600 | grad_norm: 624.348755 | learning_rate: 0.000020
2025-04-12 20:57:59,166 - INFO - Epoch: 0.13 | loss: 91.668200 | grad_norm: 543.129700 | learning_rate: 0.000020
2025-04-12 20:58:11,059 - INFO - Epoch: 0.13 | loss: 93.141700 | grad_norm: 849.775513 | learning_rate: 0.000021
2025-04-12 20:58:22,916 - INFO - Epoch: 0.13 | loss: 94.467300 | grad_norm: 441.896576 | learning_rate: 0.000022
2025-04-12 20:58:34,570 - INFO - Epoch: 0.14 | loss: 90.555100 | grad_norm: 439.447510 | learning_rate: 0.000022
2025-04-12 20:58:46,488 - INFO - Epoch: 0.14 | loss: 93.499700 | grad_norm: 566.885071 | learning_rate: 0.000023
2025-04-12 20:58:58,248 - INFO - Epoch: 0.15 | loss: 87.544600 | grad_norm: 442.563599 | learning_rate: 0.000024
2025-04-12 20:59:10,294 - INFO - Epoch: 0.15 | loss: 88.274700 | grad_norm: 558.198425 | learning_rate: 0.000024
2025-04-12 20:59:22,804 - INFO - Epoch: 0.15 | loss: 95.382300 | grad_norm: 729.605896 | learning_rate: 0.000025
2025-04-12 20:59:34,049 - INFO - Epoch: 0.16 | loss: 90.204900 | grad_norm: 365.200714 | learning_rate: 0.000025
2025-04-12 20:59:46,122 - INFO - Epoch: 0.16 | loss: 90.138000 | grad_norm: 417.370453 | learning_rate: 0.000026
2025-04-12 20:59:58,059 - INFO - Epoch: 0.16 | loss: 86.711200 | grad_norm: 563.836914 | learning_rate: 0.000027
2025-04-12 21:00:09,917 - INFO - Epoch: 0.17 | loss: 93.703800 | grad_norm: 614.524780 | learning_rate: 0.000027
2025-04-12 21:00:21,751 - INFO - Epoch: 0.17 | loss: 89.057700 | grad_norm: 606.132263 | learning_rate: 0.000028
2025-04-12 21:00:33,628 - INFO - Epoch: 0.18 | loss: 86.730400 | grad_norm: 439.268524 | learning_rate: 0.000029
2025-04-12 21:00:46,056 - INFO - Epoch: 0.18 | loss: 88.190500 | grad_norm: 823.925598 | learning_rate: 0.000029
2025-04-12 21:00:58,303 - INFO - Epoch: 0.18 | loss: 90.493200 | grad_norm: 495.837616 | learning_rate: 0.000030
2025-04-12 21:01:10,211 - INFO - Epoch: 0.19 | loss: 88.979400 | grad_norm: 467.842590 | learning_rate: 0.000030
2025-04-12 21:01:21,890 - INFO - Epoch: 0.19 | loss: 92.654400 | grad_norm: 539.052185 | learning_rate: 0.000031
2025-04-12 21:01:33,621 - INFO - Epoch: 0.19 | loss: 86.985400 | grad_norm: 394.868439 | learning_rate: 0.000032
2025-04-12 21:01:45,341 - INFO - Epoch: 0.20 | loss: 89.565100 | grad_norm: 580.561340 | learning_rate: 0.000032
2025-04-12 21:01:56,946 - INFO - Epoch: 0.20 | loss: 90.249500 | grad_norm: 825.651733 | learning_rate: 0.000033
2025-04-12 21:02:09,187 - INFO - Epoch: 0.21 | loss: 92.175400 | grad_norm: 522.293030 | learning_rate: 0.000034
2025-04-12 21:02:20,666 - INFO - Epoch: 0.21 | loss: 86.844300 | grad_norm: 319.304810 | learning_rate: 0.000034
2025-04-12 21:02:32,441 - INFO - Epoch: 0.21 | loss: 87.412400 | grad_norm: 462.858002 | learning_rate: 0.000035
2025-04-12 21:02:44,684 - INFO - Epoch: 0.22 | loss: 91.242400 | grad_norm: 554.746094 | learning_rate: 0.000035
2025-04-12 21:02:56,598 - INFO - Epoch: 0.22 | loss: 86.886700 | grad_norm: 453.435211 | learning_rate: 0.000036
2025-04-12 21:03:08,732 - INFO - Epoch: 0.22 | loss: 87.758600 | grad_norm: 422.483185 | learning_rate: 0.000037
2025-04-12 21:03:20,683 - INFO - Epoch: 0.23 | loss: 94.370200 | grad_norm: 538.400574 | learning_rate: 0.000037
2025-04-12 21:03:32,621 - INFO - Epoch: 0.23 | loss: 92.584900 | grad_norm: 463.687347 | learning_rate: 0.000038
2025-04-12 21:03:44,582 - INFO - Epoch: 0.24 | loss: 89.707800 | grad_norm: 436.349640 | learning_rate: 0.000038
2025-04-12 21:03:56,481 - INFO - Epoch: 0.24 | loss: 92.473500 | grad_norm: 970.165710 | learning_rate: 0.000039
2025-04-12 21:04:08,798 - INFO - Epoch: 0.24 | loss: 89.280000 | grad_norm: 402.409729 | learning_rate: 0.000040
2025-04-12 21:04:21,263 - INFO - Epoch: 0.25 | loss: 87.742800 | grad_norm: 456.467957 | learning_rate: 0.000040
2025-04-12 21:04:33,499 - INFO - Epoch: 0.25 | loss: 88.734300 | grad_norm: 568.994873 | learning_rate: 0.000041
2025-04-12 21:04:44,951 - INFO - Epoch: 0.25 | loss: 92.930500 | grad_norm: 526.242126 | learning_rate: 0.000042
2025-04-12 21:04:56,763 - INFO - Epoch: 0.26 | loss: 87.899100 | grad_norm: 606.632996 | learning_rate: 0.000042
2025-04-12 21:05:08,509 - INFO - Epoch: 0.26 | loss: 85.687600 | grad_norm: 477.419769 | learning_rate: 0.000043
2025-04-12 21:05:20,648 - INFO - Epoch: 0.26 | loss: 91.585300 | grad_norm: 306.944092 | learning_rate: 0.000043
2025-04-12 21:05:33,045 - INFO - Epoch: 0.27 | loss: 83.816200 | grad_norm: 566.232788 | learning_rate: 0.000044
2025-04-12 21:05:45,339 - INFO - Epoch: 0.27 | loss: 91.981000 | grad_norm: 451.804688 | learning_rate: 0.000045
2025-04-12 21:05:56,822 - INFO - Epoch: 0.28 | loss: 86.052500 | grad_norm: 359.937134 | learning_rate: 0.000045
2025-04-12 21:06:08,820 - INFO - Epoch: 0.28 | loss: 90.880400 | grad_norm: 373.839233 | learning_rate: 0.000046
2025-04-12 21:06:20,113 - INFO - Epoch: 0.28 | loss: 87.244100 | grad_norm: 562.006470 | learning_rate: 0.000047
2025-04-12 21:06:32,164 - INFO - Epoch: 0.29 | loss: 93.498700 | grad_norm: 599.414429 | learning_rate: 0.000047
2025-04-12 21:06:43,825 - INFO - Epoch: 0.29 | loss: 91.034900 | grad_norm: 470.684448 | learning_rate: 0.000048
2025-04-12 21:06:55,263 - INFO - Epoch: 0.29 | loss: 85.454500 | grad_norm: 651.895325 | learning_rate: 0.000048
2025-04-12 21:07:07,513 - INFO - Epoch: 0.30 | loss: 89.160200 | grad_norm: 698.158569 | learning_rate: 0.000049
2025-04-12 21:07:18,767 - INFO - Epoch: 0.30 | loss: 86.710800 | grad_norm: 745.513306 | learning_rate: 0.000050
2025-04-12 21:07:30,442 - INFO - Epoch: 0.31 | loss: 88.666500 | grad_norm: 340.210114 | learning_rate: 0.000050
2025-04-12 21:07:42,022 - INFO - Epoch: 0.31 | loss: 79.902400 | grad_norm: 391.286591 | learning_rate: 0.000050
2025-04-12 21:07:53,328 - INFO - Epoch: 0.31 | loss: 88.168900 | grad_norm: 370.044830 | learning_rate: 0.000050
2025-04-12 21:08:04,994 - INFO - Epoch: 0.32 | loss: 85.149400 | grad_norm: 371.244843 | learning_rate: 0.000050
2025-04-12 21:08:16,590 - INFO - Epoch: 0.32 | loss: 88.459100 | grad_norm: 468.598755 | learning_rate: 0.000050
2025-04-12 21:08:27,886 - INFO - Epoch: 0.32 | loss: 90.036600 | grad_norm: 663.064148 | learning_rate: 0.000050
2025-04-12 21:08:39,106 - INFO - Epoch: 0.33 | loss: 86.968300 | grad_norm: 430.597717 | learning_rate: 0.000050
2025-04-12 21:08:50,865 - INFO - Epoch: 0.33 | loss: 88.307200 | grad_norm: 579.612732 | learning_rate: 0.000049
2025-04-12 21:09:02,555 - INFO - Epoch: 0.34 | loss: 91.052400 | grad_norm: 494.999786 | learning_rate: 0.000049
2025-04-12 21:09:14,010 - INFO - Epoch: 0.34 | loss: 87.318900 | grad_norm: 403.026642 | learning_rate: 0.000049
2025-04-12 21:09:25,592 - INFO - Epoch: 0.34 | loss: 93.755100 | grad_norm: 443.999908 | learning_rate: 0.000049
2025-04-12 21:09:37,562 - INFO - Epoch: 0.35 | loss: 93.759500 | grad_norm: 445.755524 | learning_rate: 0.000049
2025-04-12 21:09:49,357 - INFO - Epoch: 0.35 | loss: 83.695900 | grad_norm: 417.542908 | learning_rate: 0.000049
2025-04-12 21:10:00,960 - INFO - Epoch: 0.35 | loss: 88.902100 | grad_norm: 320.532318 | learning_rate: 0.000049
2025-04-12 21:10:12,360 - INFO - Epoch: 0.36 | loss: 84.243400 | grad_norm: 523.413757 | learning_rate: 0.000049
2025-04-12 21:10:24,012 - INFO - Epoch: 0.36 | loss: 88.703300 | grad_norm: 603.334656 | learning_rate: 0.000049
2025-04-12 21:10:35,509 - INFO - Epoch: 0.37 | loss: 87.650000 | grad_norm: 452.242584 | learning_rate: 0.000049
2025-04-12 21:10:47,154 - INFO - Epoch: 0.37 | loss: 94.699800 | grad_norm: 439.932953 | learning_rate: 0.000049
2025-04-12 21:10:58,510 - INFO - Epoch: 0.37 | loss: 85.117600 | grad_norm: 345.775208 | learning_rate: 0.000049
2025-04-12 21:11:10,185 - INFO - Epoch: 0.38 | loss: 87.375700 | grad_norm: 291.481659 | learning_rate: 0.000049
2025-04-12 21:11:21,903 - INFO - Epoch: 0.38 | loss: 83.949500 | grad_norm: 335.644562 | learning_rate: 0.000049
2025-04-12 21:11:33,649 - INFO - Epoch: 0.38 | loss: 86.778000 | grad_norm: 370.187408 | learning_rate: 0.000049
2025-04-12 21:11:45,174 - INFO - Epoch: 0.39 | loss: 83.662400 | grad_norm: 500.588715 | learning_rate: 0.000048
2025-04-12 21:11:56,440 - INFO - Epoch: 0.39 | loss: 85.751000 | grad_norm: 400.294678 | learning_rate: 0.000048
2025-04-12 21:12:08,361 - INFO - Epoch: 0.40 | loss: 86.103300 | grad_norm: 435.724091 | learning_rate: 0.000048
2025-04-12 21:12:19,957 - INFO - Epoch: 0.40 | loss: 85.571600 | grad_norm: 487.931793 | learning_rate: 0.000048
2025-04-12 21:12:32,066 - INFO - Epoch: 0.40 | loss: 86.598000 | grad_norm: 383.946930 | learning_rate: 0.000048
2025-04-12 21:12:43,895 - INFO - Epoch: 0.41 | loss: 82.008200 | grad_norm: 505.648712 | learning_rate: 0.000048
2025-04-12 21:12:56,621 - INFO - Epoch: 0.41 | loss: 93.589000 | grad_norm: 519.986450 | learning_rate: 0.000048
2025-04-12 21:13:08,635 - INFO - Epoch: 0.41 | loss: 89.150700 | grad_norm: 346.984406 | learning_rate: 0.000048
2025-04-12 21:13:20,200 - INFO - Epoch: 0.42 | loss: 83.485500 | grad_norm: 468.581573 | learning_rate: 0.000048
2025-04-12 21:13:31,494 - INFO - Epoch: 0.42 | loss: 83.916100 | grad_norm: 343.562744 | learning_rate: 0.000048
2025-04-12 21:13:43,292 - INFO - Epoch: 0.43 | loss: 83.251400 | grad_norm: 485.366516 | learning_rate: 0.000048
2025-04-12 21:13:55,863 - INFO - Epoch: 0.43 | loss: 82.606500 | grad_norm: 244.992050 | learning_rate: 0.000048
2025-04-12 21:14:07,514 - INFO - Epoch: 0.43 | loss: 82.304000 | grad_norm: 446.091248 | learning_rate: 0.000048
2025-04-12 21:14:19,392 - INFO - Epoch: 0.44 | loss: 83.247200 | grad_norm: 288.390167 | learning_rate: 0.000048
2025-04-12 21:14:30,970 - INFO - Epoch: 0.44 | loss: 85.684700 | grad_norm: 518.937195 | learning_rate: 0.000047
2025-04-12 21:14:42,456 - INFO - Epoch: 0.44 | loss: 82.671000 | grad_norm: 449.463654 | learning_rate: 0.000047
2025-04-12 21:14:53,954 - INFO - Epoch: 0.45 | loss: 80.860600 | grad_norm: 310.972626 | learning_rate: 0.000047
2025-04-12 21:15:06,248 - INFO - Epoch: 0.45 | loss: 86.278600 | grad_norm: 524.777954 | learning_rate: 0.000047
2025-04-12 21:15:17,419 - INFO - Epoch: 0.46 | loss: 85.075600 | grad_norm: 378.205536 | learning_rate: 0.000047
2025-04-12 21:15:29,167 - INFO - Epoch: 0.46 | loss: 83.219900 | grad_norm: 380.594971 | learning_rate: 0.000047
2025-04-12 21:15:41,051 - INFO - Epoch: 0.46 | loss: 86.794300 | grad_norm: 318.632385 | learning_rate: 0.000047
2025-04-12 21:15:52,166 - INFO - Epoch: 0.47 | loss: 86.319300 | grad_norm: 405.576569 | learning_rate: 0.000047
2025-04-12 21:16:03,058 - INFO - Epoch: 0.47 | loss: 83.391100 | grad_norm: 366.448547 | learning_rate: 0.000047
2025-04-12 21:16:14,689 - INFO - Epoch: 0.47 | loss: 87.462800 | grad_norm: 263.964783 | learning_rate: 0.000047
2025-04-12 21:16:26,315 - INFO - Epoch: 0.48 | loss: 83.135500 | grad_norm: 305.897705 | learning_rate: 0.000047
2025-04-12 21:16:38,354 - INFO - Epoch: 0.48 | loss: 89.925000 | grad_norm: 554.678101 | learning_rate: 0.000047
2025-04-12 21:16:49,456 - INFO - Epoch: 0.48 | loss: 86.487900 | grad_norm: 414.621063 | learning_rate: 0.000047
2025-04-12 21:17:01,556 - INFO - Epoch: 0.49 | loss: 86.563500 | grad_norm: 379.544739 | learning_rate: 0.000047
2025-04-12 21:17:12,886 - INFO - Epoch: 0.49 | loss: 84.295400 | grad_norm: 322.475128 | learning_rate: 0.000047
2025-04-12 21:17:25,308 - INFO - Epoch: 0.50 | loss: 93.187800 | grad_norm: 680.990479 | learning_rate: 0.000046
2025-04-12 21:17:37,033 - INFO - Epoch: 0.50 | loss: 87.496400 | grad_norm: 449.331970 | learning_rate: 0.000046
2025-04-12 21:17:48,305 - INFO - Epoch: 0.50 | loss: 80.543500 | grad_norm: 316.996155 | learning_rate: 0.000046
2025-04-12 21:17:59,260 - INFO - Epoch: 0.51 | loss: 83.683900 | grad_norm: 374.941467 | learning_rate: 0.000046
2025-04-12 21:18:11,353 - INFO - Epoch: 0.51 | loss: 83.305900 | grad_norm: 279.404968 | learning_rate: 0.000046
2025-04-12 21:18:22,856 - INFO - Epoch: 0.51 | loss: 82.135900 | grad_norm: 371.506134 | learning_rate: 0.000046
2025-04-12 21:18:34,553 - INFO - Epoch: 0.52 | loss: 86.095000 | grad_norm: 327.798492 | learning_rate: 0.000046
2025-04-12 21:18:45,980 - INFO - Epoch: 0.52 | loss: 82.440100 | grad_norm: 334.301880 | learning_rate: 0.000046
2025-04-12 21:18:57,150 - INFO - Epoch: 0.53 | loss: 83.793500 | grad_norm: 482.796539 | learning_rate: 0.000046
2025-04-12 21:19:08,837 - INFO - Epoch: 0.53 | loss: 81.878800 | grad_norm: 523.399109 | learning_rate: 0.000046
2025-04-12 21:19:20,064 - INFO - Epoch: 0.53 | loss: 85.451200 | grad_norm: 299.227112 | learning_rate: 0.000046
2025-04-12 21:19:31,855 - INFO - Epoch: 0.54 | loss: 84.145200 | grad_norm: 414.790894 | learning_rate: 0.000046
2025-04-12 21:19:43,083 - INFO - Epoch: 0.54 | loss: 82.525800 | grad_norm: 347.845001 | learning_rate: 0.000046
2025-04-12 21:19:54,492 - INFO - Epoch: 0.54 | loss: 81.667600 | grad_norm: 362.654938 | learning_rate: 0.000046
2025-04-12 21:20:05,706 - INFO - Epoch: 0.55 | loss: 86.676900 | grad_norm: 292.289856 | learning_rate: 0.000045
2025-04-12 21:20:17,379 - INFO - Epoch: 0.55 | loss: 88.034500 | grad_norm: 326.400360 | learning_rate: 0.000045
2025-04-12 21:20:29,105 - INFO - Epoch: 0.56 | loss: 85.605000 | grad_norm: 592.763733 | learning_rate: 0.000045
2025-04-12 21:20:40,429 - INFO - Epoch: 0.56 | loss: 83.187500 | grad_norm: 341.732422 | learning_rate: 0.000045
2025-04-12 21:20:51,912 - INFO - Epoch: 0.56 | loss: 85.751600 | grad_norm: 386.550507 | learning_rate: 0.000045
2025-04-12 21:21:03,488 - INFO - Epoch: 0.57 | loss: 85.755700 | grad_norm: 283.109863 | learning_rate: 0.000045
2025-04-12 21:21:15,478 - INFO - Epoch: 0.57 | loss: 86.971600 | grad_norm: 390.052734 | learning_rate: 0.000045
2025-04-12 21:21:26,949 - INFO - Epoch: 0.57 | loss: 86.629200 | grad_norm: 452.304077 | learning_rate: 0.000045
2025-04-12 21:21:38,468 - INFO - Epoch: 0.58 | loss: 89.178300 | grad_norm: 535.787170 | learning_rate: 0.000045
2025-04-12 21:21:49,923 - INFO - Epoch: 0.58 | loss: 82.287300 | grad_norm: 240.369110 | learning_rate: 0.000045
2025-04-12 21:22:01,618 - INFO - Epoch: 0.59 | loss: 90.165000 | grad_norm: 311.782776 | learning_rate: 0.000045
2025-04-12 21:22:13,148 - INFO - Epoch: 0.59 | loss: 86.320500 | grad_norm: 441.645447 | learning_rate: 0.000045
2025-04-12 21:22:24,302 - INFO - Epoch: 0.59 | loss: 84.681900 | grad_norm: 336.676483 | learning_rate: 0.000045
2025-04-12 21:22:35,711 - INFO - Epoch: 0.60 | loss: 80.817800 | grad_norm: 478.338684 | learning_rate: 0.000045
2025-04-12 21:22:46,484 - INFO - Epoch: 0.60 | loss: 79.103500 | grad_norm: 394.435028 | learning_rate: 0.000045
2025-04-12 21:22:57,698 - INFO - Epoch: 0.60 | loss: 80.048200 | grad_norm: 359.313507 | learning_rate: 0.000044
2025-04-12 21:23:09,137 - INFO - Epoch: 0.61 | loss: 85.419000 | grad_norm: 280.693787 | learning_rate: 0.000044
2025-04-12 21:23:20,814 - INFO - Epoch: 0.61 | loss: 88.021800 | grad_norm: 440.592224 | learning_rate: 0.000044
2025-04-12 21:23:32,429 - INFO - Epoch: 0.62 | loss: 84.518600 | grad_norm: 358.981873 | learning_rate: 0.000044
2025-04-12 21:23:44,856 - INFO - Epoch: 0.62 | loss: 89.179100 | grad_norm: 387.180298 | learning_rate: 0.000044
2025-04-12 21:23:56,700 - INFO - Epoch: 0.62 | loss: 88.454700 | grad_norm: 337.090759 | learning_rate: 0.000044
2025-04-12 21:24:07,784 - INFO - Epoch: 0.63 | loss: 81.382300 | grad_norm: 309.993286 | learning_rate: 0.000044
2025-04-12 21:24:19,809 - INFO - Epoch: 0.63 | loss: 86.125300 | grad_norm: 465.714600 | learning_rate: 0.000044
2025-04-12 21:24:31,129 - INFO - Epoch: 0.63 | loss: 86.706600 | grad_norm: 317.840240 | learning_rate: 0.000044
2025-04-12 21:24:42,291 - INFO - Epoch: 0.64 | loss: 82.277700 | grad_norm: 370.040527 | learning_rate: 0.000044
2025-04-12 21:24:53,962 - INFO - Epoch: 0.64 | loss: 85.439200 | grad_norm: 315.097473 | learning_rate: 0.000044
2025-04-12 21:25:05,767 - INFO - Epoch: 0.65 | loss: 82.660500 | grad_norm: 508.029022 | learning_rate: 0.000044
2025-04-12 21:25:18,279 - INFO - Epoch: 0.65 | loss: 90.989700 | grad_norm: 642.894165 | learning_rate: 0.000044
2025-04-12 21:25:29,833 - INFO - Epoch: 0.65 | loss: 86.947000 | grad_norm: 355.888306 | learning_rate: 0.000044
2025-04-12 21:25:41,118 - INFO - Epoch: 0.66 | loss: 82.582200 | grad_norm: 368.342773 | learning_rate: 0.000043
2025-04-12 21:25:52,442 - INFO - Epoch: 0.66 | loss: 86.712600 | grad_norm: 246.470413 | learning_rate: 0.000043
2025-04-12 21:26:04,237 - INFO - Epoch: 0.66 | loss: 84.766100 | grad_norm: 258.315948 | learning_rate: 0.000043
2025-04-12 21:26:15,256 - INFO - Epoch: 0.67 | loss: 79.634400 | grad_norm: 312.999359 | learning_rate: 0.000043
2025-04-12 21:26:26,896 - INFO - Epoch: 0.67 | loss: 81.031600 | grad_norm: 351.178955 | learning_rate: 0.000043
2025-04-12 21:26:38,696 - INFO - Epoch: 0.68 | loss: 83.503800 | grad_norm: 386.027771 | learning_rate: 0.000043
2025-04-12 21:26:50,189 - INFO - Epoch: 0.68 | loss: 88.228700 | grad_norm: 501.446381 | learning_rate: 0.000043
2025-04-12 21:27:01,796 - INFO - Epoch: 0.68 | loss: 81.482500 | grad_norm: 297.375549 | learning_rate: 0.000043
2025-04-12 21:27:13,593 - INFO - Epoch: 0.69 | loss: 81.968400 | grad_norm: 317.555511 | learning_rate: 0.000043
2025-04-12 21:27:25,165 - INFO - Epoch: 0.69 | loss: 79.750900 | grad_norm: 298.031586 | learning_rate: 0.000043
2025-04-12 21:27:36,792 - INFO - Epoch: 0.69 | loss: 87.442500 | grad_norm: 413.829498 | learning_rate: 0.000043
2025-04-12 21:27:48,533 - INFO - Epoch: 0.70 | loss: 89.699200 | grad_norm: 366.599792 | learning_rate: 0.000043
2025-04-12 21:28:00,689 - INFO - Epoch: 0.70 | loss: 86.404700 | grad_norm: 322.274811 | learning_rate: 0.000043
2025-04-12 21:28:12,702 - INFO - Epoch: 0.71 | loss: 87.038300 | grad_norm: 298.418152 | learning_rate: 0.000043
2025-04-12 21:28:24,148 - INFO - Epoch: 0.71 | loss: 83.801200 | grad_norm: 363.062408 | learning_rate: 0.000043
2025-04-12 21:28:36,650 - INFO - Epoch: 0.71 | loss: 88.091100 | grad_norm: 507.606903 | learning_rate: 0.000042
2025-04-12 21:28:48,777 - INFO - Epoch: 0.72 | loss: 83.284400 | grad_norm: 458.295044 | learning_rate: 0.000042
2025-04-12 21:29:01,278 - INFO - Epoch: 0.72 | loss: 84.165600 | grad_norm: 309.917389 | learning_rate: 0.000042
2025-04-12 21:29:13,638 - INFO - Epoch: 0.72 | loss: 85.086600 | grad_norm: 477.084778 | learning_rate: 0.000042
2025-04-12 21:29:25,026 - INFO - Epoch: 0.73 | loss: 85.031200 | grad_norm: 411.847565 | learning_rate: 0.000042
2025-04-12 21:29:36,258 - INFO - Epoch: 0.73 | loss: 81.799800 | grad_norm: 298.428711 | learning_rate: 0.000042
2025-04-12 21:29:48,213 - INFO - Epoch: 0.73 | loss: 86.441700 | grad_norm: 334.375732 | learning_rate: 0.000042
2025-04-12 21:29:59,806 - INFO - Epoch: 0.74 | loss: 84.081900 | grad_norm: 430.224731 | learning_rate: 0.000042
2025-04-12 21:30:11,607 - INFO - Epoch: 0.74 | loss: 88.007600 | grad_norm: 479.375427 | learning_rate: 0.000042
2025-04-12 21:30:23,203 - INFO - Epoch: 0.75 | loss: 83.245800 | grad_norm: 369.637115 | learning_rate: 0.000042
2025-04-12 21:30:35,873 - INFO - Epoch: 0.75 | loss: 92.758800 | grad_norm: 529.764038 | learning_rate: 0.000042
2025-04-12 21:30:47,462 - INFO - Epoch: 0.75 | loss: 86.687300 | grad_norm: 981.093018 | learning_rate: 0.000042
2025-04-12 21:30:58,877 - INFO - Epoch: 0.76 | loss: 76.723800 | grad_norm: 310.074799 | learning_rate: 0.000042
2025-04-12 21:31:10,435 - INFO - Epoch: 0.76 | loss: 81.391500 | grad_norm: 347.093048 | learning_rate: 0.000042
2025-04-12 21:31:21,812 - INFO - Epoch: 0.76 | loss: 85.538800 | grad_norm: 302.792358 | learning_rate: 0.000041
2025-04-12 21:31:33,754 - INFO - Epoch: 0.77 | loss: 86.219000 | grad_norm: 277.100677 | learning_rate: 0.000041
2025-04-12 21:31:45,855 - INFO - Epoch: 0.77 | loss: 86.613100 | grad_norm: 425.933105 | learning_rate: 0.000041
2025-04-12 21:31:57,678 - INFO - Epoch: 0.78 | loss: 84.868800 | grad_norm: 261.869537 | learning_rate: 0.000041
2025-04-12 21:32:09,132 - INFO - Epoch: 0.78 | loss: 83.029500 | grad_norm: 514.024109 | learning_rate: 0.000041
2025-04-12 21:32:20,748 - INFO - Epoch: 0.78 | loss: 83.691700 | grad_norm: 474.847260 | learning_rate: 0.000041
2025-04-12 21:32:32,678 - INFO - Epoch: 0.79 | loss: 86.477200 | grad_norm: 407.218536 | learning_rate: 0.000041
2025-04-12 21:32:44,330 - INFO - Epoch: 0.79 | loss: 82.192600 | grad_norm: 247.488831 | learning_rate: 0.000041
2025-04-12 21:32:56,032 - INFO - Epoch: 0.79 | loss: 85.336700 | grad_norm: 286.608185 | learning_rate: 0.000041
2025-04-12 21:33:07,975 - INFO - Epoch: 0.80 | loss: 85.948500 | grad_norm: 384.476776 | learning_rate: 0.000041
2025-04-12 21:33:19,243 - INFO - Epoch: 0.80 | loss: 81.554100 | grad_norm: 338.981323 | learning_rate: 0.000041
2025-04-12 21:33:31,049 - INFO - Epoch: 0.81 | loss: 84.436000 | grad_norm: 350.162170 | learning_rate: 0.000041
2025-04-12 21:33:42,895 - INFO - Epoch: 0.81 | loss: 86.664300 | grad_norm: 594.542175 | learning_rate: 0.000041
2025-04-12 21:33:54,251 - INFO - Epoch: 0.81 | loss: 82.979000 | grad_norm: 490.919403 | learning_rate: 0.000041
2025-04-12 21:34:06,023 - INFO - Epoch: 0.82 | loss: 90.022400 | grad_norm: 346.740601 | learning_rate: 0.000040
2025-04-12 21:34:17,112 - INFO - Epoch: 0.82 | loss: 80.897900 | grad_norm: 275.891785 | learning_rate: 0.000040
2025-04-12 21:34:28,603 - INFO - Epoch: 0.82 | loss: 85.492400 | grad_norm: 310.244751 | learning_rate: 0.000040
2025-04-12 21:34:40,438 - INFO - Epoch: 0.83 | loss: 83.657900 | grad_norm: 345.752228 | learning_rate: 0.000040
2025-04-12 21:34:51,790 - INFO - Epoch: 0.83 | loss: 83.805800 | grad_norm: 334.456879 | learning_rate: 0.000040
2025-04-12 21:35:03,395 - INFO - Epoch: 0.84 | loss: 81.418300 | grad_norm: 290.777008 | learning_rate: 0.000040
2025-04-12 21:35:14,460 - INFO - Epoch: 0.84 | loss: 80.627400 | grad_norm: 503.247253 | learning_rate: 0.000040
2025-04-12 21:35:25,692 - INFO - Epoch: 0.84 | loss: 83.325300 | grad_norm: 518.742981 | learning_rate: 0.000040
2025-04-12
21:35:37,867 - INFO - Epoch: 0.85 | loss: 84.599500 | grad_norm: 477.137909 | learning_rate: 0.000040 2025-04-12 21:35:49,386 - INFO - Epoch: 0.85 | loss: 81.855400 | grad_norm: 388.494598 | learning_rate: 0.000040 2025-04-12 21:36:01,239 - INFO - Epoch: 0.85 | loss: 84.175300 | grad_norm: 351.807678 | learning_rate: 0.000040 2025-04-12 21:36:13,279 - INFO - Epoch: 0.86 | loss: 83.888300 | grad_norm: 470.961029 | learning_rate: 0.000040 2025-04-12 21:36:25,157 - INFO - Epoch: 0.86 | loss: 83.559400 | grad_norm: 305.357086 | learning_rate: 0.000040 2025-04-12 21:36:36,269 - INFO - Epoch: 0.87 | loss: 81.678400 | grad_norm: 241.142548 | learning_rate: 0.000040 2025-04-12 21:36:47,810 - INFO - Epoch: 0.87 | loss: 84.602100 | grad_norm: 368.214478 | learning_rate: 0.000040 2025-04-12 21:36:59,827 - INFO - Epoch: 0.87 | loss: 82.214100 | grad_norm: 266.934906 | learning_rate: 0.000039 2025-04-12 21:37:11,178 - INFO - Epoch: 0.88 | loss: 77.723400 | grad_norm: 265.051270 | learning_rate: 0.000039 2025-04-12 21:37:22,919 - INFO - Epoch: 0.88 | loss: 87.469200 | grad_norm: 327.754486 | learning_rate: 0.000039 2025-04-12 21:37:34,348 - INFO - Epoch: 0.88 | loss: 78.430500 | grad_norm: 285.412537 | learning_rate: 0.000039 2025-04-12 21:37:46,273 - INFO - Epoch: 0.89 | loss: 84.798900 | grad_norm: 234.320282 | learning_rate: 0.000039 2025-04-12 21:37:57,128 - INFO - Epoch: 0.89 | loss: 82.485600 | grad_norm: 426.679871 | learning_rate: 0.000039 2025-04-12 21:38:08,887 - INFO - Epoch: 0.90 | loss: 84.873200 | grad_norm: 309.396545 | learning_rate: 0.000039 2025-04-12 21:38:20,267 - INFO - Epoch: 0.90 | loss: 81.328400 | grad_norm: 409.283295 | learning_rate: 0.000039 2025-04-12 21:38:31,803 - INFO - Epoch: 0.90 | loss: 86.752800 | grad_norm: 431.125122 | learning_rate: 0.000039 2025-04-12 21:38:43,182 - INFO - Epoch: 0.91 | loss: 85.960100 | grad_norm: 515.805298 | learning_rate: 0.000039 2025-04-12 21:38:55,030 - INFO - Epoch: 0.91 | loss: 82.244300 | grad_norm: 388.362579 | 
learning_rate: 0.000039 2025-04-12 21:39:06,904 - INFO - Epoch: 0.91 | loss: 82.821000 | grad_norm: 322.640991 | learning_rate: 0.000039 2025-04-12 21:39:19,431 - INFO - Epoch: 0.92 | loss: 90.290200 | grad_norm: 470.673157 | learning_rate: 0.000039 2025-04-12 21:39:30,475 - INFO - Epoch: 0.92 | loss: 78.494000 | grad_norm: 457.619019 | learning_rate: 0.000039 2025-04-12 21:39:42,075 - INFO - Epoch: 0.93 | loss: 83.897800 | grad_norm: 284.556244 | learning_rate: 0.000038 2025-04-12 21:39:53,298 - INFO - Epoch: 0.93 | loss: 78.107000 | grad_norm: 337.660614 | learning_rate: 0.000038 2025-04-12 21:40:04,865 - INFO - Epoch: 0.93 | loss: 84.834200 | grad_norm: 232.437805 | learning_rate: 0.000038 2025-04-12 21:40:16,443 - INFO - Epoch: 0.94 | loss: 83.345500 | grad_norm: 401.678680 | learning_rate: 0.000038 2025-04-12 21:40:28,191 - INFO - Epoch: 0.94 | loss: 82.176000 | grad_norm: 317.066681 | learning_rate: 0.000038 2025-04-12 21:40:40,428 - INFO - Epoch: 0.94 | loss: 90.008700 | grad_norm: 326.353638 | learning_rate: 0.000038 2025-04-12 21:40:52,088 - INFO - Epoch: 0.95 | loss: 84.076900 | grad_norm: 418.571014 | learning_rate: 0.000038 2025-04-12 21:41:04,405 - INFO - Epoch: 0.95 | loss: 83.836400 | grad_norm: 258.427368 | learning_rate: 0.000038 2025-04-12 21:41:16,227 - INFO - Epoch: 0.95 | loss: 84.708200 | grad_norm: 378.643616 | learning_rate: 0.000038 2025-04-12 21:41:28,618 - INFO - Epoch: 0.96 | loss: 88.368900 | grad_norm: 341.687622 | learning_rate: 0.000038 2025-04-12 21:41:40,130 - INFO - Epoch: 0.96 | loss: 84.150000 | grad_norm: 325.708557 | learning_rate: 0.000038 2025-04-12 21:41:51,563 - INFO - Epoch: 0.97 | loss: 80.679500 | grad_norm: 301.465851 | learning_rate: 0.000038 2025-04-12 21:42:03,284 - INFO - Epoch: 0.97 | loss: 77.728800 | grad_norm: 235.403198 | learning_rate: 0.000038 2025-04-12 21:42:14,871 - INFO - Epoch: 0.97 | loss: 83.717400 | grad_norm: 370.915497 | learning_rate: 0.000038 2025-04-12 21:42:26,733 - INFO - Epoch: 0.98 | loss: 
87.073900 | grad_norm: 297.008270 | learning_rate: 0.000038 2025-04-12 21:42:37,945 - INFO - Epoch: 0.98 | loss: 79.630000 | grad_norm: 352.678040 | learning_rate: 0.000037 2025-04-12 21:42:49,265 - INFO - Epoch: 0.98 | loss: 81.864000 | grad_norm: 333.644257 | learning_rate: 0.000037 2025-04-12 21:43:00,472 - INFO - Epoch: 0.99 | loss: 84.977200 | grad_norm: 862.600830 | learning_rate: 0.000037 2025-04-12 21:43:12,529 - INFO - Epoch: 0.99 | loss: 86.738000 | grad_norm: 556.848328 | learning_rate: 0.000037 2025-04-12 21:43:24,497 - INFO - Epoch: 1.00 | loss: 85.336300 | grad_norm: 429.007904 | learning_rate: 0.000037 2025-04-12 21:43:35,947 - INFO - Epoch: 1.00 | loss: 82.306600 | grad_norm: 559.138123 | learning_rate: 0.000037 2025-04-12 21:43:52,522 - INFO - Epoch: 1.00 | loss: 90.190300 | grad_norm: 730.933838 | learning_rate: 0.000037 2025-04-12 21:44:04,355 - INFO - Epoch: 1.01 | loss: 82.465900 | grad_norm: 265.230652 | learning_rate: 0.000037 2025-04-12 21:44:16,229 - INFO - Epoch: 1.01 | loss: 81.129700 | grad_norm: 247.438614 | learning_rate: 0.000037 2025-04-12 21:44:27,202 - INFO - Epoch: 1.01 | loss: 80.839600 | grad_norm: 351.019531 | learning_rate: 0.000037 2025-04-12 21:44:39,403 - INFO - Epoch: 1.02 | loss: 81.782000 | grad_norm: 503.970490 | learning_rate: 0.000037 2025-04-12 21:44:51,128 - INFO - Epoch: 1.02 | loss: 89.889800 | grad_norm: 344.248688 | learning_rate: 0.000037 2025-04-12 21:45:02,677 - INFO - Epoch: 1.03 | loss: 85.911500 | grad_norm: 271.477142 | learning_rate: 0.000037 2025-04-12 21:45:13,970 - INFO - Epoch: 1.03 | loss: 80.687800 | grad_norm: 253.685837 | learning_rate: 0.000037 2025-04-12 21:45:25,595 - INFO - Epoch: 1.03 | loss: 80.872700 | grad_norm: 338.018555 | learning_rate: 0.000036 2025-04-12 21:45:36,540 - INFO - Epoch: 1.04 | loss: 81.014700 | grad_norm: 285.903442 | learning_rate: 0.000036 2025-04-12 21:45:48,594 - INFO - Epoch: 1.04 | loss: 81.877500 | grad_norm: 335.563232 | learning_rate: 0.000036 2025-04-12 
21:45:59,710 - INFO - Epoch: 1.04 | loss: 80.584900 | grad_norm: 215.241837 | learning_rate: 0.000036 2025-04-12 21:46:11,290 - INFO - Epoch: 1.05 | loss: 79.658400 | grad_norm: 301.758698 | learning_rate: 0.000036 2025-04-12 21:46:22,642 - INFO - Epoch: 1.05 | loss: 84.976500 | grad_norm: 520.115601 | learning_rate: 0.000036 2025-04-12 21:46:34,083 - INFO - Epoch: 1.06 | loss: 75.995900 | grad_norm: 286.590607 | learning_rate: 0.000036 2025-04-12 21:46:45,205 - INFO - Epoch: 1.06 | loss: 78.403200 | grad_norm: 248.347702 | learning_rate: 0.000036 2025-04-12 21:46:56,415 - INFO - Epoch: 1.06 | loss: 82.212800 | grad_norm: 242.351532 | learning_rate: 0.000036 2025-04-12 21:47:07,998 - INFO - Epoch: 1.07 | loss: 84.298000 | grad_norm: 351.903320 | learning_rate: 0.000036 2025-04-12 21:47:19,303 - INFO - Epoch: 1.07 | loss: 80.216400 | grad_norm: 334.192993 | learning_rate: 0.000036 2025-04-12 21:47:30,696 - INFO - Epoch: 1.07 | loss: 86.069400 | grad_norm: 318.730377 | learning_rate: 0.000036 2025-04-12 21:47:42,736 - INFO - Epoch: 1.08 | loss: 88.962200 | grad_norm: 484.591492 | learning_rate: 0.000036 2025-04-12 21:47:54,629 - INFO - Epoch: 1.08 | loss: 84.115700 | grad_norm: 360.286377 | learning_rate: 0.000036 2025-04-12 21:48:05,990 - INFO - Epoch: 1.09 | loss: 82.459600 | grad_norm: 259.918976 | learning_rate: 0.000036 2025-04-12 21:48:17,800 - INFO - Epoch: 1.09 | loss: 79.247600 | grad_norm: 223.474365 | learning_rate: 0.000035 2025-04-12 21:48:29,296 - INFO - Epoch: 1.09 | loss: 80.719500 | grad_norm: 258.636261 | learning_rate: 0.000035 2025-04-12 21:48:40,987 - INFO - Epoch: 1.10 | loss: 81.552600 | grad_norm: 729.318970 | learning_rate: 0.000035 2025-04-12 21:48:52,160 - INFO - Epoch: 1.10 | loss: 80.067600 | grad_norm: 278.761902 | learning_rate: 0.000035 2025-04-12 21:49:03,832 - INFO - Epoch: 1.10 | loss: 83.329000 | grad_norm: 257.354004 | learning_rate: 0.000035 2025-04-12 21:49:15,711 - INFO - Epoch: 1.11 | loss: 83.982200 | grad_norm: 378.195251 | 
learning_rate: 0.000035 2025-04-12 21:49:27,843 - INFO - Epoch: 1.11 | loss: 82.396400 | grad_norm: 430.658630 | learning_rate: 0.000035 2025-04-12 21:49:40,198 - INFO - Epoch: 1.12 | loss: 86.062400 | grad_norm: 405.381378 | learning_rate: 0.000035 2025-04-12 21:49:51,638 - INFO - Epoch: 1.12 | loss: 78.149800 | grad_norm: 371.806976 | learning_rate: 0.000035 2025-04-12 21:50:03,304 - INFO - Epoch: 1.12 | loss: 83.149800 | grad_norm: 393.441437 | learning_rate: 0.000035 2025-04-12 21:50:14,524 - INFO - Epoch: 1.13 | loss: 79.609600 | grad_norm: 214.252640 | learning_rate: 0.000035 2025-04-12 21:50:25,479 - INFO - Epoch: 1.13 | loss: 82.575900 | grad_norm: 299.358154 | learning_rate: 0.000035 2025-04-12 21:50:37,161 - INFO - Epoch: 1.13 | loss: 82.774300 | grad_norm: 386.437408 | learning_rate: 0.000035 2025-04-12 21:50:48,187 - INFO - Epoch: 1.14 | loss: 79.264900 | grad_norm: 496.330017 | learning_rate: 0.000035 2025-04-12 21:50:59,388 - INFO - Epoch: 1.14 | loss: 84.739200 | grad_norm: 328.490875 | learning_rate: 0.000034 2025-04-12 21:51:10,803 - INFO - Epoch: 1.15 | loss: 83.758000 | grad_norm: 322.535645 | learning_rate: 0.000034 2025-04-12 21:51:22,488 - INFO - Epoch: 1.15 | loss: 82.036600 | grad_norm: 416.086334 | learning_rate: 0.000034 2025-04-12 21:51:34,358 - INFO - Epoch: 1.15 | loss: 81.466100 | grad_norm: 231.580948 | learning_rate: 0.000034 2025-04-12 21:51:45,849 - INFO - Epoch: 1.16 | loss: 83.403000 | grad_norm: 198.433578 | learning_rate: 0.000034 2025-04-12 21:51:57,583 - INFO - Epoch: 1.16 | loss: 84.402900 | grad_norm: 285.468201 | learning_rate: 0.000034 2025-04-12 21:52:09,138 - INFO - Epoch: 1.16 | loss: 81.564900 | grad_norm: 259.049591 | learning_rate: 0.000034 2025-04-12 21:52:20,979 - INFO - Epoch: 1.17 | loss: 86.162400 | grad_norm: 347.430359 | learning_rate: 0.000034 2025-04-12 21:52:32,764 - INFO - Epoch: 1.17 | loss: 84.065400 | grad_norm: 277.158905 | learning_rate: 0.000034 2025-04-12 21:52:43,691 - INFO - Epoch: 1.18 | loss: 
77.437400 | grad_norm: 287.936218 | learning_rate: 0.000034 2025-04-12 21:52:55,310 - INFO - Epoch: 1.18 | loss: 86.889300 | grad_norm: 318.266998 | learning_rate: 0.000034 2025-04-12 21:53:06,958 - INFO - Epoch: 1.18 | loss: 79.956800 | grad_norm: 430.646881 | learning_rate: 0.000034 2025-04-12 21:53:18,472 - INFO - Epoch: 1.19 | loss: 85.218600 | grad_norm: 525.758850 | learning_rate: 0.000034 2025-04-12 21:53:30,597 - INFO - Epoch: 1.19 | loss: 80.242900 | grad_norm: 199.853912 | learning_rate: 0.000034 2025-04-12 21:53:42,981 - INFO - Epoch: 1.19 | loss: 88.169900 | grad_norm: 652.383667 | learning_rate: 0.000034 2025-04-12 21:53:54,387 - INFO - Epoch: 1.20 | loss: 81.835400 | grad_norm: 458.094147 | learning_rate: 0.000033 2025-04-12 21:54:05,818 - INFO - Epoch: 1.20 | loss: 83.504200 | grad_norm: 248.806412 | learning_rate: 0.000033 2025-04-12 21:54:17,502 - INFO - Epoch: 1.21 | loss: 81.986500 | grad_norm: 313.745544 | learning_rate: 0.000033 2025-04-12 21:54:29,404 - INFO - Epoch: 1.21 | loss: 79.707200 | grad_norm: 307.604279 | learning_rate: 0.000033 2025-04-12 21:54:40,840 - INFO - Epoch: 1.21 | loss: 80.836900 | grad_norm: 336.284149 | learning_rate: 0.000033 2025-04-12 21:54:52,436 - INFO - Epoch: 1.22 | loss: 80.102400 | grad_norm: 360.450684 | learning_rate: 0.000033 2025-04-12 21:55:03,746 - INFO - Epoch: 1.22 | loss: 81.086400 | grad_norm: 318.781464 | learning_rate: 0.000033 2025-04-12 21:55:15,259 - INFO - Epoch: 1.22 | loss: 79.511900 | grad_norm: 307.130066 | learning_rate: 0.000033 2025-04-12 21:55:26,915 - INFO - Epoch: 1.23 | loss: 84.268900 | grad_norm: 640.442078 | learning_rate: 0.000033 2025-04-12 21:55:38,099 - INFO - Epoch: 1.23 | loss: 79.535900 | grad_norm: 763.581055 | learning_rate: 0.000033 2025-04-12 21:55:49,160 - INFO - Epoch: 1.24 | loss: 77.129400 | grad_norm: 306.974762 | learning_rate: 0.000033 2025-04-12 21:56:00,482 - INFO - Epoch: 1.24 | loss: 82.654500 | grad_norm: 233.160431 | learning_rate: 0.000033 2025-04-12 
21:56:12,106 - INFO - Epoch: 1.24 | loss: 86.568900 | grad_norm: 340.776367 | learning_rate: 0.000033 2025-04-12 21:56:24,551 - INFO - Epoch: 1.25 | loss: 83.600000 | grad_norm: 393.064178 | learning_rate: 0.000033 2025-04-12 21:56:37,073 - INFO - Epoch: 1.25 | loss: 86.519900 | grad_norm: 380.783295 | learning_rate: 0.000032 2025-04-12 21:56:48,912 - INFO - Epoch: 1.25 | loss: 79.976200 | grad_norm: 305.309418 | learning_rate: 0.000032 2025-04-12 21:57:00,799 - INFO - Epoch: 1.26 | loss: 85.480400 | grad_norm: 311.907562 | learning_rate: 0.000032 2025-04-12 21:57:12,115 - INFO - Epoch: 1.26 | loss: 85.278700 | grad_norm: 702.885742 | learning_rate: 0.000032 2025-04-12 21:57:23,578 - INFO - Epoch: 1.26 | loss: 82.834300 | grad_norm: 324.077332 | learning_rate: 0.000032 2025-04-12 21:57:35,577 - INFO - Epoch: 1.27 | loss: 84.391000 | grad_norm: 369.678711 | learning_rate: 0.000032 2025-04-12 21:57:47,032 - INFO - Epoch: 1.27 | loss: 84.343600 | grad_norm: 490.179199 | learning_rate: 0.000032 2025-04-12 21:57:59,096 - INFO - Epoch: 1.28 | loss: 81.964700 | grad_norm: 304.366699 | learning_rate: 0.000032 2025-04-12 21:58:11,237 - INFO - Epoch: 1.28 | loss: 88.414400 | grad_norm: 495.350372 | learning_rate: 0.000032 2025-04-12 21:58:22,469 - INFO - Epoch: 1.28 | loss: 78.641000 | grad_norm: 318.310516 | learning_rate: 0.000032 2025-04-12 21:58:33,812 - INFO - Epoch: 1.29 | loss: 81.088700 | grad_norm: 260.545685 | learning_rate: 0.000032 2025-04-12 21:58:45,459 - INFO - Epoch: 1.29 | loss: 82.391300 | grad_norm: 457.306000 | learning_rate: 0.000032 2025-04-12 21:58:57,312 - INFO - Epoch: 1.29 | loss: 86.689300 | grad_norm: 332.737518 | learning_rate: 0.000032 2025-04-12 21:59:08,997 - INFO - Epoch: 1.30 | loss: 83.663600 | grad_norm: 296.600189 | learning_rate: 0.000032 2025-04-12 21:59:20,844 - INFO - Epoch: 1.30 | loss: 78.435600 | grad_norm: 326.376953 | learning_rate: 0.000032 2025-04-12 21:59:31,825 - INFO - Epoch: 1.31 | loss: 80.387400 | grad_norm: 394.076385 | 
learning_rate: 0.000031 2025-04-12 21:59:43,701 - INFO - Epoch: 1.31 | loss: 89.739900 | grad_norm: 238.543655 | learning_rate: 0.000031 2025-04-12 21:59:55,247 - INFO - Epoch: 1.31 | loss: 80.197400 | grad_norm: 333.816040 | learning_rate: 0.000031 2025-04-12 22:00:06,469 - INFO - Epoch: 1.32 | loss: 84.003200 | grad_norm: 362.750977 | learning_rate: 0.000031 2025-04-12 22:00:17,780 - INFO - Epoch: 1.32 | loss: 83.782800 | grad_norm: 220.800400 | learning_rate: 0.000031 2025-04-12 22:00:29,300 - INFO - Epoch: 1.32 | loss: 82.108900 | grad_norm: 299.845856 | learning_rate: 0.000031 2025-04-12 22:00:41,218 - INFO - Epoch: 1.33 | loss: 84.360800 | grad_norm: 346.935883 | learning_rate: 0.000031 2025-04-12 22:00:52,001 - INFO - Epoch: 1.33 | loss: 80.411400 | grad_norm: 306.183533 | learning_rate: 0.000031 2025-04-12 22:01:03,660 - INFO - Epoch: 1.34 | loss: 83.296600 | grad_norm: 389.489105 | learning_rate: 0.000031 2025-04-12 22:01:15,825 - INFO - Epoch: 1.34 | loss: 80.172300 | grad_norm: 309.146667 | learning_rate: 0.000031 2025-04-12 22:01:27,817 - INFO - Epoch: 1.34 | loss: 84.303800 | grad_norm: 282.624054 | learning_rate: 0.000031 2025-04-12 22:01:39,566 - INFO - Epoch: 1.35 | loss: 80.846400 | grad_norm: 277.360779 | learning_rate: 0.000031 2025-04-12 22:01:51,329 - INFO - Epoch: 1.35 | loss: 84.483100 | grad_norm: 307.058807 | learning_rate: 0.000031 2025-04-12 22:02:03,903 - INFO - Epoch: 1.35 | loss: 83.076000 | grad_norm: 282.101105 | learning_rate: 0.000031 2025-04-12 22:02:15,769 - INFO - Epoch: 1.36 | loss: 81.191200 | grad_norm: 216.583481 | learning_rate: 0.000030 2025-04-12 22:02:27,600 - INFO - Epoch: 1.36 | loss: 78.329200 | grad_norm: 236.970398 | learning_rate: 0.000030 2025-04-12 22:02:38,660 - INFO - Epoch: 1.37 | loss: 80.446000 | grad_norm: 279.638885 | learning_rate: 0.000030 2025-04-12 22:02:50,016 - INFO - Epoch: 1.37 | loss: 79.855200 | grad_norm: 448.925476 | learning_rate: 0.000030 2025-04-12 22:03:02,487 - INFO - Epoch: 1.37 | loss: 
84.647400 | grad_norm: 284.413300 | learning_rate: 0.000030 2025-04-12 22:03:13,850 - INFO - Epoch: 1.38 | loss: 83.132300 | grad_norm: 437.404755 | learning_rate: 0.000030 2025-04-12 22:03:25,172 - INFO - Epoch: 1.38 | loss: 78.209500 | grad_norm: 787.231995 | learning_rate: 0.000030 2025-04-12 22:03:37,169 - INFO - Epoch: 1.38 | loss: 85.233400 | grad_norm: 226.830521 | learning_rate: 0.000030 2025-04-12 22:03:48,948 - INFO - Epoch: 1.39 | loss: 81.509700 | grad_norm: 283.424194 | learning_rate: 0.000030 2025-04-12 22:04:00,162 - INFO - Epoch: 1.39 | loss: 76.252700 | grad_norm: 292.114899 | learning_rate: 0.000030 2025-04-12 22:04:11,032 - INFO - Epoch: 1.40 | loss: 80.701400 | grad_norm: 302.084595 | learning_rate: 0.000030 2025-04-12 22:04:22,764 - INFO - Epoch: 1.40 | loss: 81.565800 | grad_norm: 269.066406 | learning_rate: 0.000030 2025-04-12 22:04:34,214 - INFO - Epoch: 1.40 | loss: 83.252800 | grad_norm: 728.813904 | learning_rate: 0.000030 2025-04-12 22:04:46,979 - INFO - Epoch: 1.41 | loss: 86.984600 | grad_norm: 415.766205 | learning_rate: 0.000030 2025-04-12 22:04:58,488 - INFO - Epoch: 1.41 | loss: 82.570300 | grad_norm: 250.676651 | learning_rate: 0.000030 2025-04-12 22:05:10,610 - INFO - Epoch: 1.41 | loss: 81.412700 | grad_norm: 350.000031 | learning_rate: 0.000029 2025-04-12 22:05:22,899 - INFO - Epoch: 1.42 | loss: 86.120400 | grad_norm: 518.639648 | learning_rate: 0.000029 2025-04-12 22:05:34,633 - INFO - Epoch: 1.42 | loss: 85.862400 | grad_norm: 260.400604 | learning_rate: 0.000029 2025-04-12 22:05:46,185 - INFO - Epoch: 1.43 | loss: 80.644000 | grad_norm: 219.104065 | learning_rate: 0.000029 2025-04-12 22:05:57,963 - INFO - Epoch: 1.43 | loss: 86.585200 | grad_norm: 281.342896 | learning_rate: 0.000029 2025-04-12 22:06:09,407 - INFO - Epoch: 1.43 | loss: 80.059800 | grad_norm: 383.030365 | learning_rate: 0.000029 2025-04-12 22:06:21,450 - INFO - Epoch: 1.44 | loss: 84.026600 | grad_norm: 311.790497 | learning_rate: 0.000029 2025-04-12 
22:06:32,302 - INFO - Epoch: 1.44 | loss: 78.714400 | grad_norm: 281.770447 | learning_rate: 0.000029 2025-04-12 22:06:44,612 - INFO - Epoch: 1.44 | loss: 89.372800 | grad_norm: 1034.787109 | learning_rate: 0.000029 2025-04-12 22:06:56,364 - INFO - Epoch: 1.45 | loss: 87.025600 | grad_norm: 306.950989 | learning_rate: 0.000029 2025-04-12 22:07:08,767 - INFO - Epoch: 1.45 | loss: 78.167600 | grad_norm: 332.424133 | learning_rate: 0.000029 2025-04-12 22:07:20,175 - INFO - Epoch: 1.46 | loss: 80.434200 | grad_norm: 331.170013 | learning_rate: 0.000029 2025-04-12 22:07:32,168 - INFO - Epoch: 1.46 | loss: 82.180200 | grad_norm: 211.902542 | learning_rate: 0.000029 2025-04-12 22:07:44,026 - INFO - Epoch: 1.46 | loss: 77.799100 | grad_norm: 318.954468 | learning_rate: 0.000029 2025-04-12 22:07:56,009 - INFO - Epoch: 1.47 | loss: 82.798800 | grad_norm: 262.167114 | learning_rate: 0.000028 2025-04-12 22:08:07,708 - INFO - Epoch: 1.47 | loss: 80.865900 | grad_norm: 349.556213 | learning_rate: 0.000028 2025-04-12 22:08:18,353 - INFO - Epoch: 1.47 | loss: 76.111000 | grad_norm: 217.921494 | learning_rate: 0.000028 2025-04-12 22:08:30,134 - INFO - Epoch: 1.48 | loss: 83.639000 | grad_norm: 296.379822 | learning_rate: 0.000028 2025-04-12 22:08:42,279 - INFO - Epoch: 1.48 | loss: 82.621900 | grad_norm: 725.332520 | learning_rate: 0.000028 2025-04-12 22:08:53,750 - INFO - Epoch: 1.48 | loss: 81.276700 | grad_norm: 239.138596 | learning_rate: 0.000028 2025-04-12 22:09:04,925 - INFO - Epoch: 1.49 | loss: 79.902900 | grad_norm: 231.225555 | learning_rate: 0.000028 2025-04-12 22:09:16,206 - INFO - Epoch: 1.49 | loss: 79.021400 | grad_norm: 271.495209 | learning_rate: 0.000028 2025-04-12 22:09:27,440 - INFO - Epoch: 1.50 | loss: 78.841800 | grad_norm: 273.877869 | learning_rate: 0.000028 2025-04-12 22:09:38,294 - INFO - Epoch: 1.50 | loss: 75.256700 | grad_norm: 171.827988 | learning_rate: 0.000028 2025-04-12 22:09:50,271 - INFO - Epoch: 1.50 | loss: 75.596100 | grad_norm: 286.052124 | 
learning_rate: 0.000028 2025-04-12 22:10:02,324 - INFO - Epoch: 1.51 | loss: 81.644100 | grad_norm: 245.255661 | learning_rate: 0.000028 2025-04-12 22:10:14,544 - INFO - Epoch: 1.51 | loss: 85.978200 | grad_norm: 245.149200 | learning_rate: 0.000028 2025-04-12 22:10:26,625 - INFO - Epoch: 1.51 | loss: 86.382600 | grad_norm: 377.311737 | learning_rate: 0.000028 2025-04-12 22:10:38,822 - INFO - Epoch: 1.52 | loss: 84.352000 | grad_norm: 836.619446 | learning_rate: 0.000028 2025-04-12 22:10:51,140 - INFO - Epoch: 1.52 | loss: 84.978300 | grad_norm: 251.558655 | learning_rate: 0.000027 2025-04-12 22:11:02,514 - INFO - Epoch: 1.53 | loss: 82.105300 | grad_norm: 458.828278 | learning_rate: 0.000027 2025-04-12 22:11:14,257 - INFO - Epoch: 1.53 | loss: 81.907400 | grad_norm: 434.976349 | learning_rate: 0.000027 2025-04-12 22:11:26,078 - INFO - Epoch: 1.53 | loss: 85.828900 | grad_norm: 349.730621 | learning_rate: 0.000027 2025-04-12 22:11:37,626 - INFO - Epoch: 1.54 | loss: 82.394400 | grad_norm: 359.090271 | learning_rate: 0.000027 2025-04-12 22:11:49,851 - INFO - Epoch: 1.54 | loss: 79.850000 | grad_norm: 302.313721 | learning_rate: 0.000027 2025-04-12 22:12:02,031 - INFO - Epoch: 1.54 | loss: 80.423600 | grad_norm: 713.922363 | learning_rate: 0.000027 2025-04-12 22:12:13,549 - INFO - Epoch: 1.55 | loss: 86.737200 | grad_norm: 217.130692 | learning_rate: 0.000027 2025-04-12 22:12:25,320 - INFO - Epoch: 1.55 | loss: 83.598400 | grad_norm: 302.711884 | learning_rate: 0.000027 2025-04-12 22:12:36,256 - INFO - Epoch: 1.56 | loss: 80.798300 | grad_norm: 477.607697 | learning_rate: 0.000027 2025-04-12 22:12:47,335 - INFO - Epoch: 1.56 | loss: 81.705900 | grad_norm: 249.180405 | learning_rate: 0.000027 2025-04-12 22:12:58,523 - INFO - Epoch: 1.56 | loss: 72.331800 | grad_norm: 262.168030 | learning_rate: 0.000027 2025-04-12 22:13:10,477 - INFO - Epoch: 1.57 | loss: 81.434200 | grad_norm: 403.022186 | learning_rate: 0.000027 2025-04-12 22:13:22,022 - INFO - Epoch: 1.57 | loss: 
82.157300 | grad_norm: 205.661087 | learning_rate: 0.000027 2025-04-12 22:13:33,601 - INFO - Epoch: 1.57 | loss: 81.032800 | grad_norm: 420.776947 | learning_rate: 0.000026 2025-04-12 22:13:45,437 - INFO - Epoch: 1.58 | loss: 78.322700 | grad_norm: 307.115021 | learning_rate: 0.000026 2025-04-12 22:13:57,294 - INFO - Epoch: 1.58 | loss: 85.554300 | grad_norm: 523.468079 | learning_rate: 0.000026 2025-04-12 22:14:09,316 - INFO - Epoch: 1.59 | loss: 79.648100 | grad_norm: 300.695190 | learning_rate: 0.000026 2025-04-12 22:14:21,288 - INFO - Epoch: 1.59 | loss: 83.071000 | grad_norm: 431.046570 | learning_rate: 0.000026 2025-04-12 22:14:33,137 - INFO - Epoch: 1.59 | loss: 79.711800 | grad_norm: 326.665070 | learning_rate: 0.000026 2025-04-12 22:14:45,441 - INFO - Epoch: 1.60 | loss: 86.946400 | grad_norm: 420.309631 | learning_rate: 0.000026 2025-04-12 22:14:57,332 - INFO - Epoch: 1.60 | loss: 81.118200 | grad_norm: 435.693787 | learning_rate: 0.000026 2025-04-12 22:15:08,687 - INFO - Epoch: 1.60 | loss: 77.904900 | grad_norm: 593.538086 | learning_rate: 0.000026 2025-04-12 22:15:20,410 - INFO - Epoch: 1.61 | loss: 86.440400 | grad_norm: 620.757263 | learning_rate: 0.000026 2025-04-12 22:15:31,969 - INFO - Epoch: 1.61 | loss: 83.733400 | grad_norm: 263.525818 | learning_rate: 0.000026 2025-04-12 22:15:44,043 - INFO - Epoch: 1.62 | loss: 80.290800 | grad_norm: 430.827362 | learning_rate: 0.000026 2025-04-12 22:15:55,115 - INFO - Epoch: 1.62 | loss: 82.474100 | grad_norm: 1192.600220 | learning_rate: 0.000026 2025-04-12 22:16:07,492 - INFO - Epoch: 1.62 | loss: 85.235500 | grad_norm: 295.524872 | learning_rate: 0.000026 2025-04-12 22:16:19,108 - INFO - Epoch: 1.63 | loss: 80.126800 | grad_norm: 491.541107 | learning_rate: 0.000026 2025-04-12 22:16:31,172 - INFO - Epoch: 1.63 | loss: 81.838800 | grad_norm: 330.442749 | learning_rate: 0.000025 2025-04-12 22:16:42,993 - INFO - Epoch: 1.63 | loss: 79.841700 | grad_norm: 303.790222 | learning_rate: 0.000025 2025-04-12 
22:16:54,680 - INFO - Epoch: 1.64 | loss: 81.840300 | grad_norm: 369.374329 | learning_rate: 0.000025 2025-04-12 22:17:06,563 - INFO - Epoch: 1.64 | loss: 83.547400 | grad_norm: 499.175812 | learning_rate: 0.000025 2025-04-12 22:17:18,431 - INFO - Epoch: 1.65 | loss: 81.643700 | grad_norm: 249.754532 | learning_rate: 0.000025 2025-04-12 22:17:29,334 - INFO - Epoch: 1.65 | loss: 79.382900 | grad_norm: 260.737793 | learning_rate: 0.000025 2025-04-12 22:17:40,834 - INFO - Epoch: 1.65 | loss: 79.643200 | grad_norm: 299.254791 | learning_rate: 0.000025 2025-04-12 22:17:52,534 - INFO - Epoch: 1.66 | loss: 84.238600 | grad_norm: 545.553406 | learning_rate: 0.000025 2025-04-12 22:18:04,474 - INFO - Epoch: 1.66 | loss: 85.079200 | grad_norm: 441.908020 | learning_rate: 0.000025 2025-04-12 22:18:16,241 - INFO - Epoch: 1.66 | loss: 84.598000 | grad_norm: 248.248322 | learning_rate: 0.000025 2025-04-12 22:18:28,315 - INFO - Epoch: 1.67 | loss: 78.419800 | grad_norm: 319.802246 | learning_rate: 0.000025 2025-04-12 22:18:39,515 - INFO - Epoch: 1.67 | loss: 77.938900 | grad_norm: 253.024582 | learning_rate: 0.000025 2025-04-12 22:18:51,167 - INFO - Epoch: 1.68 | loss: 76.939400 | grad_norm: 386.196503 | learning_rate: 0.000025 2025-04-12 22:19:03,025 - INFO - Epoch: 1.68 | loss: 77.569400 | grad_norm: 337.761261 | learning_rate: 0.000025 2025-04-12 22:19:14,913 - INFO - Epoch: 1.68 | loss: 81.234200 | grad_norm: 249.479584 | learning_rate: 0.000024 2025-04-12 22:19:27,212 - INFO - Epoch: 1.69 | loss: 80.419100 | grad_norm: 250.777100 | learning_rate: 0.000024 2025-04-12 22:19:38,498 - INFO - Epoch: 1.69 | loss: 81.554400 | grad_norm: 436.116241 | learning_rate: 0.000024 2025-04-12 22:19:50,288 - INFO - Epoch: 1.69 | loss: 82.937600 | grad_norm: 210.272049 | learning_rate: 0.000024 2025-04-12 22:20:01,663 - INFO - Epoch: 1.70 | loss: 81.848300 | grad_norm: 280.299713 | learning_rate: 0.000024 2025-04-12 22:20:13,434 - INFO - Epoch: 1.70 | loss: 82.060400 | grad_norm: 323.167755 | 
learning_rate: 0.000024 2025-04-12 22:20:25,108 - INFO - Epoch: 1.71 | loss: 79.529100 | grad_norm: 373.270264 | learning_rate: 0.000024 2025-04-12 22:20:37,062 - INFO - Epoch: 1.71 | loss: 81.390200 | grad_norm: 1171.565308 | learning_rate: 0.000024 2025-04-12 22:20:48,625 - INFO - Epoch: 1.71 | loss: 76.605500 | grad_norm: 249.530518 | learning_rate: 0.000024 2025-04-12 22:20:59,948 - INFO - Epoch: 1.72 | loss: 75.873700 | grad_norm: 201.834381 | learning_rate: 0.000024 2025-04-12 22:21:11,100 - INFO - Epoch: 1.72 | loss: 79.038100 | grad_norm: 759.627441 | learning_rate: 0.000024 2025-04-12 22:21:22,288 - INFO - Epoch: 1.72 | loss: 83.167300 | grad_norm: 274.451630 | learning_rate: 0.000024 2025-04-12 22:21:34,216 - INFO - Epoch: 1.73 | loss: 79.209700 | grad_norm: 332.932861 | learning_rate: 0.000024 2025-04-12 22:21:45,742 - INFO - Epoch: 1.73 | loss: 81.385400 | grad_norm: 315.385193 | learning_rate: 0.000024 2025-04-12 22:21:57,065 - INFO - Epoch: 1.73 | loss: 75.205300 | grad_norm: 324.561859 | learning_rate: 0.000024 2025-04-12 22:22:08,869 - INFO - Epoch: 1.74 | loss: 83.794000 | grad_norm: 411.952698 | learning_rate: 0.000023 2025-04-12 22:22:20,099 - INFO - Epoch: 1.74 | loss: 79.431600 | grad_norm: 505.661041 | learning_rate: 0.000023 2025-04-12 22:22:31,831 - INFO - Epoch: 1.75 | loss: 82.419800 | grad_norm: 605.331482 | learning_rate: 0.000023 2025-04-12 22:22:43,596 - INFO - Epoch: 1.75 | loss: 81.197600 | grad_norm: 511.968567 | learning_rate: 0.000023 2025-04-12 22:22:55,534 - INFO - Epoch: 1.75 | loss: 78.986200 | grad_norm: 434.755310 | learning_rate: 0.000023 2025-04-12 22:23:07,619 - INFO - Epoch: 1.76 | loss: 83.738200 | grad_norm: 294.036926 | learning_rate: 0.000023 2025-04-12 22:23:19,208 - INFO - Epoch: 1.76 | loss: 80.430600 | grad_norm: 251.197311 | learning_rate: 0.000023 2025-04-12 22:23:30,391 - INFO - Epoch: 1.76 | loss: 82.067700 | grad_norm: 492.381531 | learning_rate: 0.000023 2025-04-12 22:23:42,282 - INFO - Epoch: 1.77 | loss: 
78.277900 | grad_norm: 303.760468 | learning_rate: 0.000023 2025-04-12 22:23:53,684 - INFO - Epoch: 1.77 | loss: 80.993000 | grad_norm: 475.808746 | learning_rate: 0.000023 2025-04-12 22:24:05,984 - INFO - Epoch: 1.78 | loss: 84.590400 | grad_norm: 680.578308 | learning_rate: 0.000023 2025-04-12 22:24:17,251 - INFO - Epoch: 1.78 | loss: 77.980400 | grad_norm: 411.231415 | learning_rate: 0.000023 2025-04-12 22:24:29,252 - INFO - Epoch: 1.78 | loss: 80.628100 | grad_norm: 362.064880 | learning_rate: 0.000023 2025-04-12 22:24:40,654 - INFO - Epoch: 1.79 | loss: 80.163700 | grad_norm: 473.697083 | learning_rate: 0.000023 2025-04-12 22:24:52,147 - INFO - Epoch: 1.79 | loss: 80.857000 | grad_norm: 314.999084 | learning_rate: 0.000022 2025-04-12 22:25:03,651 - INFO - Epoch: 1.79 | loss: 87.183500 | grad_norm: 843.028015 | learning_rate: 0.000022 2025-04-12 22:25:15,222 - INFO - Epoch: 1.80 | loss: 78.898300 | grad_norm: 311.498505 | learning_rate: 0.000022 2025-04-12 22:25:26,512 - INFO - Epoch: 1.80 | loss: 79.797300 | grad_norm: 358.510773 | learning_rate: 0.000022 2025-04-12 22:25:37,909 - INFO - Epoch: 1.81 | loss: 81.298800 | grad_norm: 428.075317 | learning_rate: 0.000022 2025-04-12 22:25:50,060 - INFO - Epoch: 1.81 | loss: 84.605000 | grad_norm: 260.444366 | learning_rate: 0.000022 2025-04-12 22:26:01,783 - INFO - Epoch: 1.81 | loss: 78.592400 | grad_norm: 953.374207 | learning_rate: 0.000022 2025-04-12 22:26:14,062 - INFO - Epoch: 1.82 | loss: 84.070400 | grad_norm: 1094.497070 | learning_rate: 0.000022 2025-04-12 22:26:25,815 - INFO - Epoch: 1.82 | loss: 84.801100 | grad_norm: 607.472595 | learning_rate: 0.000022 2025-04-12 22:26:37,235 - INFO - Epoch: 1.82 | loss: 81.284300 | grad_norm: 1013.034546 | learning_rate: 0.000022 2025-04-12 22:26:48,610 - INFO - Epoch: 1.83 | loss: 79.718000 | grad_norm: 236.586380 | learning_rate: 0.000022 2025-04-12 22:26:59,983 - INFO - Epoch: 1.83 | loss: 77.375800 | grad_norm: 266.645386 | learning_rate: 0.000022 2025-04-12 
22:27:12,064 - INFO - Epoch: 1.84 | loss: 83.270700 | grad_norm: 187.570328 | learning_rate: 0.000022 2025-04-12 22:27:23,511 - INFO - Epoch: 1.84 | loss: 80.211100 | grad_norm: 410.139648 | learning_rate: 0.000022 2025-04-12 22:27:34,740 - INFO - Epoch: 1.84 | loss: 73.778900 | grad_norm: 284.773743 | learning_rate: 0.000021 2025-04-12 22:27:46,244 - INFO - Epoch: 1.85 | loss: 79.440200 | grad_norm: 871.825500 | learning_rate: 0.000021 2025-04-12 22:27:58,171 - INFO - Epoch: 1.85 | loss: 83.601100 | grad_norm: 452.190704 | learning_rate: 0.000021 2025-04-12 22:28:09,332 - INFO - Epoch: 1.85 | loss: 80.475500 | grad_norm: 263.378601 | learning_rate: 0.000021 2025-04-12 22:28:21,088 - INFO - Epoch: 1.86 | loss: 79.903700 | grad_norm: 287.521851 | learning_rate: 0.000021 2025-04-12 22:28:33,464 - INFO - Epoch: 1.86 | loss: 82.956900 | grad_norm: 364.695343 | learning_rate: 0.000021 2025-04-12 22:28:45,039 - INFO - Epoch: 1.87 | loss: 83.172100 | grad_norm: 1372.508179 | learning_rate: 0.000021 2025-04-12 22:28:56,654 - INFO - Epoch: 1.87 | loss: 85.249300 | grad_norm: 491.302246 | learning_rate: 0.000021 2025-04-12 22:29:08,328 - INFO - Epoch: 1.87 | loss: 80.292000 | grad_norm: 628.130432 | learning_rate: 0.000021 2025-04-12 22:29:20,225 - INFO - Epoch: 1.88 | loss: 79.619300 | grad_norm: 293.454742 | learning_rate: 0.000021 2025-04-12 22:29:31,708 - INFO - Epoch: 1.88 | loss: 81.619900 | grad_norm: 536.366211 | learning_rate: 0.000021 2025-04-12 22:29:43,819 - INFO - Epoch: 1.88 | loss: 81.450300 | grad_norm: 251.767899 | learning_rate: 0.000021 2025-04-12 22:29:55,607 - INFO - Epoch: 1.89 | loss: 82.470000 | grad_norm: 335.907623 | learning_rate: 0.000021 2025-04-12 22:30:07,359 - INFO - Epoch: 1.89 | loss: 79.992500 | grad_norm: 2077.325439 | learning_rate: 0.000021 2025-04-12 22:30:18,699 - INFO - Epoch: 1.90 | loss: 80.365600 | grad_norm: 491.245270 | learning_rate: 0.000021 2025-04-12 22:30:30,212 - INFO - Epoch: 1.90 | loss: 80.468700 | grad_norm: 345.253815 
| learning_rate: 0.000020 2025-04-12 22:30:41,993 - INFO - Epoch: 1.90 | loss: 79.751100 | grad_norm: 424.577728 | learning_rate: 0.000020 2025-04-12 22:30:54,112 - INFO - Epoch: 1.91 | loss: 79.061500 | grad_norm: 260.019440 | learning_rate: 0.000020 2025-04-12 22:31:05,577 - INFO - Epoch: 1.91 | loss: 76.064600 | grad_norm: 304.838562 | learning_rate: 0.000020 2025-04-12 22:31:17,406 - INFO - Epoch: 1.91 | loss: 80.182300 | grad_norm: 341.803101 | learning_rate: 0.000020 2025-04-12 22:31:28,757 - INFO - Epoch: 1.92 | loss: 76.675100 | grad_norm: 389.409119 | learning_rate: 0.000020 2025-04-12 22:31:40,659 - INFO - Epoch: 1.92 | loss: 85.726700 | grad_norm: 692.782959 | learning_rate: 0.000020 2025-04-12 22:31:52,508 - INFO - Epoch: 1.93 | loss: 78.972900 | grad_norm: 356.675293 | learning_rate: 0.000020 2025-04-12 22:32:04,234 - INFO - Epoch: 1.93 | loss: 83.206100 | grad_norm: 649.538147 | learning_rate: 0.000020 2025-04-12 22:32:15,497 - INFO - Epoch: 1.93 | loss: 79.765000 | grad_norm: 525.739197 | learning_rate: 0.000020 2025-04-12 22:32:27,169 - INFO - Epoch: 1.94 | loss: 77.867800 | grad_norm: 277.402618 | learning_rate: 0.000020 2025-04-12 22:32:37,931 - INFO - Epoch: 1.94 | loss: 73.197800 | grad_norm: 326.449188 | learning_rate: 0.000020 2025-04-12 22:32:49,895 - INFO - Epoch: 1.94 | loss: 85.006700 | grad_norm: 650.261414 | learning_rate: 0.000020 2025-04-12 22:33:01,147 - INFO - Epoch: 1.95 | loss: 79.838800 | grad_norm: 783.265320 | learning_rate: 0.000020 2025-04-12 22:33:12,776 - INFO - Epoch: 1.95 | loss: 81.850100 | grad_norm: 281.446167 | learning_rate: 0.000019 2025-04-12 22:33:24,282 - INFO - Epoch: 1.95 | loss: 82.990700 | grad_norm: 291.598206 | learning_rate: 0.000019 2025-04-12 22:33:35,996 - INFO - Epoch: 1.96 | loss: 82.907700 | grad_norm: 333.757263 | learning_rate: 0.000019 2025-04-12 22:33:48,492 - INFO - Epoch: 1.96 | loss: 82.827400 | grad_norm: 735.531982 | learning_rate: 0.000019 2025-04-12 22:34:00,253 - INFO - Epoch: 1.97 | loss: 
80.987100 | grad_norm: 240.716797 | learning_rate: 0.000019 2025-04-12 22:34:12,216 - INFO - Epoch: 1.97 | loss: 87.163900 | grad_norm: 436.578369 | learning_rate: 0.000019 2025-04-12 22:34:24,119 - INFO - Epoch: 1.97 | loss: 77.605400 | grad_norm: 266.754456 | learning_rate: 0.000019 2025-04-12 22:34:35,952 - INFO - Epoch: 1.98 | loss: 78.538100 | grad_norm: 255.779144 | learning_rate: 0.000019 2025-04-12 22:34:47,776 - INFO - Epoch: 1.98 | loss: 83.544300 | grad_norm: 248.631042 | learning_rate: 0.000019 2025-04-12 22:34:59,964 - INFO - Epoch: 1.98 | loss: 87.009100 | grad_norm: 575.017883 | learning_rate: 0.000019 2025-04-12 22:35:11,188 - INFO - Epoch: 1.99 | loss: 80.401400 | grad_norm: 955.677979 | learning_rate: 0.000019 2025-04-12 22:35:22,597 - INFO - Epoch: 1.99 | loss: 78.224500 | grad_norm: 424.922272 | learning_rate: 0.000019 2025-04-12 22:35:33,809 - INFO - Epoch: 2.00 | loss: 77.894100 | grad_norm: 285.179962 | learning_rate: 0.000019 2025-04-12 22:35:45,209 - INFO - Epoch: 2.00 | loss: 80.778400 | grad_norm: 311.875824 | learning_rate: 0.000019 2025-04-12 22:36:01,318 - INFO - Epoch: 2.00 | loss: 82.413200 | grad_norm: 574.306458 | learning_rate: 0.000019 2025-04-12 22:36:13,089 - INFO - Epoch: 2.01 | loss: 81.876200 | grad_norm: 277.937042 | learning_rate: 0.000018 2025-04-12 22:36:24,802 - INFO - Epoch: 2.01 | loss: 83.375400 | grad_norm: 474.366791 | learning_rate: 0.000018 2025-04-12 22:36:36,835 - INFO - Epoch: 2.01 | loss: 81.405000 | grad_norm: 242.954453 | learning_rate: 0.000018 2025-04-12 22:36:48,343 - INFO - Epoch: 2.02 | loss: 80.681500 | grad_norm: 541.231750 | learning_rate: 0.000018 2025-04-12 22:36:59,770 - INFO - Epoch: 2.02 | loss: 77.606800 | grad_norm: 488.186646 | learning_rate: 0.000018 2025-04-12 22:37:11,267 - INFO - Epoch: 2.03 | loss: 81.315900 | grad_norm: 608.231445 | learning_rate: 0.000018 2025-04-12 22:37:22,953 - INFO - Epoch: 2.03 | loss: 85.595700 | grad_norm: 417.409119 | learning_rate: 0.000018 2025-04-12 
22:37:34,498 - INFO - Epoch: 2.03 | loss: 77.709300 | grad_norm: 245.244202 | learning_rate: 0.000018 2025-04-12 22:37:46,596 - INFO - Epoch: 2.04 | loss: 80.914300 | grad_norm: 1454.127075 | learning_rate: 0.000018 2025-04-12 22:37:57,541 - INFO - Epoch: 2.04 | loss: 72.610400 | grad_norm: 503.779816 | learning_rate: 0.000018 2025-04-12 22:38:09,621 - INFO - Epoch: 2.04 | loss: 81.765800 | grad_norm: 242.689697 | learning_rate: 0.000018 2025-04-12 22:38:20,673 - INFO - Epoch: 2.05 | loss: 78.346800 | grad_norm: 310.103333 | learning_rate: 0.000018 2025-04-12 22:38:32,931 - INFO - Epoch: 2.05 | loss: 80.709000 | grad_norm: 334.466614 | learning_rate: 0.000018 2025-04-12 22:38:44,462 - INFO - Epoch: 2.06 | loss: 82.424700 | grad_norm: 328.820435 | learning_rate: 0.000018 2025-04-12 22:38:55,478 - INFO - Epoch: 2.06 | loss: 77.981400 | grad_norm: 232.899719 | learning_rate: 0.000017 2025-04-12 22:39:08,011 - INFO - Epoch: 2.06 | loss: 80.246400 | grad_norm: 278.401001 | learning_rate: 0.000017 2025-04-12 22:39:19,306 - INFO - Epoch: 2.07 | loss: 77.793300 | grad_norm: 228.352692 | learning_rate: 0.000017 2025-04-12 22:39:30,850 - INFO - Epoch: 2.07 | loss: 79.737600 | grad_norm: 277.117676 | learning_rate: 0.000017 2025-04-12 22:39:42,299 - INFO - Epoch: 2.07 | loss: 74.384600 | grad_norm: 316.711212 | learning_rate: 0.000017 2025-04-12 22:39:54,482 - INFO - Epoch: 2.08 | loss: 81.981700 | grad_norm: 265.351471 | learning_rate: 0.000017 2025-04-12 22:40:05,894 - INFO - Epoch: 2.08 | loss: 80.616600 | grad_norm: 205.606293 | learning_rate: 0.000017 2025-04-12 22:40:17,130 - INFO - Epoch: 2.09 | loss: 77.019600 | grad_norm: 244.675140 | learning_rate: 0.000017 2025-04-12 22:40:28,862 - INFO - Epoch: 2.09 | loss: 76.841900 | grad_norm: 667.250671 | learning_rate: 0.000017 2025-04-12 22:40:40,251 - INFO - Epoch: 2.09 | loss: 80.241500 | grad_norm: 643.415405 | learning_rate: 0.000017 2025-04-12 22:40:51,875 - INFO - Epoch: 2.10 | loss: 81.646500 | grad_norm: 319.125122 | 
learning_rate: 0.000017 2025-04-12 22:41:03,423 - INFO - Epoch: 2.10 | loss: 79.379400 | grad_norm: 752.170044 | learning_rate: 0.000017 2025-04-12 22:41:14,717 - INFO - Epoch: 2.10 | loss: 76.136300 | grad_norm: 499.967651 | learning_rate: 0.000017 2025-04-12 22:41:26,147 - INFO - Epoch: 2.11 | loss: 78.087800 | grad_norm: 452.229340 | learning_rate: 0.000017 2025-04-12 22:41:37,943 - INFO - Epoch: 2.11 | loss: 79.199300 | grad_norm: 455.409973 | learning_rate: 0.000017 2025-04-12 22:41:49,291 - INFO - Epoch: 2.12 | loss: 80.609800 | grad_norm: 421.255737 | learning_rate: 0.000016 2025-04-12 22:42:00,713 - INFO - Epoch: 2.12 | loss: 79.795300 | grad_norm: 575.476685 | learning_rate: 0.000016 2025-04-12 22:42:12,766 - INFO - Epoch: 2.12 | loss: 83.062600 | grad_norm: 826.071594 | learning_rate: 0.000016 2025-04-12 22:42:24,497 - INFO - Epoch: 2.13 | loss: 84.085500 | grad_norm: 825.296509 | learning_rate: 0.000016 2025-04-12 22:42:36,804 - INFO - Epoch: 2.13 | loss: 85.441700 | grad_norm: 380.509674 | learning_rate: 0.000016 2025-04-12 22:42:48,892 - INFO - Epoch: 2.13 | loss: 79.701000 | grad_norm: 405.835449 | learning_rate: 0.000016 2025-04-12 22:43:00,821 - INFO - Epoch: 2.14 | loss: 80.654000 | grad_norm: 345.419373 | learning_rate: 0.000016 2025-04-12 22:43:12,643 - INFO - Epoch: 2.14 | loss: 76.878200 | grad_norm: 288.184937 | learning_rate: 0.000016 2025-04-12 22:43:24,462 - INFO - Epoch: 2.15 | loss: 79.390400 | grad_norm: 225.595886 | learning_rate: 0.000016 2025-04-12 22:43:36,224 - INFO - Epoch: 2.15 | loss: 77.362000 | grad_norm: 233.696045 | learning_rate: 0.000016 2025-04-12 22:43:47,969 - INFO - Epoch: 2.15 | loss: 81.532700 | grad_norm: 322.321594 | learning_rate: 0.000016 2025-04-12 22:43:59,669 - INFO - Epoch: 2.16 | loss: 78.698000 | grad_norm: 265.576172 | learning_rate: 0.000016 2025-04-12 22:44:11,412 - INFO - Epoch: 2.16 | loss: 80.953300 | grad_norm: 337.248596 | learning_rate: 0.000016 2025-04-12 22:44:22,651 - INFO - Epoch: 2.16 | loss: 
80.126000 | grad_norm: 315.148926 | learning_rate: 0.000016 2025-04-12 22:44:34,783 - INFO - Epoch: 2.17 | loss: 84.876600 | grad_norm: 469.875977 | learning_rate: 0.000015 2025-04-12 22:44:46,987 - INFO - Epoch: 2.17 | loss: 81.258200 | grad_norm: 230.541748 | learning_rate: 0.000015 2025-04-12 22:44:58,537 - INFO - Epoch: 2.18 | loss: 80.874200 | grad_norm: 294.347260 | learning_rate: 0.000015 2025-04-12 22:45:10,350 - INFO - Epoch: 2.18 | loss: 82.793100 | grad_norm: 585.333496 | learning_rate: 0.000015 2025-04-12 22:45:21,586 - INFO - Epoch: 2.18 | loss: 80.302200 | grad_norm: 404.881592 | learning_rate: 0.000015 2025-04-12 22:45:33,122 - INFO - Epoch: 2.19 | loss: 80.030400 | grad_norm: 255.901764 | learning_rate: 0.000015 2025-04-12 22:45:45,022 - INFO - Epoch: 2.19 | loss: 79.084900 | grad_norm: 221.630920 | learning_rate: 0.000015 2025-04-12 22:45:56,893 - INFO - Epoch: 2.19 | loss: 80.477800 | grad_norm: 324.873322 | learning_rate: 0.000015 2025-04-12 22:46:08,219 - INFO - Epoch: 2.20 | loss: 78.370900 | grad_norm: 409.692200 | learning_rate: 0.000015 2025-04-12 22:46:19,447 - INFO - Epoch: 2.20 | loss: 80.458400 | grad_norm: 241.106293 | learning_rate: 0.000015 2025-04-12 22:46:32,245 - INFO - Epoch: 2.21 | loss: 83.915200 | grad_norm: 601.606506 | learning_rate: 0.000015 2025-04-12 22:46:44,171 - INFO - Epoch: 2.21 | loss: 80.352000 | grad_norm: 472.881561 | learning_rate: 0.000015 2025-04-12 22:46:55,503 - INFO - Epoch: 2.21 | loss: 80.516500 | grad_norm: 1130.814575 | learning_rate: 0.000015 2025-04-12 22:47:07,145 - INFO - Epoch: 2.22 | loss: 84.375200 | grad_norm: 443.796997 | learning_rate: 0.000015 2025-04-12 22:47:18,775 - INFO - Epoch: 2.22 | loss: 82.198600 | grad_norm: 236.134094 | learning_rate: 0.000015 2025-04-12 22:47:30,218 - INFO - Epoch: 2.22 | loss: 83.659500 | grad_norm: 205.515274 | learning_rate: 0.000014 2025-04-12 22:47:42,538 - INFO - Epoch: 2.23 | loss: 81.178200 | grad_norm: 617.364380 | learning_rate: 0.000014 2025-04-12 
22:47:53,421 - INFO - Epoch: 2.23 | loss: 75.447200 | grad_norm: 527.122437 | learning_rate: 0.000014 2025-04-12 22:48:05,116 - INFO - Epoch: 2.24 | loss: 78.769300 | grad_norm: 285.769043 | learning_rate: 0.000014 2025-04-12 22:48:16,679 - INFO - Epoch: 2.24 | loss: 75.776700 | grad_norm: 239.125244 | learning_rate: 0.000014 2025-04-12 22:48:28,489 - INFO - Epoch: 2.24 | loss: 78.689700 | grad_norm: 831.748535 | learning_rate: 0.000014 2025-04-12 22:48:40,718 - INFO - Epoch: 2.25 | loss: 82.442500 | grad_norm: 390.881622 | learning_rate: 0.000014 2025-04-12 22:48:52,239 - INFO - Epoch: 2.25 | loss: 77.346400 | grad_norm: 263.647919 | learning_rate: 0.000014 2025-04-12 22:49:04,143 - INFO - Epoch: 2.25 | loss: 85.578600 | grad_norm: 316.670044 | learning_rate: 0.000014 2025-04-12 22:49:15,827 - INFO - Epoch: 2.26 | loss: 76.354300 | grad_norm: 283.049377 | learning_rate: 0.000014 2025-04-12 22:49:27,591 - INFO - Epoch: 2.26 | loss: 83.743300 | grad_norm: 594.266968 | learning_rate: 0.000014 2025-04-12 22:49:39,094 - INFO - Epoch: 2.26 | loss: 81.296100 | grad_norm: 3160.194336 | learning_rate: 0.000014 2025-04-12 22:49:50,676 - INFO - Epoch: 2.27 | loss: 79.596700 | grad_norm: 293.630219 | learning_rate: 0.000014 2025-04-12 22:50:02,199 - INFO - Epoch: 2.27 | loss: 81.400500 | grad_norm: 734.324585 | learning_rate: 0.000014 2025-04-12 22:50:13,947 - INFO - Epoch: 2.28 | loss: 82.226500 | grad_norm: 365.978790 | learning_rate: 0.000013 2025-04-12 22:50:25,612 - INFO - Epoch: 2.28 | loss: 84.841700 | grad_norm: 351.811127 | learning_rate: 0.000013 2025-04-12 22:50:37,295 - INFO - Epoch: 2.28 | loss: 83.054500 | grad_norm: 1040.064941 | learning_rate: 0.000013 2025-04-12 22:50:49,439 - INFO - Epoch: 2.29 | loss: 81.682500 | grad_norm: 320.567627 | learning_rate: 0.000013 2025-04-12 22:51:01,506 - INFO - Epoch: 2.29 | loss: 81.358200 | grad_norm: 379.940887 | learning_rate: 0.000013 2025-04-12 22:51:12,894 - INFO - Epoch: 2.29 | loss: 80.783700 | grad_norm: 338.594879 
| learning_rate: 0.000013 2025-04-12 22:51:24,733 - INFO - Epoch: 2.30 | loss: 80.175600 | grad_norm: 276.719788 | learning_rate: 0.000013 2025-04-12 22:51:36,443 - INFO - Epoch: 2.30 | loss: 79.176200 | grad_norm: 321.271545 | learning_rate: 0.000013 2025-04-12 22:51:47,908 - INFO - Epoch: 2.31 | loss: 79.474900 | grad_norm: 299.017639 | learning_rate: 0.000013 2025-04-12 22:51:58,759 - INFO - Epoch: 2.31 | loss: 72.575600 | grad_norm: 361.591309 | learning_rate: 0.000013 2025-04-12 22:52:10,498 - INFO - Epoch: 2.31 | loss: 85.256200 | grad_norm: 254.300293 | learning_rate: 0.000013 2025-04-12 22:52:21,867 - INFO - Epoch: 2.32 | loss: 80.228900 | grad_norm: 1130.182007 | learning_rate: 0.000013 2025-04-12 22:52:33,831 - INFO - Epoch: 2.32 | loss: 83.565600 | grad_norm: 236.835281 | learning_rate: 0.000013 2025-04-12 22:52:45,630 - INFO - Epoch: 2.32 | loss: 79.719900 | grad_norm: 1217.792114 | learning_rate: 0.000013 2025-04-12 22:52:56,390 - INFO - Epoch: 2.33 | loss: 78.042800 | grad_norm: 576.944275 | learning_rate: 0.000013 2025-04-12 22:53:08,383 - INFO - Epoch: 2.33 | loss: 79.520000 | grad_norm: 206.648987 | learning_rate: 0.000012 2025-04-12 22:53:20,637 - INFO - Epoch: 2.34 | loss: 85.018000 | grad_norm: 337.678986 | learning_rate: 0.000012 2025-04-12 22:53:32,355 - INFO - Epoch: 2.34 | loss: 76.794300 | grad_norm: 322.745300 | learning_rate: 0.000012 2025-04-12 22:53:43,374 - INFO - Epoch: 2.34 | loss: 77.995900 | grad_norm: 580.471497 | learning_rate: 0.000012 2025-04-12 22:53:55,318 - INFO - Epoch: 2.35 | loss: 80.209800 | grad_norm: 308.780762 | learning_rate: 0.000012 2025-04-12 22:54:06,808 - INFO - Epoch: 2.35 | loss: 81.868000 | grad_norm: 267.648407 | learning_rate: 0.000012 2025-04-12 22:54:18,943 - INFO - Epoch: 2.35 | loss: 82.722700 | grad_norm: 336.747742 | learning_rate: 0.000012 2025-04-12 22:54:31,046 - INFO - Epoch: 2.36 | loss: 81.315800 | grad_norm: 408.529419 | learning_rate: 0.000012 2025-04-12 22:54:42,736 - INFO - Epoch: 2.36 | 
loss: 82.680900 | grad_norm: 221.770798 | learning_rate: 0.000012 2025-04-12 22:54:54,857 - INFO - Epoch: 2.37 | loss: 76.805600 | grad_norm: 359.903320 | learning_rate: 0.000012 2025-04-12 22:55:06,147 - INFO - Epoch: 2.37 | loss: 79.758200 | grad_norm: 222.399841 | learning_rate: 0.000012 2025-04-12 22:55:17,277 - INFO - Epoch: 2.37 | loss: 76.123600 | grad_norm: 384.640015 | learning_rate: 0.000012 2025-04-12 22:55:28,804 - INFO - Epoch: 2.38 | loss: 80.444000 | grad_norm: 351.624512 | learning_rate: 0.000012 2025-04-12 22:55:40,155 - INFO - Epoch: 2.38 | loss: 75.389800 | grad_norm: 215.531235 | learning_rate: 0.000012 2025-04-12 22:55:52,134 - INFO - Epoch: 2.38 | loss: 81.802200 | grad_norm: 290.633850 | learning_rate: 0.000011 2025-04-12 22:56:03,917 - INFO - Epoch: 2.39 | loss: 79.885000 | grad_norm: 335.583832 | learning_rate: 0.000011 2025-04-12 22:56:15,534 - INFO - Epoch: 2.39 | loss: 83.488800 | grad_norm: 524.313293 | learning_rate: 0.000011 2025-04-12 22:56:27,302 - INFO - Epoch: 2.40 | loss: 76.160900 | grad_norm: 280.490143 | learning_rate: 0.000011 2025-04-12 22:56:39,667 - INFO - Epoch: 2.40 | loss: 82.703500 | grad_norm: 330.406982 | learning_rate: 0.000011 2025-04-12 22:56:52,158 - INFO - Epoch: 2.40 | loss: 82.738600 | grad_norm: 362.059906 | learning_rate: 0.000011 2025-04-12 22:57:04,224 - INFO - Epoch: 2.41 | loss: 82.856800 | grad_norm: 230.223984 | learning_rate: 0.000011 2025-04-12 22:57:15,387 - INFO - Epoch: 2.41 | loss: 78.894600 | grad_norm: 649.764709 | learning_rate: 0.000011 2025-04-12 22:57:27,282 - INFO - Epoch: 2.41 | loss: 80.565400 | grad_norm: 257.538971 | learning_rate: 0.000011 2025-04-12 22:57:38,743 - INFO - Epoch: 2.42 | loss: 77.775400 | grad_norm: 244.083084 | learning_rate: 0.000011 2025-04-12 22:57:50,445 - INFO - Epoch: 2.42 | loss: 79.103300 | grad_norm: 300.497650 | learning_rate: 0.000011 2025-04-12 22:58:01,546 - INFO - Epoch: 2.43 | loss: 78.067100 | grad_norm: 427.376617 | learning_rate: 0.000011 2025-04-12 
22:58:13,367 - INFO - Epoch: 2.43 | loss: 79.984300 | grad_norm: 255.419357 | learning_rate: 0.000011 2025-04-12 22:58:25,390 - INFO - Epoch: 2.43 | loss: 78.005500 | grad_norm: 324.143127 | learning_rate: 0.000011 2025-04-12 22:58:36,696 - INFO - Epoch: 2.44 | loss: 79.522200 | grad_norm: 422.716766 | learning_rate: 0.000011 2025-04-12 22:58:47,835 - INFO - Epoch: 2.44 | loss: 75.562900 | grad_norm: 1727.152466 | learning_rate: 0.000010 2025-04-12 22:58:58,853 - INFO - Epoch: 2.44 | loss: 78.348000 | grad_norm: 438.988037 | learning_rate: 0.000010 2025-04-12 22:59:10,164 - INFO - Epoch: 2.45 | loss: 79.257500 | grad_norm: 684.437927 | learning_rate: 0.000010 2025-04-12 22:59:22,058 - INFO - Epoch: 2.45 | loss: 80.178600 | grad_norm: 243.208450 | learning_rate: 0.000010 2025-04-12 22:59:33,979 - INFO - Epoch: 2.46 | loss: 79.419600 | grad_norm: 347.409668 | learning_rate: 0.000010 2025-04-12 22:59:46,255 - INFO - Epoch: 2.46 | loss: 80.947900 | grad_norm: 212.803741 | learning_rate: 0.000010 2025-04-12 22:59:58,278 - INFO - Epoch: 2.46 | loss: 80.992600 | grad_norm: 346.166992 | learning_rate: 0.000010 2025-04-12 23:00:10,209 - INFO - Epoch: 2.47 | loss: 83.501800 | grad_norm: 639.984192 | learning_rate: 0.000010 2025-04-12 23:00:21,730 - INFO - Epoch: 2.47 | loss: 79.770800 | grad_norm: 351.880432 | learning_rate: 0.000010 2025-04-12 23:00:33,602 - INFO - Epoch: 2.47 | loss: 79.672600 | grad_norm: 420.153351 | learning_rate: 0.000010 2025-04-12 23:00:45,462 - INFO - Epoch: 2.48 | loss: 87.085200 | grad_norm: 331.008331 | learning_rate: 0.000010 2025-04-12 23:00:57,701 - INFO - Epoch: 2.48 | loss: 80.169200 | grad_norm: 277.840942 | learning_rate: 0.000010 2025-04-12 23:01:09,698 - INFO - Epoch: 2.48 | loss: 83.601400 | grad_norm: 316.333618 | learning_rate: 0.000010 2025-04-12 23:01:21,034 - INFO - Epoch: 2.49 | loss: 82.822000 | grad_norm: 566.374512 | learning_rate: 0.000010 2025-04-12 23:01:32,550 - INFO - Epoch: 2.49 | loss: 75.534400 | grad_norm: 290.067657 | 
learning_rate: 0.000009 2025-04-12 23:01:44,573 - INFO - Epoch: 2.50 | loss: 81.222400 | grad_norm: 312.430084 | learning_rate: 0.000009 2025-04-12 23:01:57,120 - INFO - Epoch: 2.50 | loss: 78.656700 | grad_norm: 409.130646 | learning_rate: 0.000009 2025-04-12 23:02:08,516 - INFO - Epoch: 2.50 | loss: 78.810300 | grad_norm: 433.086761 | learning_rate: 0.000009 2025-04-12 23:02:20,261 - INFO - Epoch: 2.51 | loss: 80.196500 | grad_norm: 328.224701 | learning_rate: 0.000009 2025-04-12 23:02:31,662 - INFO - Epoch: 2.51 | loss: 76.789600 | grad_norm: 464.638458 | learning_rate: 0.000009 2025-04-12 23:02:43,347 - INFO - Epoch: 2.51 | loss: 76.980800 | grad_norm: 245.058777 | learning_rate: 0.000009 2025-04-12 23:02:55,067 - INFO - Epoch: 2.52 | loss: 79.419800 | grad_norm: 327.786407 | learning_rate: 0.000009 2025-04-12 23:03:07,157 - INFO - Epoch: 2.52 | loss: 82.319200 | grad_norm: 248.351364 | learning_rate: 0.000009 2025-04-12 23:03:19,216 - INFO - Epoch: 2.53 | loss: 81.233400 | grad_norm: 252.880936 | learning_rate: 0.000009 2025-04-12 23:03:30,473 - INFO - Epoch: 2.53 | loss: 80.764300 | grad_norm: 221.946640 | learning_rate: 0.000009 2025-04-12 23:03:42,377 - INFO - Epoch: 2.53 | loss: 82.942800 | grad_norm: 153.724197 | learning_rate: 0.000009 2025-04-12 23:03:54,323 - INFO - Epoch: 2.54 | loss: 80.070500 | grad_norm: 365.240173 | learning_rate: 0.000009 2025-04-12 23:04:06,113 - INFO - Epoch: 2.54 | loss: 81.862600 | grad_norm: 280.213470 | learning_rate: 0.000009 2025-04-12 23:04:17,886 - INFO - Epoch: 2.54 | loss: 80.498600 | grad_norm: 550.393921 | learning_rate: 0.000009 2025-04-12 23:04:29,629 - INFO - Epoch: 2.55 | loss: 82.363500 | grad_norm: 410.970093 | learning_rate: 0.000008 2025-04-12 23:04:41,323 - INFO - Epoch: 2.55 | loss: 79.259100 | grad_norm: 1392.836548 | learning_rate: 0.000008 2025-04-12 23:04:52,968 - INFO - Epoch: 2.56 | loss: 84.307000 | grad_norm: 312.718719 | learning_rate: 0.000008 2025-04-12 23:05:05,067 - INFO - Epoch: 2.56 | loss: 
84.588100 | grad_norm: 484.226776 | learning_rate: 0.000008 2025-04-12 23:05:16,941 - INFO - Epoch: 2.56 | loss: 83.379300 | grad_norm: 1091.417969 | learning_rate: 0.000008 2025-04-12 23:05:29,172 - INFO - Epoch: 2.57 | loss: 79.555200 | grad_norm: 364.298798 | learning_rate: 0.000008 2025-04-12 23:05:40,393 - INFO - Epoch: 2.57 | loss: 80.926100 | grad_norm: 242.976242 | learning_rate: 0.000008 2025-04-12 23:05:52,083 - INFO - Epoch: 2.57 | loss: 80.732600 | grad_norm: 619.802307 | learning_rate: 0.000008 2025-04-12 23:06:04,262 - INFO - Epoch: 2.58 | loss: 82.095900 | grad_norm: 239.875229 | learning_rate: 0.000008 2025-04-12 23:06:16,585 - INFO - Epoch: 2.58 | loss: 81.415800 | grad_norm: 202.226013 | learning_rate: 0.000008 2025-04-12 23:06:28,377 - INFO - Epoch: 2.59 | loss: 76.629000 | grad_norm: 186.231400 | learning_rate: 0.000008 2025-04-12 23:06:40,254 - INFO - Epoch: 2.59 | loss: 80.442100 | grad_norm: 544.025330 | learning_rate: 0.000008 2025-04-12 23:06:52,503 - INFO - Epoch: 2.59 | loss: 80.675000 | grad_norm: 285.504578 | learning_rate: 0.000008 2025-04-12 23:07:05,059 - INFO - Epoch: 2.60 | loss: 82.865900 | grad_norm: 428.836975 | learning_rate: 0.000008 2025-04-12 23:07:16,657 - INFO - Epoch: 2.60 | loss: 82.455100 | grad_norm: 400.082458 | learning_rate: 0.000007 2025-04-12 23:07:28,145 - INFO - Epoch: 2.60 | loss: 79.177600 | grad_norm: 423.301117 | learning_rate: 0.000007 2025-04-12 23:07:39,517 - INFO - Epoch: 2.61 | loss: 78.659500 | grad_norm: 515.202820 | learning_rate: 0.000007 2025-04-12 23:07:51,488 - INFO - Epoch: 2.61 | loss: 81.787800 | grad_norm: 240.426987 | learning_rate: 0.000007 2025-04-12 23:08:02,875 - INFO - Epoch: 2.62 | loss: 76.771200 | grad_norm: 259.338806 | learning_rate: 0.000007 2025-04-12 23:08:14,617 - INFO - Epoch: 2.62 | loss: 82.189100 | grad_norm: 397.407471 | learning_rate: 0.000007 2025-04-12 23:08:26,211 - INFO - Epoch: 2.62 | loss: 76.755300 | grad_norm: 405.365387 | learning_rate: 0.000007 2025-04-12 
23:08:37,544 - INFO - Epoch: 2.63 | loss: 76.956900 | grad_norm: 606.735657 | learning_rate: 0.000007 2025-04-12 23:08:48,892 - INFO - Epoch: 2.63 | loss: 79.732000 | grad_norm: 243.972641 | learning_rate: 0.000007 2025-04-12 23:09:00,561 - INFO - Epoch: 2.63 | loss: 78.958700 | grad_norm: 262.013306 | learning_rate: 0.000007 2025-04-12 23:09:11,832 - INFO - Epoch: 2.64 | loss: 75.315500 | grad_norm: 314.028381 | learning_rate: 0.000007 2025-04-12 23:09:23,439 - INFO - Epoch: 2.64 | loss: 82.402300 | grad_norm: 404.651031 | learning_rate: 0.000007 2025-04-12 23:09:34,733 - INFO - Epoch: 2.65 | loss: 80.574200 | grad_norm: 351.466705 | learning_rate: 0.000007 2025-04-12 23:09:46,895 - INFO - Epoch: 2.65 | loss: 79.642300 | grad_norm: 347.181366 | learning_rate: 0.000007 2025-04-12 23:09:58,357 - INFO - Epoch: 2.65 | loss: 74.672500 | grad_norm: 256.862427 | learning_rate: 0.000007 2025-04-12 23:10:10,033 - INFO - Epoch: 2.66 | loss: 82.049600 | grad_norm: 203.703140 | learning_rate: 0.000006 2025-04-12 23:10:22,111 - INFO - Epoch: 2.66 | loss: 79.706500 | grad_norm: 857.007141 | learning_rate: 0.000006 2025-04-12 23:10:33,343 - INFO - Epoch: 2.66 | loss: 80.320100 | grad_norm: 437.926788 | learning_rate: 0.000006 2025-04-12 23:10:44,513 - INFO - Epoch: 2.67 | loss: 76.831500 | grad_norm: 776.444519 | learning_rate: 0.000006 2025-04-12 23:10:56,319 - INFO - Epoch: 2.67 | loss: 81.235200 | grad_norm: 263.825897 | learning_rate: 0.000006 2025-04-12 23:11:07,867 - INFO - Epoch: 2.68 | loss: 78.898400 | grad_norm: 1170.277588 | learning_rate: 0.000006 2025-04-12 23:11:19,554 - INFO - Epoch: 2.68 | loss: 82.194400 | grad_norm: 367.669678 | learning_rate: 0.000006 2025-04-12 23:11:31,010 - INFO - Epoch: 2.68 | loss: 76.540600 | grad_norm: 248.011826 | learning_rate: 0.000006 2025-04-12 23:11:43,356 - INFO - Epoch: 2.69 | loss: 82.314700 | grad_norm: 327.641815 | learning_rate: 0.000006 2025-04-12 23:11:54,884 - INFO - Epoch: 2.69 | loss: 81.987900 | grad_norm: 261.004883 | 
learning_rate: 0.000006 2025-04-12 23:12:06,348 - INFO - Epoch: 2.69 | loss: 81.608900 | grad_norm: 297.423035 | learning_rate: 0.000006 2025-04-12 23:12:17,728 - INFO - Epoch: 2.70 | loss: 78.239700 | grad_norm: 2046.290405 | learning_rate: 0.000006 2025-04-12 23:12:28,918 - INFO - Epoch: 2.70 | loss: 78.729500 | grad_norm: 256.050964 | learning_rate: 0.000006 2025-04-12 23:12:40,334 - INFO - Epoch: 2.71 | loss: 77.400300 | grad_norm: 220.292068 | learning_rate: 0.000006 2025-04-12 23:12:51,791 - INFO - Epoch: 2.71 | loss: 79.434500 | grad_norm: 257.886566 | learning_rate: 0.000005 2025-04-12 23:13:03,709 - INFO - Epoch: 2.71 | loss: 82.833900 | grad_norm: 203.510208 | learning_rate: 0.000005 2025-04-12 23:13:14,995 - INFO - Epoch: 2.72 | loss: 78.962700 | grad_norm: 377.603638 | learning_rate: 0.000005 2025-04-12 23:13:26,722 - INFO - Epoch: 2.72 | loss: 80.185200 | grad_norm: 1783.523560 | learning_rate: 0.000005 2025-04-12 23:13:38,164 - INFO - Epoch: 2.72 | loss: 76.907600 | grad_norm: 435.501556 | learning_rate: 0.000005 2025-04-12 23:13:50,832 - INFO - Epoch: 2.73 | loss: 85.946000 | grad_norm: 322.672058 | learning_rate: 0.000005 2025-04-12 23:14:02,194 - INFO - Epoch: 2.73 | loss: 75.305300 | grad_norm: 465.942566 | learning_rate: 0.000005 2025-04-12 23:14:13,299 - INFO - Epoch: 2.73 | loss: 78.587800 | grad_norm: 455.385468 | learning_rate: 0.000005 2025-04-12 23:14:24,821 - INFO - Epoch: 2.74 | loss: 77.549700 | grad_norm: 343.252594 | learning_rate: 0.000005 2025-04-12 23:14:35,824 - INFO - Epoch: 2.74 | loss: 79.096600 | grad_norm: 313.555511 | learning_rate: 0.000005 2025-04-12 23:14:47,025 - INFO - Epoch: 2.75 | loss: 73.799100 | grad_norm: 297.449768 | learning_rate: 0.000005 2025-04-12 23:14:59,254 - INFO - Epoch: 2.75 | loss: 83.589900 | grad_norm: 278.167419 | learning_rate: 0.000005 2025-04-12 23:15:10,339 - INFO - Epoch: 2.75 | loss: 81.149100 | grad_norm: 313.655273 | learning_rate: 0.000005 2025-04-12 23:15:22,089 - INFO - Epoch: 2.76 | loss: 
77.579100 | grad_norm: 442.008179 | learning_rate: 0.000005 2025-04-12 23:15:33,471 - INFO - Epoch: 2.76 | loss: 79.631700 | grad_norm: 369.895142 | learning_rate: 0.000005 2025-04-12 23:15:44,776 - INFO - Epoch: 2.76 | loss: 74.956500 | grad_norm: 298.198547 | learning_rate: 0.000004 2025-04-12 23:15:55,875 - INFO - Epoch: 2.77 | loss: 80.171900 | grad_norm: 1153.740356 | learning_rate: 0.000004 2025-04-12 23:16:07,108 - INFO - Epoch: 2.77 | loss: 75.521000 | grad_norm: 351.966400 | learning_rate: 0.000004 2025-04-12 23:16:18,465 - INFO - Epoch: 2.78 | loss: 74.157900 | grad_norm: 368.717010 | learning_rate: 0.000004 2025-04-12 23:16:30,676 - INFO - Epoch: 2.78 | loss: 80.424100 | grad_norm: 757.187256 | learning_rate: 0.000004 2025-04-12 23:16:42,756 - INFO - Epoch: 2.78 | loss: 89.403900 | grad_norm: 249.584763 | learning_rate: 0.000004 2025-04-12 23:16:54,594 - INFO - Epoch: 2.79 | loss: 80.611500 | grad_norm: 253.722397 | learning_rate: 0.000004 2025-04-12 23:17:06,425 - INFO - Epoch: 2.79 | loss: 77.212100 | grad_norm: 232.492065 | learning_rate: 0.000004 2025-04-12 23:17:18,103 - INFO - Epoch: 2.79 | loss: 77.506400 | grad_norm: 992.415283 | learning_rate: 0.000004 2025-04-12 23:17:29,909 - INFO - Epoch: 2.80 | loss: 79.990800 | grad_norm: 416.971710 | learning_rate: 0.000004 2025-04-12 23:17:41,334 - INFO - Epoch: 2.80 | loss: 78.426500 | grad_norm: 249.799988 | learning_rate: 0.000004 2025-04-12 23:17:52,717 - INFO - Epoch: 2.81 | loss: 80.999700 | grad_norm: 581.044800 | learning_rate: 0.000004 2025-04-12 23:18:05,046 - INFO - Epoch: 2.81 | loss: 80.712700 | grad_norm: 258.616364 | learning_rate: 0.000004 2025-04-12 23:18:16,224 - INFO - Epoch: 2.81 | loss: 74.856800 | grad_norm: 329.834656 | learning_rate: 0.000004 2025-04-12 23:18:28,075 - INFO - Epoch: 2.82 | loss: 78.513900 | grad_norm: 267.077393 | learning_rate: 0.000003 2025-04-12 23:18:38,495 - INFO - Epoch: 2.82 | loss: 77.156500 | grad_norm: 467.333221 | learning_rate: 0.000003 2025-04-12 
23:18:50,455 - INFO - Epoch: 2.82 | loss: 79.790300 | grad_norm: 424.596588 | learning_rate: 0.000003 2025-04-12 23:19:01,701 - INFO - Epoch: 2.83 | loss: 78.019400 | grad_norm: 278.298431 | learning_rate: 0.000003 2025-04-12 23:19:13,622 - INFO - Epoch: 2.83 | loss: 81.020500 | grad_norm: 1730.070679 | learning_rate: 0.000003 2025-04-12 23:19:25,828 - INFO - Epoch: 2.84 | loss: 82.310400 | grad_norm: 449.321686 | learning_rate: 0.000003 2025-04-12 23:19:37,594 - INFO - Epoch: 2.84 | loss: 78.182700 | grad_norm: 312.809174 | learning_rate: 0.000003 2025-04-12 23:19:49,537 - INFO - Epoch: 2.84 | loss: 83.078700 | grad_norm: 1833.748169 | learning_rate: 0.000003 2025-04-12 23:20:01,625 - INFO - Epoch: 2.85 | loss: 81.809700 | grad_norm: 1671.613037 | learning_rate: 0.000003 2025-04-12 23:20:13,570 - INFO - Epoch: 2.85 | loss: 78.227500 | grad_norm: 255.329102 | learning_rate: 0.000003 2025-04-12 23:20:24,270 - INFO - Epoch: 2.85 | loss: 78.572300 | grad_norm: 559.833740 | learning_rate: 0.000003 2025-04-12 23:20:36,095 - INFO - Epoch: 2.86 | loss: 80.275900 | grad_norm: 654.349182 | learning_rate: 0.000003 2025-04-12 23:20:47,381 - INFO - Epoch: 2.86 | loss: 80.333200 | grad_norm: 366.905273 | learning_rate: 0.000003 2025-04-12 23:20:59,250 - INFO - Epoch: 2.87 | loss: 80.924600 | grad_norm: 542.957092 | learning_rate: 0.000003 2025-04-12 23:21:10,581 - INFO - Epoch: 2.87 | loss: 78.139700 | grad_norm: 273.627380 | learning_rate: 0.000003 2025-04-12 23:21:22,619 - INFO - Epoch: 2.87 | loss: 80.985200 | grad_norm: 452.521606 | learning_rate: 0.000002 2025-04-12 23:21:34,324 - INFO - Epoch: 2.88 | loss: 82.954400 | grad_norm: 266.111725 | learning_rate: 0.000002 2025-04-12 23:21:45,528 - INFO - Epoch: 2.88 | loss: 78.400900 | grad_norm: 375.694946 | learning_rate: 0.000002 2025-04-12 23:21:57,267 - INFO - Epoch: 2.88 | loss: 76.004400 | grad_norm: 357.928345 | learning_rate: 0.000002 2025-04-12 23:22:09,399 - INFO - Epoch: 2.89 | loss: 87.577600 | grad_norm: 
1232.872925 | learning_rate: 0.000002
2025-04-12 23:22:21,643 - INFO - Epoch: 2.89 | loss: 82.131500 | grad_norm: 2075.977295 | learning_rate: 0.000002
2025-04-12 23:22:33,033 - INFO - Epoch: 2.90 | loss: 77.038000 | grad_norm: 478.623901 | learning_rate: 0.000002
2025-04-12 23:22:44,785 - INFO - Epoch: 2.90 | loss: 79.255000 | grad_norm: 316.742279 | learning_rate: 0.000002
2025-04-12 23:22:56,365 - INFO - Epoch: 2.90 | loss: 80.573400 | grad_norm: 342.316162 | learning_rate: 0.000002
2025-04-12 23:23:07,929 - INFO - Epoch: 2.91 | loss: 77.238900 | grad_norm: 242.371933 | learning_rate: 0.000002
2025-04-12 23:23:19,894 - INFO - Epoch: 2.91 | loss: 78.767100 | grad_norm: 619.030396 | learning_rate: 0.000002
2025-04-12 23:23:31,678 - INFO - Epoch: 2.91 | loss: 84.104900 | grad_norm: 342.939728 | learning_rate: 0.000002
2025-04-12 23:23:42,735 - INFO - Epoch: 2.92 | loss: 76.489600 | grad_norm: 344.625793 | learning_rate: 0.000002
2025-04-12 23:23:54,603 - INFO - Epoch: 2.92 | loss: 78.906400 | grad_norm: 666.238403 | learning_rate: 0.000002
2025-04-12 23:24:05,970 - INFO - Epoch: 2.93 | loss: 78.459700 | grad_norm: 293.031403 | learning_rate: 0.000001
2025-04-12 23:24:17,499 - INFO - Epoch: 2.93 | loss: 81.440800 | grad_norm: 509.930389 | learning_rate: 0.000001
2025-04-12 23:24:28,478 - INFO - Epoch: 2.93 | loss: 75.467100 | grad_norm: 281.382324 | learning_rate: 0.000001
2025-04-12 23:24:40,307 - INFO - Epoch: 2.94 | loss: 80.108100 | grad_norm: 392.653381 | learning_rate: 0.000001
2025-04-12 23:24:52,007 - INFO - Epoch: 2.94 | loss: 80.265300 | grad_norm: 378.590881 | learning_rate: 0.000001
2025-04-12 23:25:03,399 - INFO - Epoch: 2.94 | loss: 76.062000 | grad_norm: 244.041885 | learning_rate: 0.000001
2025-04-12 23:25:14,842 - INFO - Epoch: 2.95 | loss: 77.125000 | grad_norm: 287.468445 | learning_rate: 0.000001
2025-04-12 23:25:26,296 - INFO - Epoch: 2.95 | loss: 76.246000 | grad_norm: 789.617798 | learning_rate: 0.000001
2025-04-12 23:25:37,543 - INFO - Epoch: 2.95 | loss: 83.610200 | grad_norm: 2048.908936 | learning_rate: 0.000001
2025-04-12 23:25:49,070 - INFO - Epoch: 2.96 | loss: 79.881600 | grad_norm: 322.371307 | learning_rate: 0.000001
2025-04-12 23:26:00,589 - INFO - Epoch: 2.96 | loss: 77.664300 | grad_norm: 1425.309814 | learning_rate: 0.000001
2025-04-12 23:26:12,935 - INFO - Epoch: 2.97 | loss: 81.058300 | grad_norm: 346.167603 | learning_rate: 0.000001
2025-04-12 23:26:24,289 - INFO - Epoch: 2.97 | loss: 83.264400 | grad_norm: 283.395416 | learning_rate: 0.000001
2025-04-12 23:26:36,156 - INFO - Epoch: 2.97 | loss: 80.940800 | grad_norm: 446.404724 | learning_rate: 0.000001
2025-04-12 23:26:48,091 - INFO - Epoch: 2.98 | loss: 79.208800 | grad_norm: 347.265259 | learning_rate: 0.000001
2025-04-12 23:26:59,800 - INFO - Epoch: 2.98 | loss: 78.039300 | grad_norm: 231.173203 | learning_rate: 0.000000
2025-04-12 23:27:11,241 - INFO - Epoch: 2.98 | loss: 76.332100 | grad_norm: 233.059311 | learning_rate: 0.000000
2025-04-12 23:27:22,184 - INFO - Epoch: 2.99 | loss: 75.756800 | grad_norm: 554.188965 | learning_rate: 0.000000
2025-04-12 23:27:33,906 - INFO - Epoch: 2.99 | loss: 83.428500 | grad_norm: 336.550995 | learning_rate: 0.000000
2025-04-12 23:27:45,037 - INFO - Epoch: 3.00 | loss: 77.007400 | grad_norm: 287.872101 | learning_rate: 0.000000
2025-04-12 23:27:57,042 - INFO - Epoch: 3.00 | loss: 82.761700 | grad_norm: 224.624268 | learning_rate: 0.000000
2025-04-12 23:27:59,920 - INFO - Epoch: 3.00 | train_runtime: 9400.695500 | train_samples_per_second: 13.687000 | train_steps_per_second: 0.855000 | total_flos: 0.000000 | train_loss: 85.018167
2025-04-12 23:28:00,061 - INFO - Training complete. Attempting to save final model...
2025-04-12 23:28:00,061 - INFO - Attempting standard model save...
2025-04-12 23:28:01,072 - INFO - Model successfully saved with standard method to gliner_finetuned_v3
2025-04-12 23:28:01,072 - INFO - Model successfully saved to gliner_finetuned_v3
2025-04-12 23:28:01,072 - INFO - Testing the saved model...
2025-04-12 23:28:01,072 - INFO - Testing the saved model...
2025-04-12 23:28:04,698 - INFO - Model loaded successfully with standard method
2025-04-12 23:28:04,698 - INFO - Running prediction on test text...
2025-04-12 23:28:07,328 - INFO - Predicted entities:
2025-04-12 23:28:07,328 - INFO - Ola Nordmann => PERSON
2025-04-12 23:28:07,328 - INFO - 15.04.2025 => DATE_TIME
2025-04-12 23:28:07,328 - INFO - Kari Hansen => PERSON
2025-04-12 23:28:07,328 - INFO - Storgata 123, Oslo => NO_ADDRESS
2025-04-12 23:28:07,328 - INFO - +47 98765432 => NO_PHONE_NUMBER
2025-04-12 23:28:07,328 - INFO - kari.hansen@example.no => EMAIL_ADDRESS
2025-04-12 23:28:07,328 - INFO - sesongallergi => HEALTH_INFO
2025-04-12 23:28:07,328 - INFO - Model testing completed successfully!
2025-04-12 23:28:07,502 - INFO - Performing detailed evaluation...
2025-04-12 23:28:07,502 - INFO - Performing detailed model evaluation...
2025-04-12 23:31:01,119 - INFO - Evaluation Results:
2025-04-12 23:31:01,120 - INFO - Entity Type           Precision     Recall   F1 Score    Support
2025-04-12 23:31:01,120 - INFO - AGE                      0.0000     0.0000     0.0000          0
2025-04-12 23:31:01,120 - INFO - AGE_INFO                 0.0000     0.0000     0.0000          0
2025-04-12 23:31:01,120 - INFO - ANIMAL_INFO              0.0000     0.0000     0.0000          0
2025-04-12 23:31:01,120 - INFO - BEHAVIORAL_PATTERN       0.0086     0.0076     0.0081       1572
2025-04-12 23:31:01,120 - INFO - CONTEXT_SENSITIVE        0.0014     0.0009     0.0011       2140
2025-04-12 23:31:01,120 - INFO - CRIMINAL_RECORD          0.0005     0.0006     0.0005       3509
2025-04-12 23:31:01,120 - INFO - DATE_TIME                0.0001     0.0002     0.0001       5754
2025-04-12 23:31:01,120 - INFO - ECONOMIC_STATUS          0.0066     0.0055     0.0060       1453
2025-04-12 23:31:01,120 - INFO - EMAIL_ADDRESS            0.0000     0.0000     0.0000       3063
2025-04-12 23:31:01,120 - INFO - EMPLOYMENT_INFO          0.0000     0.0000     0.0000       2142
2025-04-12 23:31:01,120 - INFO - FAMILY_RELATION          0.0044     0.0050     0.0047       2603
2025-04-12 23:31:01,120 - INFO - FINANCIAL_INFO           0.0000     0.0000     0.0000       2121
2025-04-12 23:31:01,120 - INFO - GOV_ID                   0.0000     0.0000     0.0000       2226
2025-04-12 23:31:01,120 - INFO - HEALTH_INFO              0.0007     0.0008     0.0008       4968
2025-04-12 23:31:01,120 - INFO - IDENTIFIABLE_IMAGE       0.0000     0.0000     0.0000       1451
2025-04-12 23:31:01,120 - INFO - NO_ADDRESS               0.0000     0.0000     0.0000       4254
2025-04-12 23:31:01,120 - INFO - NO_PHONE_NUMBER          0.0000     0.0000     0.0000       3084
2025-04-12 23:31:01,120 - INFO - PERSON                   0.0002     0.0002     0.0002       8025
2025-04-12 23:31:01,120 - INFO - POLITICAL_CASE           0.0010     0.0010     0.0010       1919
2025-04-12 23:31:01,120 - INFO - POSTAL_CODE              0.0299     0.0324     0.0311        710
2025-04-12 23:31:01,120 - INFO - SEXUAL_ORIENTATION       0.0083     0.0084     0.0083       1197
2025-04-12 23:31:01,120 - INFO - --------------------------------------------------------------------------------
2025-04-12 23:31:01,120 - INFO - Overall                  0.0013     0.0015     0.0014      52191
2025-04-12 23:31:01,121 - WARNING - Could not create confusion matrix visualization: No module named 'matplotlib'
2025-04-12 23:31:01,121 - INFO - Evaluation results saved to gliner_finetuned_v3/evaluation_results.json
2025-04-12 23:31:01,121 - INFO - Model testing completed successfully!
2025-04-12 23:31:01,121 - INFO - ---------------------------------------
2025-04-12 23:31:01,121 - INFO - Final Model Evaluation Test
2025-04-12 23:31:01,121 - INFO - ---------------------------------------
2025-04-12 23:31:05,037 - INFO - Running comprehensive entity detection test...
2025-04-12 23:31:11,532 - INFO - Detected entities in comprehensive test:
2025-04-12 23:31:11,532 - INFO - Ola Nordmann => PERSON (confidence: 0.995)
2025-04-12 23:31:11,533 - INFO - 22 40 00 00 => NO_PHONE_NUMBER (confidence: 0.986)
2025-04-12 23:31:11,533 - INFO - postmottak@mattilsynet.no => EMAIL_ADDRESS (confidence: 0.974)
2025-04-12 23:31:11,533 - INFO - Felles postmottak, Postboks 383 => NO_ADDRESS (confidence: 0.760)
2025-04-12 23:31:11,533 - INFO - 2381 => POSTAL_CODE (confidence: 0.541)
2025-04-12 23:31:11,533 - INFO - Grønn Glede Planteskole => CONTEXT_SENSITIVE (confidence: 0.572)
2025-04-12 23:31:11,533 - INFO - 22. februar 2025 => DATE_TIME (confidence: 0.975)
2025-04-12 23:31:11,533 - INFO - Førsteinspektør => EMPLOYMENT_INFO (confidence: 0.735)
2025-04-12 23:31:11,533 - INFO - Kari Hansen => PERSON (confidence: 0.942)
2025-04-12 23:31:11,533 - INFO - seniorinspektør => EMPLOYMENT_INFO (confidence: 0.795)
2025-04-12 23:31:11,533 - INFO - Eier og daglig leder => EMPLOYMENT_INFO (confidence: 0.710)
2025-04-12 23:31:11,533 - INFO - Per Olsen => PERSON (confidence: 0.938)
2025-04-12 23:31:11,533 - INFO - 963789361 => CONTEXT_SENSITIVE (confidence: 0.474)
2025-04-12 23:31:11,533 - INFO - mistenkelig import av planter uten nødvendig plantesunnhetssertifikat => CONTEXT_SENSITIVE (confidence: 0.511)
2025-04-12 23:31:11,533 - INFO - leverandørkjeder => CONTEXT_SENSITIVE (confidence: 0.409)
2025-04-12 23:31:11,533 - INFO - finansielle transaksjoner for planteanskaffelser => CONTEXT_SENSITIVE (confidence: 0.416)
2025-04-12 23:31:11,533 - INFO - medlemskap i bransjeorganisasjoner som garanterer etisk handel => CONTEXT_SENSITIVE (confidence: 0.546)
2025-04-12 23:31:11,533 - INFO - inkonsistenser i økonomisk dokumentasjon knyttet til planteinnkjøp => CONTEXT_SENSITIVE (confidence: 0.422)
2025-04-12 23:31:11,533 - INFO - udeklarert import => CONTEXT_SENSITIVE (confidence: 0.469)
2025-04-12 23:31:11,533 - INFO - fakturaer => FINANCIAL_INFO (confidence: 0.551)
2025-04-12 23:31:11,533 - INFO - betalingsbekreftelser => FINANCIAL_INFO (confidence: 0.481)
2025-04-12 23:31:11,533 - INFO - Training and evaluation completed. Model is ready for use.
2025-04-12 23:31:11,533 - INFO - ===============================================
2025-04-12 23:31:11,533 - INFO - GLiNER training process complete. Output directory: gliner_finetuned_v3
2025-04-12 23:31:11,534 - INFO - ===============================================