ibm-granite
/

granite-3.2-8b-instruct

@@ -191,7 +191,7 @@ So, you need to add 10 liters of a 70% acid solution to the 10 liters of a 30% a
 **Evaluation Results:**
 <table>
 <thead>
   <tr>
     <th style="text-align:left; background-color: #001d6c; color: white;">Models</th>
@@ -309,7 +309,7 @@ So, you need to add 10 liters of a 70% acid solution to the 10 liters of a 30% a
   <tr>
       <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-3.2-2B-Instruct</td>
-    <td style="text-align:center; background-color: #DAE8FF; color: black;">24.86</td>
     <td style="text-align:center; background-color: #DAE8FF; color: black;">34.51</td>
     <td style="text-align:center; background-color: #DAE8FF; color: black;">57.18</td>
     <td style="text-align:center; background-color: #DAE8FF; color: black;">20.56</td>
@@ -340,10 +340,54 @@ So, you need to add 10 liters of a 70% acid solution to the 10 liters of a 30% a
   </tr>
 </tbody></table>
 **Training Data:**
 Overall, our training data consists largely of two key sources: (1) publicly available datasets with permissive licenses, and (2) internal synthetically generated data targeted at enhancing reasoning capabilities.
 <!-- A detailed attribution of datasets can be found in [Granite 3.2 Technical Report (coming soon)](#), and [Accompanying Author List](https://github.com/ibm-granite/granite-3.0-language-models/blob/main/author-ack.pdf). -->

 **Evaluation Results:**
 <table>
+  <caption><b>Comparison with Other Models</b></caption>
 <thead>
   <tr>
     <th style="text-align:left; background-color: #001d6c; color: white;">Models</th>
   <tr>
       <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-3.2-2B-Instruct</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">26.6</td>
     <td style="text-align:center; background-color: #DAE8FF; color: black;">34.51</td>
     <td style="text-align:center; background-color: #DAE8FF; color: black;">57.18</td>
     <td style="text-align:center; background-color: #DAE8FF; color: black;">20.56</td>
   </tr>
 </tbody></table>
+<table>
+  <caption><b>Thinking Ablation</b></caption>
+<thead>
+  <tr>
+    <th rowspan="2" style="text-align:left; background-color: #001d6c; color: white;">Models</th>
+    <th colspan="2" style="text-align:center; background-color: #001d6c; color: white;">Thinking=False</th>
+    <th colspan="2" style="text-align:center; background-color: #001d6c; color: white;">Thinking=True</th>
+  </tr>
+  <tr>
+    <th style="text-align:center; background-color: #001d6c; color: white;">ArenaHard</th>
+    <th style="text-align:center; background-color: #001d6c; color: white;">Alpaca-Eval-2</th>
+    <th style="text-align:center; background-color: #001d6c; color: white;">ArenaHard</th>
+    <th style="text-align:center; background-color: #001d6c; color: white;">Alpaca-Eval-2</th>
+  </tr></thead>
+    <tbody>
+         <tr>
+    <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-3.1-8B-Instruct</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">37.58</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">30.34</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+        </tr>
+         <tr>
+    <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-3.1-2B-Instruct</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">23.3</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">27.17</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+        </tr>
+         <tr>
+    <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-3.2-2B-Instruct</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">30.42</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">31.65</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">26.6</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">34.51</td>
+        </tr>
+         <tr>
+    <td style="text-align:left; background-color: #DAE8FF; color: black;"><b>Granite-3.2-8B-Instruct</b></td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">40.54</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">36.89</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">55.25</td>
+    <td style="text-align:center; background-color: #DAE8FF; color: black;">61.19</td>
+        </tr>
+    </tbody>
+</table>
 **Training Data:**
 Overall, our training data consists largely of two key sources: (1) publicly available datasets with permissive licenses, and (2) internal synthetically generated data targeted at enhancing reasoning capabilities.
 <!-- A detailed attribution of datasets can be found in [Granite 3.2 Technical Report (coming soon)](#), and [Accompanying Author List](https://github.com/ibm-granite/granite-3.0-language-models/blob/main/author-ack.pdf). -->