Nasjonalbiblioteket AI Lab

Enterprise

non-profit

Verified

https://ai.nb.no/

NbAiLab

Activity Feed

Recent Activity

versae new activity 8 days ago

NbAiLab/nb-bert-ncc-male2female:Adding `safetensors` variant of this model

pere updated a dataset 9 days ago

NbAiLab/nb_distil_speech_noconcat_stortinget

versae updated a Space 11 days ago

NbAiLab/whisper-sami-demo

NbAiLab's activity

versae

in NbAiLab/nb-bert-ncc-male2female 8 days ago

Adding `safetensors` variant of this model

#1 opened 9 days ago by

SFconvertbot

pere

updated a dataset 9 days ago

NbAiLab/nb_distil_speech_noconcat_stortinget

Viewer • Updated 9 days ago • 198k • 434 • 1

versae

updated a Space 11 days ago

Whisper Northern Sámi Demo

🤫

Transcribe audio files and YouTube videos into text

versae

in NbAiLab/whisper-norwegian-small-test 15 days ago

Adding `safetensors` variant of this model

#1 opened 5 months ago by

SFconvertbot

versae

in NbAiLab/notram-bert-norwegian-cased-pod-030421 15 days ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by

SFconvertbot

versae

in NbAiLab/roberta_NCC_des_128 15 days ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by

SFconvertbot

versae

in NbAiLab/whisper-small-nob 15 days ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by

SFconvertbot

versae

in NbAiLab/xls-r-300m-sv2 15 days ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by

SFconvertbot

versae

in NbAiLab/nb-global-mmlu 15 days ago

[bot] Conversion to Parquet

#1 opened 2 months ago by

parquet-converter

versae

in NbAiLab/xls-npsc-oh 15 days ago

Adding `safetensors` variant of this model

#2 opened 2 months ago by

SFconvertbot

versae

in NbAiLab/xls-npsc 15 days ago

Adding `safetensors` variant of this model

#2 opened 2 months ago by

SFconvertbot

versae

in NbAiLab/roberta_des_512 15 days ago

Adding `safetensors` variant of this model

#1 opened 2 months ago by

SFconvertbot

davanstrien

posted an update 16 days ago

Post

1960

Came across a very nice submission from @marcodsn for the reasoning datasets competition (https://huggingface.co/blog/bespokelabs/reasoning-datasets-competition).

The dataset distils reasoning chains from arXiv research papers in biology and economics. Some nice features of the dataset:

- Extracts both the logical structure AND researcher intuition from academic papers
- Adopts the persona of researchers "before experiments" to capture exploratory thinking
- Provides multi-short and single-long reasoning formats with token budgets
- Shows 7.2% improvement on MMLU-Pro Economics when fine-tuning a 3B model

It's created using the Curator framework with plans to scale across more scientific domains and incorporate multi-modal reasoning with charts and mathematics.

I personally am very excited about datasets like this, which involve creativity in their creation and don't just rely on $$$ to produce a big dataset with little novelty.

Dataset can be found here: marcodsn/academic-chains (give it a like!)

titae

authored a paper 17 days ago

NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark

Paper • 2504.07749 • Published 29 days ago

versae

in NbAiLab/nb-sau-7b-4k-step100k 29 days ago

Adding `safetensors` variant of this model

#1 opened 29 days ago by

SFconvertbot

davanstrien

posted an update 29 days ago

Post

1662

I've created a v1 dataset (davanstrien/reasoning-required) and model (davanstrien/ModernBERT-based-Reasoning-Required) to help curate "wild text" data for generating reasoning examples beyond the usual code/math/science domains.

- I developed a "Reasoning Required" dataset with a 0-4 scoring system for reasoning complexity
- I used educational content from HuggingFaceFW/fineweb-edu, adding annotations for domains, reasoning types, and example questions

My approach enables a more efficient workflow: filter text with small models first, then use LLMs only on high-value content.

This significantly reduces computation costs while expanding reasoning dataset domain coverage.
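The filter-first workflow described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: `two_stage_filter`, `toy_score`, and `toy_llm` are hypothetical names, the cue-word scorer is a stand-in for a small classifier producing a 0-4 score, and the threshold of 3 is arbitrary. None of this is taken from the actual dataset or model.

```python
from typing import Callable, Iterable, List, Tuple


def two_stage_filter(
    texts: Iterable[str],
    cheap_score: Callable[[str], float],
    expensive_step: Callable[[str], str],
    min_score: float = 3.0,
) -> List[Tuple[str, str]]:
    """Call expensive_step only on texts whose cheap score is >= min_score."""
    results = []
    for text in texts:
        # Stage 1: inexpensive scoring (a small model in the described workflow).
        if cheap_score(text) >= min_score:
            # Stage 2: costly generation, reserved for high-scoring texts.
            results.append((text, expensive_step(text)))
    return results


def toy_score(text: str) -> float:
    """Hypothetical stand-in scorer: counts reasoning cue words, capped at 4."""
    cues = ("because", "therefore", "however", "if")
    words = text.lower().split()
    return float(min(4, sum(words.count(cue) for cue in cues)))


def toy_llm(text: str) -> str:
    """Hypothetical stand-in for an LLM call that drafts a reasoning question."""
    return f"What reasoning supports this passage? {text}"


if __name__ == "__main__":
    corpus = [
        "The sky is blue.",
        "If rates rise, demand falls because borrowing costs more; therefore prices drop.",
    ]
    for source, generated in two_stage_filter(corpus, toy_score, toy_llm):
        print(generated)
```

In practice the scorer would be replaced by a call to a small classification model and `expensive_step` by a real LLM request; the control flow stays the same, and the cost saving comes from how many texts fall below the threshold.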

pere

in NbAiLab/nb-whisper-tiny about 1 month ago

Replace small with tiny

#2 opened about 1 month ago by

PierreMesure

davanstrien

published a Space about 1 month ago

Argilla

✍

davanstrien

posted an update 2 months ago

Post

2933

📊 Introducing "Hugging Face Dataset Spotlight" 📊

I'm excited to share the first episode of our AI-generated podcast series focusing on nice datasets from the Hugging Face Hub!

This first episode explores mathematical reasoning datasets:

- SynthLabsAI/Big-Math-RL-Verified: Over 250,000 rigorously verified problems spanning multiple difficulty levels and mathematical domains
- open-r1/OpenR1-Math-220k: 220,000 math problems with multiple reasoning traces, verified for accuracy using Math Verify and Llama-3.3-70B models.
- facebook/natural_reasoning: 1.1 million general reasoning questions carefully deduplicated and decontaminated from existing benchmarks, showing superior scaling effects when training models like Llama3.1-8B-Instruct.

Plus a bonus segment on bespokelabs/bespoke-manim!

https://www.youtube.com/watch?v=-TgmRq45tW4

davanstrien

posted an update 2 months ago

Post

3682

Quick POC: Turn a Hugging Face dataset card into a short podcast introducing the dataset using all open models.

I think I'm the only weirdo who would enjoy listening to something like this though 😅

Here is an example for eth-nlped/stepverify

Team members 15
