---
license: openrail
language:
- en
metrics:
- f1
library_name: fairseq
pipeline_tag: audio-classification
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

We explore the benefits of unsupervised pretraining of wav2vec 2.0 (W2V2) using large-scale unlabeled home recordings collected with LittleBeats and LENA (Language Environment Analysis) devices.
LittleBeats (LB) is a new infant wearable multi-modal device that we developed, which simultaneously records audio, infant movement, and heart-rate variability.
We use W2V2 to advance the LB audio pipeline so that it automatically provides reliable speaker diarization and vocalization classification labels for family members, including infants, parents, and siblings, at home.
We show that W2V2 pretrained on thousands of hours of large-scale unlabeled home audio outperforms the oracle W2V2, pretrained on 52k hours of audio released by Facebook/Meta, on automatic family audio analysis tasks.

# Model Details

## Model Description

<!-- Provide a longer summary of what this model is. -->
Two versions of the pretrained W2V2 model are available:

- **LB1100/checkpoint_best.pt**: pretrained on 1,100 hours of LB home recordings collected from 110 families of children under 5 years old
- **LL4300/checkpoint_best.pt**: pretrained on 1,100 hours of LB home recordings from 110 families plus 3,200 hours of LENA home recordings from 275 families of children under 5 years old

## Model Sources [optional]

<!-- Provide the basic links for the model. -->
For more information regarding this model, please check out our paper.

- **Paper [optional]:** [More Information Needed]

# Uses

We develop a fine-tuning recipe using the SpeechBrain toolkit, available at:

- **Repository:** https://github.com/jialuli3/speechbrain/tree/infant-voc-classification/recipes/wav2vec_kic

## Quick Start [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
If you wish to use the fairseq framework, the following code snippet can be used to load the pretrained model:

[More Information Needed]

# Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->
We compare four variants of the unsupervised pretrained W2V2-base model:

- **base (oracle version):** originally released model, pretrained on ~52k hours of unlabeled audio
- **Libri960h:** oracle version fine-tuned on 960 hours of LibriSpeech
- **LB1100h:** W2V2 pretrained on 1,100 hours of LB home recordings
- **LL4300h:** W2V2 pretrained on 4,300 hours of LB+LENA home recordings

We then fine-tune the pretrained models on 11.7 hours of labeled LB home recordings and compare F1 scores across three tasks.
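For reference on the reported metric, here is a self-contained sketch of how per-class and macro-averaged F1 can be computed from label lists (illustrative only, with made-up toy labels; the actual evaluation uses the SpeechBrain recipe linked above):

```python
def f1_scores(y_true, y_pred):
    """Return (per-class F1 dict, macro-averaged F1) for two label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        per_class[c] = (2 * precision * recall / (precision + recall)
                        if precision + recall else 0.0)
    macro = sum(per_class.values()) / len(per_class)
    return per_class, macro

# Toy example with three hypothetical speaker labels
y_true = ["infant", "parent", "infant", "sibling", "parent"]
y_pred = ["infant", "parent", "parent", "sibling", "parent"]
per_class, macro = f1_scores(y_true, y_pred)
```

Macro averaging weights every class equally, which matters for home recordings where infant and sibling vocalizations are much rarer than adult speech.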

# Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
If you find this model helpful, please cite us as:

**BibTeX:**

# Model Card Contact

Jialu Li (she/her/hers)

Ph.D. candidate, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign

E-mail: [email protected]

Homepage: https://sites.google.com/view/jialuli/