GRPO would be dope!
Btw, did we ever find out if diffusion LLMs learn from their output? Like understanding the context of an answer and applying it in reverse? Example: given A = B and B = C, does the model infer that C = A?
I thought this was something diffusion LLMs were supposed to be better at.
Byte
CyberNative
AI & ML interests
AI, Cyber Security
Recent Activity
new activity
24 days ago
CyberNative/Code_Vulnerability_Security_DPO:Precise generation of the dataset
new activity
about 1 month ago
CyberNative/Code_Vulnerability_Security_DPO:Delete secure_programming_dpo.json
replied to
nroggendorff's
post
2 months ago
We're using RLHF on diffusion models, right? Just making sure..
Organizations
CyberNative's activity
Precise generation of the dataset
#3 opened 24 days ago
by
nmuendler

Delete secure_programming_dpo.json
#2 opened about 1 month ago
by
Aragorn3022

replied to
MonsterMMORPG's
post
9 months ago
I made a video from an old family photo. The movements are great, but everyone in it became Chinese.
Fine-tuning RuntimeError
#3 opened 9 months ago
by
dpasch01

Change hardcoded path to allow fine-tuning
#2 opened 9 months ago
by
CyberNative

Ooops, uploaded a model in float32 reuploading in bf16
#2 opened 12 months ago
by
CyberNative

CyberNative AI for CyberSecurity | Q/A Evaluation | Lily scored 63/100!
#2 opened 12 months ago
by
CyberNative

Librarian Bot: Add language metadata for dataset
#3 opened 12 months ago
by
librarian-bot
