How to run the Hugging Face model?
I was able to get the model working with the GitHub code after commenting out the @torch.compile decorator over the fused flash attention (Triton problems, maybe?), but I was wondering whether there's a better way to run this with diffusers or something similar.
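Rather than commenting out the decorator in the source, one workaround is to replace torch.compile with a no-op before the model code is imported (newer PyTorch versions also honor the TORCHDYNAMO_DISABLE=1 environment variable). This is only a sketch of the no-op-decorator pattern: the torch object below is a stdlib stand-in for illustration, and fused_flash_attention is a made-up placeholder; in practice you would import the real torch and overwrite its compile attribute.

```python
import types

# Stand-in torch module purely for illustration; in practice you would
# `import torch` and overwrite the real `torch.compile` before importing
# the model code that applies the decorator.
torch = types.SimpleNamespace()

def _no_compile(fn=None, **kwargs):
    # torch.compile is used both bare (@torch.compile) and with
    # arguments (@torch.compile(mode=...)); support both call forms.
    if fn is None:
        return lambda f: f
    return fn

torch.compile = _no_compile

@torch.compile
def fused_flash_attention(x):
    return x + 1  # placeholder body

print(fused_flash_attention(41))  # runs eagerly; no Triton kernels involved
```

The patched function runs as plain eager Python, which sidesteps Triton entirely at the cost of losing the compiled kernel's speed.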
Can you share more information about your environment to reproduce this issue? (what version of pytorch are you using?)
I followed the instructions and ran conda env create -f requirements.yaml, but ended up upgrading Python after an issue with torchvision throwing errors on import.
name: bd3lm
channels:
  - pytorch
  - anaconda
  - nvidia
  - defaults
dependencies:
  - cuda-nvcc=12.4.99
  - jupyter=1.0.0
  - pip=23.3.1
  - python=3.10
  - pytorch=2.6
  - pip:
      - datasets==2.18.0
      - einops==0.7.0
      - fsspec==2024.2.0
      - git-lfs==1.6
      - h5py==3.10.0
      - hydra-core==1.3.2
      - ipdb==0.13.13
      - lightning==2.2.1
      - notebook==7.1.1
      - nvitop==1.3.2
      - omegaconf==2.3.0
      - packaging==23.2
      - pandas==2.2.1
      - rich==13.7.1
      - seaborn==0.13.2
      - scikit-learn==1.4.0
      - timm==0.9.16
      - transformers==4.38.2
      - triton==2.2.0
      - wandb==0.13.5
I think the torchvision error might have something to do with the deprecation of the pytorch conda channel, but I'm not sure.
I was able to get the Hugging Face model to work! Super exciting stuff. I was really just curious about the roadmap for implementing this in a more traditional, HF-flavored way via diffusers or similar, since obviously there's no way to use the built-in .generate() with a MaskedLM.
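For intuition, the decoding loop such a .generate() wrapper would need looks roughly like iterative unmasking: append a block of mask tokens, score every masked position, and commit the most confident predictions a few at a time. Below is a toy stdlib-only sketch of that loop; toy_scores stands in for a real MaskedLM forward pass, and all names (generate_block, VOCAB, etc.) are invented for illustration, not the project's API.

```python
import random

MASK = "<mask>"
VOCAB = ["the", "cat", "sat", "on", "mat"]

def toy_scores(tokens, pos):
    # Stand-in for a MaskedLM forward pass: deterministic pseudo-logits
    # derived from the current (partially masked) sequence.
    rng = random.Random(hash((tuple(tokens), pos)))
    return [rng.random() for _ in VOCAB]

def generate_block(prefix, block_size=4, steps=4):
    # Append a block of mask tokens after the prefix, then iteratively
    # commit the most confident predictions a few positions per step --
    # the core loop a diffusion-style .generate() wrapper would run.
    tokens = list(prefix) + [MASK] * block_size
    per_step = max(1, block_size // steps)
    while MASK in tokens:
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        scored = []
        for pos in masked:
            scores = toy_scores(tokens, pos)
            best = max(range(len(VOCAB)), key=scores.__getitem__)
            scored.append((scores[best], pos, VOCAB[best]))
        scored.sort(reverse=True)  # most confident predictions first
        for _, pos, tok in scored[:per_step]:
            tokens[pos] = tok
    return tokens

print(generate_block(["the"]))
```

A real implementation would replace toy_scores with the model's logits and resample the remaining masked positions each step, but the control flow is the same, which is why it doesn't fit the left-to-right .generate() loop as-is.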
Great to hear! To make the environment setup easier for others, I'll update the instructions to install all dependencies through pip instead of relying on conda channels.
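For anyone who wants to skip conda in the meantime, a pip-only approximation of the spec above might look like the following requirements fragment. This is an assumption on my part, not the official file: the pins are copied from the YAML, cuda-nvcc is dropped because pip torch wheels bundle their own CUDA runtime, and the conda pytorch=2.6 pin is mapped to torch==2.6.0.

```text
torch==2.6.0
triton==2.2.0
datasets==2.18.0
einops==0.7.0
fsspec==2024.2.0
h5py==3.10.0
hydra-core==1.3.2
lightning==2.2.1
omegaconf==2.3.0
pandas==2.2.1
timm==0.9.16
transformers==4.38.2
wandb==0.13.5
```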
I think it's a great idea to support .generate() in our Block Diffusion models; depending on my bandwidth over the next couple of months, my plan is to implement this and release a public notebook with example usage. I'll update you if and when that happens. We will definitely support this if we end up releasing bigger models in the future.