What does your "AI code" look like for the MoE merging?

#1
by liuliu87 - opened

Just curious, and if you share it, we might be able to figure out some improvements.

I have uploaded the code I used. Among it, average_moe_lora.py does the weighted averaging, while weighted_average_moe_lora.py allows for more flexible weight settings. However, I only used the latter to generate four LoRAs, each of which sets the weight of one expert layer to 1 and of all other expert layers to 0. Unfortunately, those four LoRAs performed quite poorly. I haven't tested any other weight combinations.
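For readers who haven't opened the uploaded scripts yet, here is a minimal sketch of the per-expert weighted-averaging idea described above. It assumes the LoRA is stored as a safetensors state dict whose expert tensors are distinguished by an "experts.{i}." segment in their keys; the key pattern, file names, and function name here are all assumptions for illustration, not necessarily what the uploaded scripts use.

```python
# Hypothetical sketch of per-expert weighted averaging for an MoE LoRA.
# Assumption: expert tensors carry an "experts.{i}." segment in their keys.
import re
import torch
from safetensors.torch import load_file, save_file

EXPERT_RE = re.compile(r"experts\.(\d+)\.")

def average_experts(in_path, expert_weights, out_path):
    """Collapse per-expert LoRA tensors into one tensor per layer,
    weighting each expert by expert_weights (normalized to sum to 1)."""
    state = load_file(in_path)
    total = sum(expert_weights)
    merged, out = {}, {}
    for key, tensor in state.items():
        m = EXPERT_RE.search(key)
        if m is None:
            out[key] = tensor  # non-expert tensors are copied through as-is
            continue
        idx = int(m.group(1))
        shared_key = EXPERT_RE.sub("experts.", key)  # drop the expert index
        contrib = tensor.float() * (expert_weights[idx] / total)
        merged[shared_key] = merged.get(shared_key, torch.zeros_like(contrib)) + contrib
    out.update(merged)
    save_file(out, out_path)

# The "one expert only" setting described above: expert 0 gets weight 1,
# all other experts get weight 0 (file names are hypothetical).
average_experts("moe_lora.safetensors", [1.0, 0.0, 0.0, 0.0], "expert0_lora.safetensors")
```

Under this reading, the plain average in average_moe_lora.py corresponds to uniform weights, and the four poorly performing LoRAs correspond to the one-hot weight vectors shown in the example call.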