What does your "AI code" look like for the MoE merging?

#1
by liuliu87 - opened

Just curious, and if you share it, we might be able to figure out some improvements.

I have uploaded the code I used. Among it, average_moe_lora.py does the weighted averaging, while weighted_average_moe_lora.py allows for more flexible weight settings. However, I only used the latter to generate four LoRAs, each of which sets the weight of one expert layer to 1 and of all other expert layers to 0. Unfortunately, those four LoRAs performed quite poorly. I haven't tested any other weight combinations.
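For readers who haven't opened the uploaded scripts yet, here is a minimal sketch of the per-expert weighted-averaging idea described above. It assumes the LoRA is stored as a safetensors state dict whose expert tensors are distinguished by an "experts.{i}." segment in their keys; the key pattern, file names, and function name here are all assumptions for illustration, not necessarily what the uploaded scripts use.

```python
# Hypothetical sketch of per-expert weighted averaging for an MoE LoRA.
# Assumption: expert tensors carry an "experts.{i}." segment in their keys.
import re
import torch
from safetensors.torch import load_file, save_file

EXPERT_RE = re.compile(r"experts\.(\d+)\.")

def average_experts(in_path, expert_weights, out_path):
    """Collapse per-expert LoRA tensors into one tensor per layer,
    weighting each expert by expert_weights (normalized to sum to 1)."""
    state = load_file(in_path)
    total = sum(expert_weights)
    merged, out = {}, {}
    for key, tensor in state.items():
        m = EXPERT_RE.search(key)
        if m is None:
            out[key] = tensor  # non-expert tensors are copied through as-is
            continue
        idx = int(m.group(1))
        shared_key = EXPERT_RE.sub("experts.", key)  # drop the expert index
        contrib = tensor.float() * (expert_weights[idx] / total)
        merged[shared_key] = merged.get(shared_key, torch.zeros_like(contrib)) + contrib
    out.update(merged)
    save_file(out, out_path)

# The "one expert only" setting described above: expert 0 gets weight 1,
# all other experts get weight 0 (file names are hypothetical).
average_experts("moe_lora.safetensors", [1.0, 0.0, 0.0, 0.0], "expert0_lora.safetensors")
```

Under this reading, the plain average in average_moe_lora.py corresponds to uniform weights, and the four poorly performing LoRAs correspond to the one-hot weight vectors shown in the example call.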