What does your "AI code" look like for the MoE merging?
#1 by liuliu87 - opened
Just curious, and if you share it, we might be able to figure out some improvements.
I have uploaded the code I used. Of the scripts, average_moe_lora.py performs the weighted averaging, while weighted_average_moe_lora.py allows more flexible weight settings. However, I only used the latter to generate four LoRAs, each setting the weight of one expert layer to 1 and all other layers to 0. Unfortunately, the performance of all four of these LoRAs was quite poor, and I haven't tested any other weight combinations.
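For reference, here is a minimal sketch of the kind of weighted averaging described above. The function name, file layout, and the flat per-expert weighting scheme are assumptions on my part; the actual scripts may key and group the expert tensors differently.

```python
# Minimal sketch of weighted-averaging several expert LoRA checkpoints.
# Assumptions: each expert LoRA is a .safetensors file whose tensors share
# the same keys, and `weights` assigns one scalar per expert (e.g. setting
# one expert to 1.0 and the rest to 0.0 reproduces the single-expert case
# described above).
import torch
from safetensors.torch import load_file, save_file

def weighted_average_loras(paths: list[str], weights: list[float], out_path: str) -> None:
    assert paths and len(paths) == len(weights), "need one weight per LoRA"
    total = sum(weights)
    merged: dict[str, torch.Tensor] = {}
    for path, w in zip(paths, weights):
        state = load_file(path)
        for key, tensor in state.items():
            # Accumulate each expert's contribution, normalized by the weight sum.
            contrib = tensor.to(torch.float32) * (w / total)
            merged[key] = merged.get(key, torch.zeros_like(contrib)) + contrib
    save_file(merged, out_path)

if __name__ == "__main__":
    # Hypothetical expert checkpoints; equal weights give a plain average.
    experts = [f"expert_{i}.safetensors" for i in range(4)]
    weighted_average_loras(experts, [0.25, 0.25, 0.25, 0.25], "merged_lora.safetensors")
```

One caveat with this style of merge: it averages the LoRA A and B factors directly, and the average of low-rank products is not the same as the product of averaged factors, which may partly explain why some weight combinations underperform.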