Text Generation
Transformers
Safetensors
mistral
text-generation-inference
unsloth
Mistral_Star
Mistral_Quiet
Mistral
Mixtral
Question-Answer
Token-Classification
Sequence-Classification
SpydazWeb-AI
chemistry
biology
legal
code
climate
medical
LCARS_AI_StarTrek_Computer
chain-of-thought
tree-of-knowledge
forest-of-thoughts
visual-spacial-sketchpad
alpha-mind
knowledge-graph
entity-detection
encyclopedia
wikipedia
stack-exchange
Reddit
Cyber-series
MegaMind
Cybertron
SpydazWeb
Spydaz
LCARS
star-trek
mega-transformers
Mulit-Mega-Merge
Multi-Lingual
Afro-Centric
African-Model
Ancient-One
conversational
Update README.md
README.md
CHANGED
@@ -290,6 +290,44 @@ Ensure the input file exists, and specify the correct output path during decodin
This design is flexible and reusable for various file types, making it a robust solution for encoding and decoding files into Base64.
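The encode/decode helpers referred to here appear earlier in the README; as a rough sketch of the pattern being described (a minimal illustration with hypothetical names, not the README's own functions):

```python
import base64

def encode_file_to_base64(input_path: str, output_path: str) -> None:
    # Read any file as raw bytes and write its Base64 text form
    with open(input_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    with open(output_path, "w") as f:
        f.write(encoded)

def decode_base64_to_file(input_path: str, output_path: str) -> None:
    # Reverse step: read Base64 text and restore the original bytes
    with open(input_path, "r") as f:
        data = base64.b64decode(f.read())
    with open(output_path, "wb") as f:
        f.write(data)
```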
# Converting Datasets:

```python
import base64
import io

from datasets import load_dataset

# Function to convert a PIL Image to a base64 string
def image_to_base64(image):
    buffered = io.BytesIO()
    image.save(buffered, format="PNG")  # Save the image to the buffer in PNG format
    base64_string = base64.b64encode(buffered.getvalue()).decode('utf-8')
    return base64_string

# Define a function to process each batch of examples in the dataset
def process_images_func(examples):
    texts = examples["text"]
    images = examples["image"]  # Assuming the images are in PIL format

    # Convert each image to base64
    base64_images = [image_to_base64(image) for image in images]

    # Return the updated examples with base64-encoded images
    return {
        "text": texts,
        "image_base64": base64_images,  # Add the Base64-encoded image strings
    }

# Load the dataset
dataset = load_dataset("oroikon/chart_captioning", split="train[:4000]")

# Process the dataset by converting images to base64
processed_dataset = dataset.map(process_images_func, batched=True)
```
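A quick way to verify the conversion (a hypothetical round-trip check, not part of the original snippet) is to decode one processed record back into a PIL image:

```python
import base64
from io import BytesIO
from PIL import Image

# Decode the first processed record back into an image to confirm the round trip
record = processed_dataset[0]
img = Image.open(BytesIO(base64.b64decode(record["image_base64"])))
print(record["text"][:80], img.size)
```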
# Prompt Engineering for Training:

Early training involved embedding large, detailed prompts to improve the model’s depth of response and adaptability.
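As an illustration of that approach, a hypothetical template (not the exact prompt used) that embeds a detailed system prompt into each training sample might look like:

```python
# Hypothetical system prompt; the phrasing echoes guidance quoted elsewhere in this README
SYSTEM_PROMPT = (
    "You are an expert assistant. Think step by step, explain your reasoning, "
    "and keep the conversation going by ending with a question that probes "
    "the user's thoughts and opinions."
)

def build_training_sample(instruction: str, response: str) -> str:
    # Embed the detailed system prompt into every training example
    return (
        f"### System:\n{SYSTEM_PROMPT}\n"
        f"### Instruction:\n{instruction}\n"
        f"### Response:\n{response}"
    )
```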
@@ -522,3 +560,52 @@ Keep the conversation going by always ending with a question to further probe th

# ADDING EXTRA HEADS:

## ADD HEAD

# SPEECH-ENCODER-DECODER-MODEL
```python
from transformers import AutoFeatureExtractor, AutoTokenizer, SpeechEncoderDecoderModel

print('Add Audio...')
# Add Head
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
_AudioFeatureExtractor = AutoFeatureExtractor.from_pretrained("openai/whisper-small")
_AudioTokenizer = AutoTokenizer.from_pretrained("openai/whisper-small")
_SpeechEncoderDecoder = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained(
    "openai/whisper-small", "openai/whisper-small"
)

# Add pad tokens
_SpeechEncoderDecoder.config.decoder_start_token_id = _AudioTokenizer.cls_token_id
_SpeechEncoderDecoder.config.pad_token_id = _AudioTokenizer.pad_token_id

# Attach the new head and its sub-components to the base model
# (LM_MODEL is the base language model loaded earlier in this README)
LM_MODEL.SpeechEncoderDecoder = _SpeechEncoderDecoder
LM_MODEL.Decoder_AudioTokenizer = _AudioTokenizer
LM_MODEL.Encoder_AudioFeatureExtractor = _AudioFeatureExtractor
LM_MODEL
```
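One caveat worth noting (an observation added here, not in the original snippet): Whisper's tokenizer may not define a CLS token, in which case `cls_token_id` is `None`; a defensive fallback could be:

```python
# Hypothetical safeguard: fall back to the BOS token when no CLS token exists
if _SpeechEncoderDecoder.config.decoder_start_token_id is None:
    _SpeechEncoderDecoder.config.decoder_start_token_id = _AudioTokenizer.bos_token_id
```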
# ADD HEAD
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
```python
from transformers import VisionEncoderDecoderModel

Vmodel = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k", "LeroyDyer/Mixtral_AI_Tiny"
)
_Encoder_ImageProcessor = Vmodel.encoder
_Decoder_ImageTokenizer = Vmodel.decoder
_VisionEncoderDecoderModel = Vmodel

# Attach the new head and its sub-components to the base model
LM_MODEL.VisionEncoderDecoder = _VisionEncoderDecoderModel
LM_MODEL.Encoder_ImageProcessor = _Encoder_ImageProcessor
LM_MODEL.Decoder_ImageTokenizer = _Decoder_ImageTokenizer
LM_MODEL
```
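A hypothetical smoke test for the vision head (not from the original README): generation needs pixel values from an actual image processor such as `ViTImageProcessor` (distinct from the encoder module stored above), plus a configured decoder start token:

```python
import torch
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor

# Assumed setup: an image processor for the ViT encoder, a tokenizer for the decoder
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
_tok = AutoTokenizer.from_pretrained("LeroyDyer/Mixtral_AI_Tiny")
Vmodel.config.decoder_start_token_id = _tok.bos_token_id
Vmodel.config.pad_token_id = _tok.pad_token_id or _tok.eos_token_id

image = Image.new("RGB", (224, 224))  # placeholder image; replace with real data
pixel_values = processor(images=image, return_tensors="pt").pixel_values
with torch.no_grad():
    caption_ids = Vmodel.generate(pixel_values, max_new_tokens=16)
print(_tok.batch_decode(caption_ids, skip_special_tokens=True))
```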