LeroyDyer committed on
Commit 24f4af9 · verified · 1 Parent(s): 64faffa

Update README.md

Files changed (1)
  1. README.md +87 -0
README.md CHANGED
@@ -290,6 +290,44 @@ Ensure the input file exists, and specify the correct output path during decodin
  This design is flexible and reusable for various file types, making it a robust solution for encoding and decoding files into Base64.
 
 
+ # Converting DataSets:
+
+ ```python
+ import io
+ import base64
+ from datasets import load_dataset
+
+ # Function to convert a PIL Image to a base64 string
+ def image_to_base64(image):
+     buffered = io.BytesIO()
+     image.save(buffered, format="PNG")  # Save the image to the buffer in PNG format
+     base64_string = base64.b64encode(buffered.getvalue()).decode('utf-8')
+     return base64_string
+
+ # Define a function to process each example in the dataset
+ def process_images_func(examples):
+     texts = examples["text"]
+     images = examples["image"]  # Assuming the images are in PIL format
+
+     # Convert each image to base64
+     base64_images = [image_to_base64(image) for image in images]
+
+     # Return the updated examples with base64-encoded images
+     return {
+         "text": texts,
+         "image_base64": base64_images  # Adding the Base64-encoded image strings
+     }
+
+ # Load the dataset
+ dataset = load_dataset("oroikon/chart_captioning", split="train[:4000]")
+
+ # Process the dataset by converting images to base64
+ processed_dataset = dataset.map(process_images_func, batched=True)
+ ```
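+
+ A quick round-trip check can confirm the Base64 encoding is lossless. This is a minimal sketch, not part of the original commit: it assumes the `processed_dataset` built above and reuses the `io`/`base64` imports from the snippet; `base64_to_image` is a hypothetical helper added here only for illustration.
+
+ ```python
+ from PIL import Image
+
+ # Hypothetical helper (illustration only): the inverse of image_to_base64
+ def base64_to_image(base64_string):
+     image_bytes = base64.b64decode(base64_string)
+     return Image.open(io.BytesIO(image_bytes))
+
+ # Rebuild the first processed image and compare its size with the original
+ restored = base64_to_image(processed_dataset[0]["image_base64"])
+ print(restored.size, dataset[0]["image"].size)
+ ```
+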
  # Prompt Engineering for Training:
 
  Early training involved embedding large, detailed prompts to improve the model’s depth of response and adaptability.
 
@@ -522,3 +560,52 @@ Keep the conversation going by always ending with a question to further probe th
 
 
 
+ # ADDING EXTRA HEADS:
+
+ ## ADD HEAD
+
+ ## SPEECH-ENCODER-DECODER-MODEL
+
+ ```python
+ from transformers import AutoFeatureExtractor, AutoTokenizer, SpeechEncoderDecoderModel
+
+ print('Add Audio...')
+ # Add Head
+ # Combine a pre-trained encoder and a pre-trained decoder to form a Seq2Seq model
+ _AudioFeatureExtractor = AutoFeatureExtractor.from_pretrained("openai/whisper-small")
+ _AudioTokenizer = AutoTokenizer.from_pretrained("openai/whisper-small")
+ _SpeechEncoderDecoder = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained("openai/whisper-small", "openai/whisper-small")
+
+ # Add pad tokens
+ _SpeechEncoderDecoder.config.decoder_start_token_id = _AudioTokenizer.cls_token_id
+ _SpeechEncoderDecoder.config.pad_token_id = _AudioTokenizer.pad_token_id
+ # LM_MODEL is assumed to be the previously loaded base model being extended here
+ LM_MODEL.SpeechEncoderDecoder = _SpeechEncoderDecoder
+ # Add sub-components
+ LM_MODEL.Decoder_AudioTokenizer = _AudioTokenizer
+ LM_MODEL.Encoder_AudioFeatureExtractor = _AudioFeatureExtractor
+ # Inspect the extended model
+ LM_MODEL
+ ```
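+
+ As a sanity check, the attached feature extractor can be run on a dummy waveform. This is a minimal sketch and an assumption, not part of the original commit; it only shows the input format the new speech head expects, using the `LM_MODEL` attributes set above.
+
+ ```python
+ import numpy as np
+
+ # One second of silence at 16 kHz (the sampling rate Whisper expects)
+ dummy_audio = np.zeros(16000, dtype=np.float32)
+ features = LM_MODEL.Encoder_AudioFeatureExtractor(dummy_audio, sampling_rate=16000, return_tensors="pt")
+ print(features.input_features.shape)  # log-mel features, e.g. (1, 80, 3000) for whisper-small
+ ```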
+
+
+ ## ADD HEAD
+ Combine a pre-trained vision encoder and a pre-trained decoder to form a Seq2Seq model.
+
+ ```python
+ from transformers import VisionEncoderDecoderModel
+
+ Vmodel = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
+     "google/vit-base-patch16-224-in21k", "LeroyDyer/Mixtral_AI_Tiny"
+ )
+ _Encoder_ImageProcessor = Vmodel.encoder
+ _Decoder_ImageTokenizer = Vmodel.decoder
+ _VisionEncoderDecoderModel = Vmodel
+ # Attach the vision head
+ LM_MODEL.VisionEncoderDecoder = _VisionEncoderDecoderModel
+ # Add sub-components
+ LM_MODEL.Encoder_ImageProcessor = _Encoder_ImageProcessor
+ LM_MODEL.Decoder_ImageTokenizer = _Decoder_ImageTokenizer
+ # Inspect the extended model
+ LM_MODEL
+ ```
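+
+ A small smoke test can confirm the vision encoder is wired in. This is a hedged sketch, not from the original commit: it assumes the `LM_MODEL` attributes set above and simply runs the ViT encoder on a blank image.
+
+ ```python
+ import torch
+
+ # A single blank 224x224 RGB image, the input size ViT-base/16 expects
+ dummy_pixels = torch.zeros(1, 3, 224, 224)
+ with torch.no_grad():
+     encoder_out = LM_MODEL.VisionEncoderDecoder.encoder(pixel_values=dummy_pixels)
+ print(encoder_out.last_hidden_state.shape)  # (1, 197, 768): 196 patches + [CLS], hidden size 768
+ ```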