---
title: Neochar
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.25.0
app_file: app.py
pinned: false
license: openrail
short_description: Unwritten Chinese Characters in Style
---

# What is this?

Generate new characters by combining parts in creative ways, and write them in a controlled style.

- Inspired by
  - Lin Yutang's [Ming-Kwai typewriter](https://en.wikipedia.org/wiki/Chinese_typewriter#MingKwai_design)
  - Wu Yue's [Glyffuser](https://yue-here.com/posts/glyffuser/)

# Why

- Fun to generate valid but unseen characters (found in no dictionary and not in Unicode).
- Implements Lin Yutang's ideas with generative AI/ML, without the mechanical marvel :-/ or limitations :-)
- Extends a font to support new charsets, and beyond to non-existent chars.
- Adds variation/diversity/personality to generated images. No boring duplicates from the same char.
- Other [Creative Uses](#creative-uses)

# How to use this app

- Combine components or radicals in the following way
  - Specify the 'Structure' and 'Components', in a [Polish Notation](https://en.wikipedia.org/wiki/Polish_notation) fashion
    - Good for tree structures
    - ⿰: 'LR' Left-Right
    - ⿱: 'TB' Top-Bottom
    - ⿸: 'TL' Top-Left
    - ⿹: 'TR' Top-Right
    - ⿺: 'BL' Bottom-Left
    - ⿴: 'OI' Outer-Inner
    - ⿻: 'OV' Overlap
    - ⿲: 'LMR' Left-Middle-Right
    - ⿳: 'TMB' Top-Middle-Bottom
    - ⿵: 'BT' Bottom Open Enclosure
    - ⿶: 'CT' Top Open Enclosure
    - ⿷: 'RT' Right Open Enclosure
- Select a 'Style' by clicking the sample images
- Hit the 'Generate' button
- Repeat

# Usage Tips

- Simple structures work best (⿰ ⿱ ⿴ etc.)
- "Known radicals at seen positions" work best (釒 on the left works better than on the right, but may also surprise you in a good way)
- Noto font family (sans and serif) gives the best results, as there are many training examples
- Cursive and handwritten styles usually give good results, as they are more tolerant
- Fonts supporting fewer chars are challenging
- Current model was trained with 300k samples for only 20 epochs
  - Training will continue if this app gets attention or likes
- For dictionary chars, [decompose](https://github.com/cburgmer/cjklib/blob/master/cjklib/data/characterdecomposition.csv) first.
- For a part that is hard to describe, or that you don't care about, use a wildcard '?' (full-width question mark, or does it matter?)
- What to do when the results are not as expected
  - Pick a different 'style', one the model may have learned better
  - Try again with a different random seed. This will change the overall structure in an unpredictable way
  - Try again with a different 'step' number. This will change the local details in a continuous way

# Creative Uses

## Turning a bug into a feature

When you see a funny result you didn't expect (5 or 3 dots where there should be 4), don't throw it away immediately.

- Save the results to confuse/train OCR
- 3vade 3vil c3nsorship
- Share in discussion. The input text/seed/step will reliably reproduce the result.

# Future Features

- Typewriter keyboard for hard-to-input radicals, filtered by pinyin prefix
- Direct generation from a single char, auto decomposition
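
# Example: reading the prefix notation

The Polish-notation input described in "How to use this app" can be read as a tree: each structure symbol is followed by its parts, and a part may itself start with another structure symbol. The sketch below is only an illustration of that idea, not the app's actual parser. The names `ARITY`, `parse` and `_parse_from` are hypothetical, and it assumes the structure and components are written together in one string, with ⿲ and ⿳ taking three parts and every other symbol taking two.

```python
# Hypothetical helper, not part of the app: turn a prefix-notation
# description such as "⿰釒⿱日月" into a nested tree.

# Assumed number of parts each structure symbol takes.
ARITY = {
    "⿰": 2, "⿱": 2, "⿸": 2, "⿹": 2, "⿺": 2, "⿴": 2, "⿻": 2,
    "⿵": 2, "⿶": 2, "⿷": 2,
    "⿲": 3, "⿳": 3,
}


def parse(description):
    """Return a tree: a leaf is a str, a node is (symbol, [children])."""
    tree, end = _parse_from(description, 0)
    if end != len(description):
        raise ValueError("unused trailing parts: " + description[end:])
    return tree


def _parse_from(text, pos):
    """Parse one part starting at index pos; return (tree, next index)."""
    if pos >= len(text):
        raise ValueError("description ended before all parts were given")
    symbol = text[pos]
    pos += 1
    if symbol not in ARITY:
        # Any other character (a component, or a wildcard) is a leaf.
        return symbol, pos
    children = []
    for _ in range(ARITY[symbol]):
        child, pos = _parse_from(text, pos)
        children.append(child)
    return (symbol, children), pos


# 釒 on the left, and on the right 日 stacked over 月.
print(parse("⿰釒⿱日月"))
```

Because every symbol has a fixed number of parts, no brackets are needed to describe nested structures, which is why this notation suits tree-shaped characters.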