I took all of the images myself with my Poco X6 phone camera
My dataset is far from ready, so it contains many repeated and nearly identical images, but this was rather experimental
Hopefully I will keep taking more shots, improve the dataset, and reduce its size in the future
I trained the CLIP-L and T5-XXL text encoders as well
Since there was so much pushback from the community claiming that my workflow wouldn't work with expressions, I had to take a break from research and use whatever I had
I used my own researched workflow for training with Kohya GUI, along with my own self-developed SUPIR app for batch upscaling with face upscaling and automatic LLaVA captioning improvement
Download the images to see them in full size; the last grid provided is downscaled by 50%
Workflow
Gather a dataset that has the expressions and perspectives you want to generate after training. This is crucial: whatever you add, the model can generate perfectly
Follow one of the LoRA training tutorials / guides
After training your LoRA, use your favorite UI to generate images
I prefer SwarmUI. Here are the prompts I used, including face inpainting (you can add specific expressions to the prompts):
Finally tried Kotaemon, an open-source RAG tool for document chat!
With local models, it's free and private. Perfect for journalists and researchers.
I put Kotaemon to the test with EPA's Greenhouse Gas Inventory. It accurately answered questions on the CO2 percentage of 2022 emissions and compared 2022 vs 2021 data.
🛠️ Kotaemon's no-code interface makes it user-friendly.
- Use your own models or APIs from OpenAI or Cohere
- Great documentation & easy installation
- Multimodal capabilities + reranking
- View sources, navigate docs & create graphRAG
🌟 Kotaemon is gaining traction with 11.3k GitHub stars
Distilabel and synthetic data community interviews - the outcomes
We've been doing some interviews with community members to understand the needs surrounding synthetic data. Many thanks to the participants. Note that the interviewees were sourced from our community, so the results will likely reflect that.
Things distilabel does well
- security and reliability by caching generations and having serializable pipelines
- scaling up generation by parallelising inference and using Anyscale Ray
- solid implementations of state-of-the-art research papers
Things to improve
- communication about the fact that we support structured generation
- customization of existing prompt implementations is difficult
- creation of new tasks proves difficult
- arguments and parameters for tasks aren't available at first glance
- the learning curve can be steep
- more tutorials that represent real-life usage
Things to note
- people create both small-scale and large-scale datasets, up to millions of records
- people use synthetic data to move away from frontier model providers
- people mostly use 7B or 70B models for generating
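As a rough illustration of the "caching generations" point above, here is a minimal Python sketch of an on-disk generation cache. It is not distilabel's API: `cache_key`, `cached_generate`, the `generate` callable, and the one-JSON-file-per-generation layout are all hypothetical choices made up for this example.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(model: str, prompt: str, params: dict) -> str:
    """Build a stable key from everything that affects the output."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "params": params}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_generate(
    generate: Callable[[str], str],  # hypothetical stand-in for an LLM call
    model: str,
    prompt: str,
    params: dict,
    cache_dir: Path,
) -> str:
    """Return a stored generation if one exists; otherwise generate and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(model, prompt, params)}.json"
    if path.exists():
        # Cache hit: skip the (expensive) model call entirely.
        return json.loads(path.read_text(encoding="utf-8"))["output"]
    output = generate(prompt)
    path.write_text(
        json.dumps({"prompt": prompt, "output": output}), encoding="utf-8"
    )
    return output
```

Because the key covers the model name, the prompt, and the sampling parameters, an interrupted run can be restarted without paying again for generations that already finished, while a change to any of those inputs triggers a fresh call.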
1. **Overview** "EveryText" is at the forefront of AI image generation, offering a novel "TBF ('Text by Font') Image Model" that enables the representation of all languages globally in AI-generated images without prior training.
2. **Background** Platforms like MidJourneyV6 and FLUX have advanced AI image generation, typically supporting English text. Alibaba Group expanded this to include Chinese, Japanese, and Korean, signaling a shift towards global language support.
3. **Challenges** Existing methods faced several challenges including the need for additional editing, dependency on specific training, and substantial resource requirements. These approaches also struggled with limited vocabulary and were primarily effective only for English.
4. **Innovative Solution** EveryText utilizes "Fonts" as pre-trained models, allowing any text to be visually represented without traditional training. This approach not only enhances diversity and aesthetics by utilizing various fonts but also ensures unlimited expression.
5. **Using the Service** EveryText is free and easy to use:
   - **Prompt**: Describe the image.
   - **Text for Image Generation**: Add your text.
   - **Text Position and Size**: Customize the text's placement and size.
   - **Font Selection**: Optionally select a font.
   - **Advanced Settings**: Further refine the image creation.
   - Click "START" to generate the image.
6. **Comparative Analysis** EveryText supports all languages with superior image quality and text legibility, setting it apart from platforms like MidJourneyV6 and FLUX, and from AnyText by Alibaba Group.
7. **Conclusion** EveryText has revolutionized AI-generated imagery by integrating all global languages, broadening the scope for creative and communicative applications. Its future potential is vast and promising.