> whoami

🖥️ I am a software development engineer.

🔭 I am currently working hard for a living.

🌱 I am currently learning how machines think.

📫 How to reach me: luo[at]jiahai.co

Read more…

My First Post

Introduction

This is bold text, and this is emphasized text. Visit the Hugo website! This post is pinned for nothing. Sorry for wasting your time. Please read other posts.

luojiahai · Jun 8, 2024

🤗 Hugging Face 🧨 Diffusers: Basics

🧨 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules.

DiffusionPipeline

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5', use_safetensors=True)
prompt = 'An image of a squirrel in Picasso style'
image = pipeline(prompt).images[0]
image.save('image_of_squirrel_painting.png')
```

Swapping schedulers

```python
from diffusers import EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5', use_safetensors=True)
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
```

Models

Load the UNet2DModel, a basic unconditional image generation model with a checkpoint trained on cat images:...
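The scheduler in the excerpt controls the step sizes of the iterative denoising loop at the heart of diffusion models. As a rough intuition only (this is not the Diffusers API), here is a toy sketch in plain Python where a stand-in "noise predictor" is applied repeatedly with a fixed step size:

```python
# Toy sketch of iterative denoising: NOT the Diffusers API.
# A stand-in "model" predicts the noise in x, and a fixed-step
# "scheduler" removes a fraction of that noise at each step.

def fake_noise_prediction(x, target=0.0):
    """Stand-in for a trained model: predicts the noise in x."""
    return x - target

def denoise(x, steps=50, step_size=0.1):
    """Repeatedly subtract a fraction of the predicted noise."""
    for _ in range(steps):
        x = x - step_size * fake_noise_prediction(x)
    return x

result = denoise(10.0)  # converges toward the clean target 0.0
```

A real scheduler such as EulerDiscreteScheduler varies the step sizes over a noise schedule rather than using a fixed fraction, which is exactly the behavior that swapping schedulers changes.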

luojiahai · Jul 18, 2024

🤗 Hugging Face Transformers: Fine-Tuning

Fine-tuning lets you get more out of the models by providing:

- Higher quality results than prompting
- Ability to train on more examples than can fit in a prompt
- Token savings due to shorter prompts
- Lower latency requests

In this post, we will fine-tune the pretrained model bert-base-uncased for sentence classification as an example.

Processing the data

We will use the MRPC (Microsoft Research Paraphrase Corpus) dataset, introduced in a paper. The dataset consists of 5,801 pairs of sentences, with a label indicating if they are paraphrases or not (i....
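To make the data-processing step concrete, here is a minimal, library-free sketch of the idea behind batching tokenized sentence pairs with dynamic padding. The token ids below are made up for illustration; the real post would use the 🤗 Datasets and Tokenizers APIs:

```python
# Toy sketch of dynamic padding for a batch of tokenized sequences.
# Token ids are invented; a real pipeline would get them from a tokenizer.

PAD_ID = 0

def pad_batch(sequences, pad_id=PAD_ID):
    """Pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # Attention mask: 1 for real tokens, 0 for padding.
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return padded, mask

batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
padded, mask = pad_batch(batch)
```

Padding per batch rather than to a global maximum keeps batches small, which is the same motivation behind dynamic padding in the Transformers data collators.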

luojiahai · Jun 28, 2024

📝 Natural Language Processing: Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model so that it references an authoritative knowledge base outside of its training data sources before generating a response. The following diagram shows the conceptual flow of using RAG with LLMs.

Components

Indexing

A pipeline for ingesting data from a source and indexing it. This usually happens offline.

- Load: First we need to load our data.
- Split: We need to break large documents into smaller chunks....
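The load/split/retrieve flow above can be sketched without any framework. Below is a minimal retriever over split chunks, using word overlap as a stand-in for the embedding similarity a real RAG system would use (that substitution is an assumption for illustration only):

```python
# Minimal sketch of the RAG indexing + retrieval idea:
# split a document into chunks, then retrieve the chunk that best
# matches the query by word overlap (a stand-in for embeddings).

def split_into_chunks(text, chunk_size=8):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [' '.join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def retrieve(query, chunks):
    """Return the chunk sharing the most words with the query."""
    q = set(query.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

doc = ("The capital of France is Paris. "
       "Python is a popular programming language. "
       "Diffusion models generate images from noise.")
chunks = split_into_chunks(doc)
best = retrieve("What is the capital of France?", chunks)
```

The retrieved chunk would then be prepended to the prompt so the model can ground its answer in it, which is the "augmented generation" half of RAG.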

luojiahai · Jun 27, 2024

🤗 Hugging Face: Model Inference

Inference is the process of using a pretrained model to generate outputs on new data. Hugging Face provides different ways to run inference on pretrained models.

Hugging Chat

Hugging Chat is an open-source interface enabling everyone to try open-source large language models.

Inference API

The Inference API is free to use and rate-limited. You can test and evaluate publicly accessible machine learning models, or your own private models, via simple HTTP requests, with fast inference hosted on Hugging Face shared infrastructure....
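As a hedged sketch of what such an HTTP request looks like, the snippet below constructs (but does not send) a POST request using only the standard library. The model name and token are placeholders, and the endpoint shape `https://api-inference.huggingface.co/models/<model>` with a Bearer token reflects the Inference API as documented at the time of writing:

```python
import json
import urllib.request

# Sketch only: builds (but does not send) an Inference API request.
MODEL = 'bert-base-uncased'
TOKEN = 'hf_xxx'  # placeholder; replace with your own access token

def build_request(model, token, payload):
    """Construct an HTTP POST request to the Inference API."""
    url = f'https://api-inference.huggingface.co/models/{model}'
    data = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        url,
        data=data,
        headers={'Authorization': f'Bearer {token}',
                 'Content-Type': 'application/json'},
        method='POST',
    )

req = build_request(MODEL, TOKEN, {'inputs': 'Paris is the [MASK] of France.'})
```

Sending the request with `urllib.request.urlopen(req)` (or the `requests` library) returns a JSON response; the free tier is rate-limited, so production use should go through a dedicated endpoint.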

luojiahai · Jun 20, 2024