The following are technical highlights of several models currently accessible through the PIN AI app and used for Personal AI applications within the PIN Network. While each model has its own design priorities, such as a minimal memory footprint, efficient quantization, or specialized domain training, all of them aim to deliver substantial language understanding and generation capabilities in computationally constrained environments.
As AI evolves, we will integrate the latest models to enhance the performance of your Personal AI.
Parameter count & architecture: A scaled-down variant of LLaMA with 500M–1B parameters, retaining transformer architecture with factorized attention.
Quantization & memory footprint: Packaged in 4-bit or 8-bit quantized versions, allowing operation on consumer-grade GPUs and high-end CPUs with limited VRAM/RAM.
Training data & domain focus: Trained on curated text including emails, short-form social media, and chat logs, enabling efficient chat-based applications.
Use case scenarios: Ideal for interactive tasks such as text completion, summarization, and personal reminders, where latency and memory are critical; a minimal loading and generation sketch is also shown below.
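As a rough illustration of the figures above, the following sketch estimates the parameter count of a hypothetical scaled-down, LLaMA-style configuration and its weight memory at different quantization widths. All configuration values (vocabulary size, hidden size, layer count) are illustrative assumptions rather than the specification of any model shipped in the PIN AI app.

```python
# Back-of-the-envelope sizing for a hypothetical scaled-down, LLaMA-style model.
# All configuration numbers below are illustrative assumptions, not the spec of
# any model actually distributed through the PIN AI app.

def transformer_param_count(vocab_size: int, hidden: int, layers: int, ffn_mult: int = 4) -> int:
    """Approximate parameter count of a decoder-only transformer.

    Counts token embeddings, attention projections (Q, K, V, O), and the
    feed-forward block per layer; norms and biases are negligible at this scale.
    """
    embed = vocab_size * hidden                 # token embedding table
    attn = 4 * hidden * hidden                  # Q, K, V, O projections
    ffn = 2 * hidden * (ffn_mult * hidden)      # up- and down-projection
    return embed + layers * (attn + ffn)

def weight_memory_gib(params: int, bits: int) -> float:
    """Approximate weight memory in GiB at a given quantization width."""
    return params * bits / 8 / 2**30

# Hypothetical configuration in the 500M-1B parameter range.
params = transformer_param_count(vocab_size=32_000, hidden=1536, layers=24)
print(f"~{params / 1e9:.2f}B parameters")
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gib(params, bits):.2f} GiB")
```

Under these assumptions the weights fit in well under 1 GiB at 4-bit precision, which is what makes operation on consumer-grade GPUs and CPUs plausible.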
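A second sketch shows how such a quantized checkpoint could be loaded and used for a short summarization request, assuming it is distributed in a Hugging Face transformers-compatible format with bitsandbytes quantization support. The model identifier is a placeholder, not the name of an actual PIN Network model, and the exact loading path depends on how each model is packaged.

```python
# Minimal sketch of running a 4-bit quantized chat model for a quick
# summarization task. The model id below is a placeholder; substitute the
# checkpoint actually distributed for the Personal AI model in question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "example-org/llama-mini-800m"  # placeholder identifier

# 4-bit weight quantization via bitsandbytes keeps the footprint small for a
# sub-1B model, at some cost in accuracy.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # place weights on the available device(s) automatically
)

prompt = (
    "Summarize in one sentence: I landed at 9pm, the meeting moved to Friday, "
    "and the invoice is overdue."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep max_new_tokens small: short, greedy generations are what keep latency
# low for interactive tasks on constrained hardware.
output = model.generate(**inputs, max_new_tokens=48, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Sampling settings and output length can be tuned per application; the key design choice for interactive use is bounding the number of generated tokens so response time stays predictable.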