Artificial Intelligence (AI)
Systems that perform tasks requiring human-like perception, reasoning, or language.
Tech Insights
Essential notes on AGI, ASI, NVIDIA, Cerebras, Intel, Etched, Apple MLX, deep learning, and the vocabulary shaping enterprise AI strategy.
Artificial General Intelligence—models with broad, human-level capability across domains.
Artificial Superintelligence—hypothetical AI surpassing human performance in virtually all areas.
Neural networks with many layers that learn hierarchical representations from data.
Algorithms that improve through experience without explicit rule programming.
Large Transformer-based models trained on vast text corpora for language generation and understanding.
Attention-based architecture powering modern LLMs and multimodal systems.
Interconnected layers of nodes that approximate complex functions via training.
Retrieval-Augmented Generation—grounding LLM answers in external knowledge bases.
Reinforcement Learning from Human Feedback—aligning models with human preferences.
Adapting a pre-trained model to a specific task or domain with additional data.
Running a trained model to produce predictions or generated content.
Optimizing model weights on datasets—compute-intensive pre-deployment phase.
Models processing text, images, audio, and video in unified architectures.
Dense vector representations capturing semantic meaning for search and clustering.
Dominant GPU supplier for AI—CUDA ecosystem, H100, and Blackwell data-center chips.
NVIDIA's parallel computing platform—the de facto standard for GPU ML workloads.
NVIDIA Hopper GPU—workhorse for large-scale LLM training and inference clusters.
NVIDIA's next-gen architecture targeting trillion-parameter-class AI infrastructure.
Wafer-scale AI chips (WSE)—dinner-plate-sized processors for massive models.
Cerebras Wafer Scale Engine: a single wafer-scale die with hundreds of thousands of cores for AI.
CPU and accelerator vendor—Gaudi AI chips and data-center Xeon platforms.
Intel's AI accelerator line designed for training and inference efficiency.
AI chip startup building transformer-specific ASICs (Sohu) for inference at scale.
Application-Specific Integrated Circuit—custom silicon optimized for one workload.
Graphics Processing Unit—parallel processors essential for deep learning throughput.
Google's Tensor Processing Unit—ASIC family built for TensorFlow and JAX workloads.
Apple's array framework for efficient ML on Apple Silicon—NumPy-like API on M-series chips.
Meta's dynamic deep-learning framework widely used in research and production.
Google's end-to-end ML platform for training, deployment, and edge inference.
Google's composable transformations for high-performance numerical computing and ML.
Open Neural Network Exchange—interoperable format for moving models across runtimes.
High-throughput LLM inference engine with PagedAttention for serving at scale.
Orchestration framework for chaining LLMs, tools, and retrieval pipelines.
Hub and libraries for open models, datasets, and the Transformers ecosystem.
Frontier lab behind GPT models—APIs, ChatGPT, and enterprise AI partnerships.
AI safety–focused lab—Claude models and constitutional AI research.
OpenAI's conversational product—GPT-4 class models via chat interface.
Google DeepMind multimodal LLM family—text, code, image, and video.
Alibaba's open-weight LLM series—strong multilingual and coding performance.
Chinese AI lab—efficient MoE models and competitive reasoning benchmarks.
Large pre-trained model adaptable via fine-tuning to many downstream tasks.
Crafting inputs to steer LLM behavior, quality, and reliability.
Research and practices to reduce harm, misuse, and misalignment in AI systems.
Ensuring AI systems pursue intended goals and human values.
When models generate plausible but false or unsupported information.
Subword unit of text; LLMs process input as tokens, and providers bill usage by token count.
Maximum tokens a model can consider in a single request or conversation.
Reducing the numerical precision of model weights (e.g. INT8/FP4) to cut memory use and speed up inference.
Sparse architectures activating subsets of parameters per input for scale.
AI-generated training data to augment or replace scarce real-world datasets.
Running models on devices (phones, IoT) rather than only in a centralized cloud.
Operational practices for deploying, monitoring, and versioning ML in production.
EU regulatory framework governing high-risk and general-purpose AI systems.
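As a small illustration of the quantization entry in this list (reducing precision to INT8), here is a minimal sketch of symmetric per-tensor quantization in plain Python. The function names `quantize_int8` and `dequantize_int8` are hypothetical and chosen for this example; production toolchains typically use per-channel scales, calibration data, and hardware-specific kernels, none of which are shown here.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (assumed scheme)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor so we never divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [v * scale for v in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Each restored value differs from the original by at most about half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as one signed byte instead of a 32-bit float, which is roughly where the memory saving comes from; the price is a rounding error bounded by about half the scale per value.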