### DAY 1. Introduction to AI, ML, and LLMs
#### Machine Learning Flow
1. Training dataset
2. Algorithm/program execution
3. Pattern/rule extraction
Tasks:
- Object detection
- Value prediction
- Language tasks
- Face recognition
#### Language Models
- Next-token prediction
- Probability over continuations
- Token-by-token generation
#### Core Concepts
- Turing Test
- Maximum-likelihood estimation
- Count-and-divide baseline
- Markov assumption
- n-gram models (unigram, bigram, trigram, ...)
- Neural language model
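The count-and-divide baseline above can be sketched directly: under the Markov assumption, the maximum-likelihood bigram probability is just a ratio of counts. The corpus here is a made-up toy example.

```python
from collections import Counter

# Count-and-divide bigram baseline (maximum-likelihood estimation):
# P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1})
corpus = "the cat sat on the mat the cat ate".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev, word):
    """MLE estimate of P(word | prev); 0.0 if prev was never seen."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

print(bigram_prob("the", "cat"))  # "the" is followed by "cat" 2 of 3 times
```

Next-token prediction with such a model amounts to sampling from this conditional distribution token by token.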
---
### DAY 2. Basics of Machine Learning
#### Learning Types
- Supervised learning
- Unsupervised learning
- Reinforcement learning
#### Data and Features
- Data matrix
- Feature vector
- Histogram features
- Text features
- Image features
#### Text Representation
- Term-document matrix
- Bag of words
- tf-idf
- Neural embeddings
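A minimal sketch of tf-idf over a toy term-document matrix, using the common textbook variant tf-idf(t, d) = tf(t, d) * log(N / df(t)); the documents are made up.

```python
import math
from collections import Counter

# Toy corpus: three tokenized documents.
docs = [
    "the cat sat".split(),
    "the dog sat".split(),
    "the cat ate".split(),
]

N = len(docs)
df = Counter()                      # document frequency of each term
for d in docs:
    df.update(set(d))

def tfidf(term, doc):
    tf = doc.count(term)            # raw term frequency in this document
    return tf * math.log(N / df[term])

# "the" occurs in every document, so its idf (and tf-idf) is zero;
# "dog" occurs in only one, so it gets a positive weight there.
print(tfidf("the", docs[0]), tfidf("dog", docs[1]))
```

This is why tf-idf downweights frequent function words that a raw bag-of-words count would treat as important.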
#### Vision Representation
- Convolution
- Kernel/filter
- Feature map
- 3x3 kernel
- Smoothing
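The convolution/kernel/feature-map pipeline can be sketched with a valid (no-padding) 2-D convolution; the 3x3 box filter here is the smoothing kernel, and the input image is a synthetic ramp.

```python
import numpy as np

# Valid (no-padding) 2-D convolution of an image with a kernel,
# producing a feature map.
def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Sum of the elementwise product of the kernel and the patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

smooth = np.ones((3, 3)) / 9.0           # 3x3 averaging (smoothing) kernel
image = np.arange(25, dtype=float).reshape(5, 5)
fmap = conv2d_valid(image, smooth)       # 3x3 feature map
print(fmap)
```

A 5x5 input with a 3x3 kernel and no padding yields a 3x3 feature map; each output cell is the mean of the 3x3 patch under it.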
#### Prediction Tasks
- Linear regression
- Logistic regression
- Multiclass one-vs-all
- Clustering
- K-means
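As a clustering sketch, here is Lloyd's algorithm for K-means on four made-up 2-D points; a library implementation (e.g. scikit-learn's `KMeans`) adds initialization strategies and empty-cluster handling that this toy omits.

```python
import numpy as np

# Minimal Lloyd's-algorithm K-means (k clusters, fixed iteration count).
def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random initial centers
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center moves to its cluster's mean.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers, labels = kmeans(X, 2)
```

On this data the two obvious groups are recovered regardless of which points seed the centers.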
#### Validation and Generalization
- Train/validation/test split
- Feature selection on the training split only (avoiding leakage)
- Leave-one-out cross-validation
- Bias-variance tradeoff
- Overfitting
- Learning curves
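Leave-one-out cross-validation can be sketched in a few lines: train on all points but one, test on the held-out point, and average. The 1-nearest-neighbor classifier and the data here are illustrative stand-ins for whatever model is being validated.

```python
import numpy as np

# Leave-one-out cross-validation over a dataset (X, y) and a predict function.
def loocv_accuracy(X, y, predict):
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i          # hold out example i
        hits += predict(X[mask], y[mask], X[i]) == y[i]
    return hits / len(X)

def nearest_neighbor(X_train, y_train, x):
    """1-NN classifier used as the model under evaluation."""
    return y_train[np.linalg.norm(X_train - x, axis=1).argmin()]

X = np.array([[0.0], [1.0], [10.0], [11.0]])
y = np.array([0, 0, 1, 1])
print(loocv_accuracy(X, y, nearest_neighbor))
```

LOOCV is the extreme case of k-fold cross-validation (k = N): low bias in the error estimate, but N model fits.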
#### Evaluation Metrics
- Accuracy
- Confusion matrix
- Precision
- Recall
- F1-score
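The metrics above follow directly from the cells of a binary confusion matrix; the counts below are made up for illustration.

```python
# Precision, recall, and F1 from binary confusion-matrix counts.
def prf1(tp, fp, fn):
    precision = tp / (tp + fp)     # of predicted positives, how many are right
    recall = tp / (tp + fn)        # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Example: 8 true positives, 2 false positives, 4 false negatives.
p, r, f1 = prf1(tp=8, fp=2, fn=4)
print(p, r, f1)
```

Note that accuracy also needs the true-negative count, which precision, recall, and F1 deliberately ignore; that is why they are preferred on imbalanced data.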
---
### DAY 3. Data Preprocessing for NLP
#### NLP Tools
- Gensim
- spaCy
- IBM Watson
- MonkeyLearn
- TextBlob
- Stanford CoreNLP
- Google Cloud Natural Language API
- NLTK
#### NLTK Basics
- Package/module hierarchy
- Data-oriented classes
- Task-oriented classes
- token/probability/tree/CFG/tagger/parser/classifier/corpus modules
#### Token and Corpus Concepts
- Tokenization
- Word token vs word type
- Corpora (raw/annotated)
- Corpus access (raw text, words, sentences)
#### Preprocessing Steps
- Remove punctuation
- Lowercase
- Remove numbers
- Remove stop words
- Stop-word handling with scikit-learn
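The preprocessing steps above, applied in order, might look like this. The stop-word set here is a tiny illustrative subset; scikit-learn ships a fuller list as `sklearn.feature_extraction.text.ENGLISH_STOP_WORDS`.

```python
import re
import string

# Tiny illustrative stop-word list (real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "and", "to"}

def preprocess(text):
    text = text.lower()                                               # lowercase
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    text = re.sub(r"\d+", "", text)                                   # remove numbers
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]                 # remove stop words

print(preprocess("The 2 cats, and a dog, sat in the garden!"))
```

Order matters: lowercasing before stop-word filtering ensures "The" matches the lowercase list entry.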
#### Morphology and Tagging
- Stemming
- Lemmatization
- POS tagging
- Tagset
- Default/regex/unigram/n-gram tagger
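To make the stemming idea concrete, here is a toy suffix-stripping stemmer in the spirit of NLTK's `PorterStemmer`; the real Porter algorithm has many more rules and measure conditions, and lemmatization additionally consults a dictionary and POS tags.

```python
# Toy suffix stripping: remove the first matching suffix, keeping
# at least a 3-character stem (a crude stand-in for Porter's conditions).
SUFFIXES = ["ing", "edly", "ed", "ly", "es", "s"]

def toy_stem(word):
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word

print([toy_stem(w) for w in ["running", "jumped", "cats", "quickly"]])
```

Note the characteristic stemmer behavior: stems need not be real words ("running" becomes "runn"), which is exactly what lemmatization fixes at the cost of needing a lexicon.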
#### Modeling Bridge
- Shallow classification pipeline
- Deep classification pipeline
- Perceptron
- Multi-layer perceptron
- Backpropagation
- Computation graph
#### Neural Network Basics
- Capacity vs hidden size
- Regularization (weight penalty)
- Multi-class output
- Softmax
- Cross-entropy / log-likelihood
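The softmax output and cross-entropy loss can be sketched in NumPy; the logits below are made-up values for a 3-class problem.

```python
import numpy as np

# Softmax over class logits, with the standard max-shift for
# numerical stability (does not change the result).
def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Cross-entropy = negative log-likelihood of the true class.
def cross_entropy(logits, true_class):
    return -np.log(softmax(logits)[true_class])

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs, cross_entropy(logits, 0))
```

The loss is small when the model assigns high probability to the true class and grows without bound as that probability approaches zero.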
---
### DAY 4. Introduction to LLMs and Their Applications
#### Transformer Basics
- LLMs built on Transformers
- Attention-centered architecture
- Residual stream
- Feed-forward layer
- Layer normalization
#### Embeddings
- Static embeddings (word2vec)
- Contextual embeddings
- Context-dependent token meaning
#### Attention Mechanism
- Weighted context integration
- Dot-product similarity
- Left-to-right autoregressive attention
- Multi-head attention
#### Matrix Formulation
- Input matrix X [N x d]
- Q, K, V projections
- QK^T scores
- Scaling + softmax + V multiplication
- Parallel token computation
#### Generation Constraints
- Causal masking
- No future-token access
- O(N^2) attention complexity
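The matrix formulation and the causal mask combine into a few lines of NumPy. For brevity this sketch sets Q = K = V = X; a real Transformer computes them with learned projection matrices, and the random X stands in for token representations.

```python
import numpy as np

# Scaled dot-product attention with a causal mask:
# scores = QK^T / sqrt(d); future positions set to -inf; softmax; then @ V.
def causal_attention(Q, K, V):
    N, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                     # [N x N] similarity scores
    mask = np.triu(np.ones((N, N), dtype=bool), k=1)  # True above the diagonal
    scores[mask] = -np.inf                            # block access to future tokens
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))         # N=4 tokens, d=8
out, w = causal_attention(X, X, X)
```

The [N x N] score matrix is where the O(N^2) cost comes from, and all N rows are computed in parallel; the mask is what makes generation left-to-right.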
#### Input/Output Heads
- Token embeddings
- Positional embeddings
- Token + position sum
- Language modeling head
#### Tokenization and Ecosystem
- BPE
- Hugging Face Transformers
- Hugging Face Hub
- Datasets
- Tokenizers
- AutoTokenizer / AutoModel
- Trainer
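One training round of BPE can be sketched without any library: find the most frequent adjacent symbol pair and merge it into a new symbol. Real BPE (as in Hugging Face Tokenizers) repeats this for thousands of merges; the word frequencies below are made up.

```python
from collections import Counter

# One BPE training step over a {word-as-symbol-tuple: frequency} vocabulary.
def most_frequent_pair(words):
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])  # fuse the pair
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Words start split into characters.
vocab = {tuple("hug"): 5, tuple("pug"): 2, tuple("hugs"): 3}
pair = most_frequent_pair(vocab)     # ('u', 'g') occurs 10 times
vocab = merge_pair(vocab, pair)
```

After the merge, "ug" is a single vocabulary symbol; tokenizing new text replays the learned merges in order.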
---
### DAY 5. Supervised Learning for NLP Tasks
#### Sentiment Analysis Scope
- Polarity classification
- Positive/Negative/Neutral labels
- Movie review example
#### Sentiment Components
- Holder (source)
- Target (aspect/entity)
- Attitude type
- Polarity and strength
- Sentence/document scope
#### Task Variants
- Binary polarity
- 1-5 rating
- Target/source/attitude extraction
#### Baseline Pipeline
- IMDB polarity setup
- Pang & Lee baseline
- Feature extraction + classifier
#### Tokenization Issues
- HTML/XML markup
- Twitter markup
- Capitalization
- Dates/phone numbers
- Emoticons
#### Feature Engineering
- Adjectives vs all words
- Negation handling with `NOT_` prefix
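The `NOT_` trick can be sketched as follows: after a negation word, prefix subsequent tokens until the next punctuation mark, so "not like" and "like" become distinct features. The negation and punctuation sets here are small illustrative subsets.

```python
# Illustrative (incomplete) negation and punctuation inventories.
NEGATIONS = {"not", "no", "never", "n't"}
PUNCT = {".", ",", "!", "?", ";"}

def mark_negation(tokens):
    out, negating = [], False
    for tok in tokens:
        if tok in PUNCT:
            negating = False                 # punctuation ends the negation scope
            out.append(tok)
        elif tok.lower() in NEGATIONS:
            negating = True                  # start prefixing from the next token
            out.append(tok)
        else:
            out.append("NOT_" + tok if negating else tok)
    return out

print(mark_negation("i did not like this movie , but the acting was fine".split()))
```

The classifier then sees `NOT_like` as a separate (typically negative) feature while "fine" after the comma stays unprefixed.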
#### Naive Bayes
- Label Y from text features X
- Prior / Likelihood / Posterior / Marginal
- Conditional independence assumption
- Multinomial NB
- Boolean Multinomial NB
- Normal vs Boolean comparison
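Boolean Multinomial NB can be sketched end to end on a made-up four-review training set: each document contributes each word type at most once (the Boolean clipping), with add-one smoothing on the likelihoods.

```python
import math
from collections import Counter

train = [("pos", "great great film"), ("pos", "loved it"),
         ("neg", "terrible film"), ("neg", "hated it")]

counts = {"pos": Counter(), "neg": Counter()}
docs = Counter()
for label, text in train:
    docs[label] += 1
    counts[label].update(set(text.split()))   # set() = Boolean clipping

vocab = {w for c in counts.values() for w in c}

def log_posterior(label, text):
    lp = math.log(docs[label] / sum(docs.values()))        # log prior
    total = sum(counts[label].values())
    for w in set(text.split()):
        if w in vocab:                                     # skip unseen words
            # Add-one smoothed log likelihood.
            lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
    return lp

def classify(text):
    return max(counts, key=lambda lab: log_posterior(lab, text))

print(classify("great film"))
```

Note how clipping makes the repeated "great" in the first review count once; for sentiment this Boolean variant often matches or beats raw counts.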
#### Hard Cases
- Subtle wording
- Mixed sentiment
- Domain-specific language
#### LLM Classification with LangChain
- Zero-shot / few-shot classification
- PromptTemplate
- OutputParser
- LLMChain
- Runnable composition
---
### DAY 6. Advanced NLP Techniques
#### Sequence Modeling
- RNN
- Language modeling with RNN
- Sampling with language model
- Seq2Seq
- Conditional language modeling
#### GPT and BERT
- GPT (autoregressive)
- Attention masking
- Teacher forcing
- Generation
- BERT (bidirectional)
- Masked language modeling
- Transfer learning
- Fine-tuning
- BERT vs GPT
#### Embedding Topics
- Pre-LLM vs post-LLM embedding era
- Polysemy issue
- Vector space models
- Relatedness
- Word analogy
- Dense embedding learning
- Embedding layer (`torch.nn.Embedding`)
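Dense embeddings are just rows of a matrix (which is what `torch.nn.Embedding` stores), and relatedness is cosine similarity between rows. The vectors below are hand-made to demonstrate the analogy arithmetic, not learned.

```python
import numpy as np

# Made-up 3-d embeddings; real ones are learned and far higher-dimensional.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.5, 0.9, 0.0]),
    "woman": np.array([0.5, 0.0, 0.9]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Word analogy: king - man + woman should land nearest to queen.
target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)
```

Static embeddings like these assign one vector per word type, which is exactly the polysemy problem that contextual (post-LLM) embeddings address.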
#### Reinforcement Learning Topics
- RL basics
- Agent/environment/reward
- Exploration vs exploitation
- Multi-armed bandit
- Greedy
- Epsilon-greedy
- 10-armed testbed
- Deep RL
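The exploration/exploitation topics above can be sketched as epsilon-greedy action selection on a k-armed bandit with incremental sample-average value estimates (the 10-armed-testbed setup, scaled down to 3 arms with made-up means).

```python
import numpy as np

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    k = len(true_means)
    Q = np.zeros(k)      # estimated value of each arm
    N = np.zeros(k)      # pull counts
    for _ in range(steps):
        if rng.random() < epsilon:
            a = int(rng.integers(k))     # explore: random arm
        else:
            a = int(Q.argmax())          # exploit: current best estimate
        reward = rng.normal(true_means[a], 1.0)  # noisy reward from the environment
        N[a] += 1
        Q[a] += (reward - Q[a]) / N[a]   # incremental mean update
    return Q, N

Q, N = run_bandit(np.array([0.1, 0.5, 1.5]))
```

With epsilon = 0.1 the agent keeps exploring 10% of the time, so the estimates for all arms stay current while the best arm absorbs most pulls; epsilon = 0 (pure greedy) can lock onto a suboptimal arm forever.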
---
### DAY 7. Language Generation with LLMs
#### RLHF Core
- Reinforcement Learning from Human Feedback
- Alignment-oriented generation
- Preference-based optimization
#### RLHF Pipeline
- Instruction-tuned base model
- Human feedback interface
- Preference/reward model
- RL fine-tuning
#### RL Optimization
- PPO
- Reward model signal
- KL penalty
- Combined rewards
- Feedback-training loop
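The combined-reward idea can be sketched numerically: the reward-model score is offset by a KL penalty that keeps the policy close to the reference model, with the KL term approximated per token by the log-probability gap. All values below are hypothetical.

```python
# Combined RLHF signal: reward = r_RM - beta * KL(policy || reference),
# with KL approximated as sum_t [log p_policy(y_t) - log p_ref(y_t)].
def combined_reward(rm_score, logp_policy, logp_ref, beta=0.1):
    kl_est = sum(p - r for p, r in zip(logp_policy, logp_ref))
    return rm_score - beta * kl_est

# Hypothetical per-token log-probs for a 3-token response.
logp_policy = [-0.5, -1.0, -0.2]
logp_ref    = [-1.0, -1.2, -0.9]
print(combined_reward(rm_score=2.0, logp_policy=logp_policy, logp_ref=logp_ref))
```

The penalty grows as the fine-tuned policy drifts from the reference, which is what stops PPO from reward-hacking the reward model with degenerate text.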
#### Evaluation
- Human evaluation
- LLM-as-a-judge
- Leaderboards
#### Alignment and SFT
- Superficial Alignment Hypothesis
- InstructGPT (2022)
- Claude (2022)
- Llama 2 (2023)
- Supervised Fine-Tuning (SFT)
- Annotation scale and quality
---
### DAY 8. Ethical Considerations and Challenges in LLMs
#### Ethics Scope
- Ethics of AI/LLMs
- Dual-use risk
- Risk amplification
Examples:
- Richard Handl's Kitchen Reactor (2011)
- Mousepox experiment (2001)
#### Ethics Foundations
- What is ethics
- Ethical theories
- Trolley-style cases
#### Bias and Fairness
- Hiring-related bias cases
- COMPAS case
- Statistical bias
- Fairness perspectives
- Group fairness
#### Bias in LLMs
- Data bias
- Annotation/preference bias
- Objective/optimization bias
- Deployment feedback loops
- Synthetic data for bias discovery
#### Ethical Issue Categories
- Human agency and oversight
- Technical robustness and safety
- Privacy and data governance
- Transparency
- Diversity / non-discrimination / fairness
- Societal and environmental well-being
- Accountability
#### Security and Mitigation
- Prompt injection
- Fairness metrics
- Bias detection/mitigation tools
- MBIAS
- Governance controls
---
### DAY 9. Model Deployment and Integration in Applications
#### Reasoning Prompting
- Chain-of-Thought prompting
- Few-shot prompting
- Arithmetic reasoning results
- Ablation studies
- Robustness checks
- Common-sense and symbolic reasoning
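Few-shot chain-of-thought prompting is mostly prompt assembly: each exemplar shows the reasoning steps before the answer, then the new question is appended. The exemplar below is the classic arithmetic example from the CoT literature; the question text is a placeholder.

```python
# (question, worked answer with reasoning steps) exemplars.
examples = [
    ("Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. "
     "How many tennis balls does he have now?",
     "Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
     "5 + 6 = 11. The answer is 11."),
]

def cot_prompt(question):
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")      # the model continues from here
    return "\n\n".join(parts)

prompt = cot_prompt("A cafeteria had 23 apples. They used 20 and bought 6 more. How many are left?")
print(prompt)
```

Without the worked reasoning in the exemplar (plain few-shot), models tend to emit an answer directly; the ablation studies above compare exactly these two prompt formats.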
#### Deployment Decisions
- API-based deployment
- Self-hosted deployment
- API vs self-host tradeoff
- OpenAI-compatible API importance
#### Inference Optimization
- Inference optimization
- Pricing model
- Per-token latency vs throughput
- Test-time compute scaling
- Prefix caching
- Prefix caching best practices
#### MLOps and LLMOps
- Typical ML pipeline
- Deployment gap
- MLOps principles
- Continuous workflows
- Version everything
- Automation
- Testing
- Reusability
#### Lifecycle and Operations
- MLOps tools
- LLM lifecycle in production
- Monitoring
- Versioning and rollback
- Cost/latency/throughput tracking
- Security and governance