Introduction to large language models : (Record no. 240856)

MARC details
000 -LEADER
fixed length control field	13783nam a22002177a 4500
005 - DATE AND TIME OF LATEST TRANSACTION
control field	20260210155124.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	260209b |||||||| |||| 00| 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number	9789363864740
040 ## - CATALOGING SOURCE
Transcribing agency	AIMIT LIBRARY
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Edition number	1
Classification number	006.3
Item number	CHAT
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name	Chakraborty, Tanmoy.
9 (RLIN)	254137
245 ## - TITLE STATEMENT
Title	Introduction to large language models :
Remainder of title	generative AI for text /
Statement of responsibility, etc.	By Tanmoy Chakraborty.
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Place of publication, distribution, etc.	New Delhi :
Name of publisher, distributor, etc.	Wiley India Pvt Ltd,
Date of publication, distribution, etc.	2025.
300 ## - PHYSICAL DESCRIPTION
Extent	xxi, 461 p. ;
Other physical details	PB
Dimensions	24 cm.
500 ## - GENERAL NOTE
General note	Introduction to Large Language Models (LLMs) is a comprehensive guide for understanding the foundations and advancements of Generative AI for Text. Designed for educators and enthusiasts, the book starts with key linguistic concepts and progresses through NLP fundamentals, from word embeddings to pretrained foundational models.
	Readers will learn how LLMs process and generate language, overcome limitations, and enhance performance using techniques like prompt engineering, retrieval-augmented generation, and human alignment. The book uniquely presents cutting-edge research in a concise format, enriched with visual aids, exercises, and practical resources.
	Ideal for computer science faculty, this resource offers both theoretical insights and real-world applications, showcasing how LLMs like ChatGPT are transforming technology and advancing AI innovation.
505 ## - FORMATTED CONTENTS NOTE
Formatted contents note	Endorsement -- Preface -- Acknowledgement -- Foreword
	1 Introduction -- 1.1 What is a Language Model? -- 1.2 Evolution of Language Modelling Technologies -- 1.3 Scaling Laws in Language Models -- 1.4 Evolution of LLMs -- 1.4.1 The Emergence and Development of LLMs -- 1.4.2 Implications of Encoder-Decoder in LLM Development -- 1.4.3 Optimising Scale and Resource Efficiency in LLMs -- 1.5 Organisation of the Book -- Additional Resources -- Bibliography
	2 An Overview of Natural Language Processing and Neural Networks -- Part I: Natural Language Processing -- 2.1 Computational Linguistics and Natural Language Processing -- 2.2 Overview of the Natural Language Processing Pipeline -- 2.3 Morphology -- 2.3.1 Morphemes -- 2.3.2 Stemming -- 2.3.3 Lemmatisation -- 2.3.4 Lexicon -- 2.4 Tokenisation -- 2.4.1 Advanced Techniques: Subword Tokenisation -- 2.5 Syntactics -- 2.6 Semantics -- 2.7 Introduction to Language Modelling -- Part II: Neural Networks -- 2.8 The Perceptron -- 2.8.1 Definition -- 2.8.2 Implementing AND, OR, and XOR Logic -- 2.9 Multilayer Perceptron -- 2.9.1 Neural Networks -- 2.9.2 Types of Activation Functions -- 2.10 Training Neural Networks -- 2.10.1 Backpropagation -- 2.10.2 Batching -- 2.10.3 Hyperparameters -- 2.10.4 Regularisation -- 2.11 Vanishing and Exploding Gradients -- 2.12 Evaluation Metrics -- 2.13 Summary -- Additional Resources -- Exercises -- Bibliography
	3 Word Embedding -- 3.1 Distributional Hypothesis -- 3.2 Vector Semantics -- 3.2.1 Defining and Measuring Semantic Similarity -- 3.3 Types of Word Embedding -- 3.3.1 Frequency-Based Embeddings -- 3.3.2 Word2Vec -- 3.3.3 Global Vectors for Word Representation -- 3.3.4 FastText -- 3.4 Bias in Word Embedding -- 3.5 Limitations of Word Embedding Methods -- 3.6 Applications of Word Embeddings -- 3.7 Summary -- Additional Resources -- Exercises -- Bibliography
	4 Statistical Language Model -- 4.1 Statistical Language Model -- 4.1.1 The Conditional Probability -- 4.1.2 The Chain Rule of Probability -- 4.1.3 The Markov Assumption -- 4.1.4 Unigram Language Model -- 4.1.5 Bigram Language Model -- 4.2 Smoothing -- 4.2.1 The Unknown Tokens -- 4.2.2 Smoothing -- 4.2.3 Back-Off -- 4.2.4 Interpolation -- 4.2.5 Good-Turing -- 4.3 Evaluation of Language Model -- 4.3.1 Extrinsic Evaluation -- 4.3.2 Intrinsic Evaluation -- 4.3.3 Human Evaluation -- 4.3.4 Evaluation Metrics -- 4.3.5 Benchmark Suits -- 4.4 Limitations of Statistical Language Models -- 4.5 Summary -- Additional Resources -- Exercises -- Bibliography
	5 Neural Language Models -- 5.1 Convolutional Neural Networks -- 5.1.1 Components of CNNs: Kernel, Stride, Pooling, and Padding -- 5.1.2 Hierarchical and Dilated Convolutions -- 5.1.3 Applications of CNNs in NLP -- 5.2 Recurrent Neural Networks -- 5.2.1 Training RNNs -- 5.2.2 Applications of RNNs -- 5.2.3 Challenges in Sequence Modelling -- 5.2.4 RNN Variants: LSTM, GRU, and Bidirectional RNNs -- 5.3 Sequence-to-Sequence Models -- 5.3.1 Training Sequence-to-Sequence Models -- 5.3.2 Inference Decoding -- 5.3.3 Applications of Sequence-to-Sequence Models -- 5.4 Attention Mechanisms -- 5.4.1 Introduction to Attention -- 5.4.2 Advantages of Attention -- 5.4.3 Variants of Attention -- 5.5 Limitations of Neural Language Models -- 5.6 Summary -- Additional Resources -- Exercises -- Bibliography
	6 Transformers -- 6.1 Self-Attention -- 6.1.1 Multi-Head Self-Attention -- 6.2 Transformer Encoder Block -- 6.2.1 Components of the Transformer Encoder Block -- 6.2.2 Feed-Forward Neural Network -- 6.2.3 Layer Normalisation -- 6.2.4 Residual Connections -- 6.3 Transformer Decoder Block -- 6.3.1 Masked Multi-Head Self-Attention -- 6.3.2 Cross-Attention (Encoder-Decoder Attention) -- 6.4 Positional Embeddings -- 6.4.1 Types of Positional Embeddings -- 6.4.2 Rotary Position Embedding -- 6.5 Efficient Attention Mechanisms -- 6.5.1 KV Caching in Multi-Head Self-Attention -- 6.5.2 Multi-Query Attention -- 6.5.3 Grouped-Query Attention -- 6.5.4 Sliding Window Attention -- 6.6 An Alternate Formulation of Transformers -- 6.6.1 Residual Stream Perspective of Transformers -- 6.6.2 Attention Heads: Reading and Writing -- 6.6.3 Feed-Forward Networks: Transformation of Residual Streams -- 6.6.4 Prediction Head: Generating the Next Token -- 6.6.5 Decomposing the Transformer: Attention and Feed-Forward Contributions -- 6.6.6 Residual Networks as Shallow Ensembles -- 6.6.7 Interpreting the Mechanism of LLMs -- 6.7 Summary -- Additional Resources -- Exercises -- Bibliography
	7 Language Model Pretraining -- 7.1 Embeddings from Language Model -- 7.1.1 Architecture and Training of ELMo -- 7.1.2 Applications of ELMo -- 7.1.3 Limitations of ELMo -- 7.2 Evaluation Datasets -- 7.3 Encoder-Based Pretraining -- 7.3.1 Fundamentals of Encoder-Based Models -- 7.3.2 Training Paradigm -- 7.3.3 BERT Pretraining -- 7.3.4 Applications and Limitations -- 7.4 Decoder-Based Pretraining -- 7.4.1 Decoder-Based Architecture -- 7.4.2 Training Paradigm -- 7.4.3 GPT Pretraining -- 7.4.4 Applications and Limitations -- 7.5 Encoder-Decoder Based Pretraining -- 7.5.1 Architecture -- 7.5.2 Joint Pretraining Strategy -- 7.5.3 T5 Pretraining -- 7.5.4 Applications and Limitations -- 7.6 Emergence of Large Language Models -- 7.7 Limitations of Pretraining -- 7.8 Summary -- Additional Resources -- Exercises -- Bibliography
	8 Fine-Tuning and Alignment of LLMs -- 8.1 Moving from Pretraining to Fine-Tuning -- 8.2 Fine-Tuning on Various Task-Specific Applications -- 8.2.1 Sequence Classification -- 8.2.2 Pairwise Sequence Classification -- 8.2.3 Sequence Labelling -- 8.2.4 Learning Spans -- 8.2.5 Challenges in Classical Fine-Tuning Methods -- 8.3 Instruction Tuning -- 8.4 Alignment Methods -- 8.4.1 Reinforcement Learning from Human Feedback -- 8.4.2 Direct Preference Optimisation -- 8.5 Summary -- Additional Resources -- Exercises -- Bibliography
	9 Prompting Strategies in LLMs -- 9.1 Prompt Engineering -- 9.1.1 Prompt Shape -- 9.1.2 Manual Template Engineering -- 9.1.3 Automated Template Learning -- 9.1.4 Continuous Prompts -- 9.2 Prompt Application -- 9.2.1 In-Context Learning -- 9.2.2 Knowledge Probing -- 9.2.3 Classification-Based Tasks -- 9.2.4 Information Extraction -- 9.2.5 Reasoning in Natural Language Processing -- 9.2.6 Question Answering -- 9.2.7 Text Generation -- 9.2.8 Automatic Evaluation of Text Generation -- 9.3 Chain-of-Thoughts -- 9.4 Tree-of-Thoughts -- 9.5 Graph-of-Thoughts -- 9.6 Summary -- Additional Resources -- Exercises -- Bibliography
	10 Efficient Methods for Fine-Tuning LLMs -- 10.1 Model Compression with Knowledge Distillation -- 10.1.1 White-Box Knowledge Distillation -- 10.1.2 Meta Knowledge Distillation -- 10.1.3 Black-Box Knowledge Distillation -- 10.2 Model Compression Techniques -- 10.2.1 Model Pruning -- 10.2.2 Model Quantisation -- 10.3 Parameter-Efficient Fine-Tuning -- 10.3.1 Adapters -- 10.3.2 Prefix Tuning -- 10.3.3 Prompt Tuning -- 10.3.4 Selective PEFT Techniques -- 10.3.5 Reparameterisation-Based PEFT Techniques -- 10.3.6 Hybrid Approaches for Efficient Fine-Tuning -- 10.4 Efficient Strategies for Fine-Tuning LLMs -- 10.4.1 Mixed-Precision Tuning -- 10.4.2 Data Selection for Efficient Fine-Tuning -- 10.4.3 Prompt Compression -- 10.5 Summary -- Additional Resources -- Exercises -- Bibliography
	11 Augmented Large Language Models -- 11.1 Retrieval-Augmented Generation -- 11.1.1 Indexing in RAGs -- 11.1.2 Context Searching in RAGs -- 11.1.3 Prompting in RAGs -- 11.1.4 Inferencing in RAGs -- 11.1.5 Comparison of RAGs with LLMs -- 11.2 Evaluation of RAGs -- 11.2.1 Assessing of Retrieval Quality -- 11.2.2 Generation Quality -- 11.2.3 Knowledge Integration and Factuality Evaluation -- 11.2.4 Response Time and Efficiency -- 11.2.5 User Satisfaction -- 11.2.6 RAGAs Framework for RAG Evaluation -- 11.3 Tool Calling with LLMs -- 11.3.1 Autonomously Determining Which Tools to Use and Where -- 11.3.2 Examples of Different Tools -- 11.3.3 Evaluation of Code Generation Capabilities of Agents -- 11.3.4 Error Handling and Optimisation -- 11.4 LLM Augmentation with Agents -- 11.4.1 Reasoning in LLM Agents -- 11.4.2 Planning in LLM Agents -- 11.4.3 Handling Memory in LLM Agents -- 11.5 Summary -- Additional Resources -- Exercises -- Bibliography
	12 Multilingual and Multimodal LLMs -- 12.1 Multilingual Language Models -- 12.1.1 The Evolution of Multilingual NLP -- 12.1.2 The Need for Multilingual LLMs -- 12.1.3 Cross-Lingual Representation Learning -- 12.1.4 Applications -- 12.2 Multimodal Language Models -- 12.2.1 Integration of Diverse Modalities -- 12.2.2 Applications -- 12.3 Training Multilingual and Multimodal LLMs -- 12.3.1 Efficient Data Collection and Preprocessing -- 12.3.2 Model Training Strategies -- 12.4 Addressing Challenges in Multilingual and Multimodal LLMs -- 12.4.1 Challenges in Multilingual LLMs -- 12.4.2 Challenges in Multimodal LLMs -- 12.5 Future Directions and Emerging Trends -- 12.6 Limitations of Multilingual and Multimodal LLMs -- 12.7 Summary -- Additional Resources -- Exercises -- Bibliography
	13 Responsible LLMs -- 13.1 Inaccurate, Inappropriate, and Unethical Behaviour of LLMs -- 13.2 Responsible AI -- 13.3 Bias -- 13.3.1 Visibility of Bias -- 13.3.2 Source of Bias -- 13.4 Bias Mitigation -- 13.5 Summary -- Additional Resources -- Exercises -- Bibliography
	14 Advanced Topics in Large Language Models -- 14.1 Reasoning with LLMs -- 14.1.1 Advancements in Reasoning Capabilities -- 14.1.2 Challenges in Reasoning with LLMs -- 14.1.3 Types of Reasoning Tasks -- 14.1.4 How Do LLMs Approach Reasoning? -- 14.1.5 Evaluating Reasoning Abilities in LLMs -- 14.2 Handling Long Context in LLMs -- 14.2.1 Challenges in Processing Long Context -- 14.2.2 Training and Fine-Tuning Approaches to Extend Context Length -- 14.2.3 Evaluation of Long-Context LLMs -- 14.3 Model Editing -- 14.3.1 Conditions for Successful Editing -- 14.3.2 Methods for Model Editing -- 14.3.3 Metrics for Evaluation of Model Editing -- 14.4 Hallucination in LLMs -- 14.4.1 Definition -- 14.4.2 Sources of Hallucination -- 14.4.3 Metrics Measuring Hallucination -- 14.4.4 Hallucination Mitigation -- 14.5 Self-Evolving LLMs -- 14.5.1 Conceptual Framework -- 14.5.2 Evolution Objectives and Techniques -- 14.5.3 Challenges -- 14.6 Summary -- Additional Resources -- Exercises -- Bibliography
	15 LLMs in Action -- 15.1 An Overview of the Landscape -- 15.1.1 Tracing the Evolution and Importance of LLMs in Contemporary AI -- 15.1.2 Open-Source vs Closed-Source Paradigms: Benefits and Trade-offs -- 15.2 A Panoramic View of LLMs -- 15.2.1 General-Purpose Large Language Models -- 15.2.2 Language-Specific LLMs -- 15.2.3 Domain-Specific LLMs -- 15.2.4 Task-Specific LLMs -- 15.3 Diverse Applications of LLMs -- 15.3.1 Healthcare: Enhancing Diagnostics and Patient Care -- 15.3.2 Finance: Transforming Data Analysis and Risk Management -- 15.3.3 Legal: Streamlining Research and Case Management -- 15.3.4 Education: Personalised Learning and Academic Support -- 15.4 Emerging Trends and Future Directions in LLMs -- 15.4.1 Beyond Text: The Advent of Multimodal LLMs -- 15.4.2 Autonomous Agents: The LLM Leap in AI Evolution (AutoGPT) -- 15.5 Summary -- Additional Resources -- Exercises -- Bibliography
	Index
Statement of responsibility	Dr. Tanmoy Chakraborty is an Associate Professor in the Department of Electrical Engineering at IIT Delhi and an Associate Faculty Member at the Yardi School of Artificial Intelligence. An ACM Distinguished Speaker (2023–2025) and former Ramanujan Fellow (2018–2023), he has held key academic roles, including heading the Infosys Centre for Artificial Intelligence at IIIT Delhi.
	Dr. Chakraborty earned his Ph.D. as a Google India scholar at IIT Kharagpur and completed a postdoctoral fellowship at the University of Maryland, College Park. His research spans Natural Language Processing (NLP), Graph Neural Networks, and Social Computing, with a focus on creating frugal, explainable LLMs for applications in mental health and cyber-informatics.
	He leads the Laboratory for Computational Social Systems (LCS2) and is also the recipient of multiple faculty awards from Google, Adobe, and Accenture,
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Word embedding
9 (RLIN)	254138
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Language model pretraining
9 (RLIN)	254139
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Prompting strategies in LLMs
9 (RLIN)	254140
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme	Dewey Decimal Classification
Koha item type	Book
Edition	1
Call number prefix	006.3 CHAT

Holdings
Withdrawn status	Lost status	Source of classification or shelving scheme	Damaged status	Not for loan	Collection code	Home library	Current library	Shelving location	Date acquired	Cost, normal purchase price	Inventory number	Total Checkouts	Full call number	Barcode	Date last seen	Cost, replacement price	Price effective from	Koha item type
		Dewey Decimal Classification			MCA	St Aloysius Institute of Management & Information Technology	St Aloysius Institute of Management & Information Technology	Artificial intelligence	02/03/2026	845.00	Bill.no:1288 Bill.dt:2026/01/23		006.3 CHAT	MCA17367	05/23/2026	633.75	02/09/2026	Book