Fundamentals of data science : (Record no. 240863)

MARC details
000 -LEADER
fixed length control field 11883nam a22002537a 4500
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20260210150604.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 260210b |||||||| |||| 00| 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9789363860759
040 ## - CATALOGING SOURCE
Transcribing agency AIMIT LIBRARY
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Edition number 1
Classification number 005.76
Item number LARC
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Larose, Chantal D.
9 (RLIN) 254221
245 ## - TITLE STATEMENT
Title Fundamentals of data science :
Remainder of title using Python and R /
Statement of responsibility, etc. By Chantal D Larose, Daniel T Larose and Shaukat Ali Shahee.
250 ## - EDITION STATEMENT
Edition statement 1st ed.
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Place of publication, distribution, etc. New Delhi :
Name of publisher, distributor, etc. Wiley,
Date of publication, distribution, etc. 2025.
300 ## - PHYSICAL DESCRIPTION
Extent xx, 256p. ;
Other physical details PB
Dimensions 24.9 cm
500 ## - GENERAL NOTE
General note Fundamentals of Data Science Using Python and R is an essential resource for students and professionals eager to explore data science with Python and R, two of the most popular open-source tools. The book covers the entire Data Science Methodology—from problem understanding to model deployment—and has been widely praised for its clarity and practicality. This adapted edition retains the core structure of the original, while enhancing end-of-chapter questions to suit the Indian academic environment. New examples and exercises focus on India-specific datasets, encouraging students to apply their knowledge to real-world scenarios relevant to India’s socio-economic and technological contexts. This hands-on approach ensures students gain both theoretical understanding and practical skills for a data-driven world.
505 ## - FORMATTED CONTENTS NOTE
Formatted contents note PREFACE TO THE ADAPTED EDITION
PREFACE TO THE US EDITION
ACKNOWLEDGMENTS
ABOUT THE AUTHORS

CHAPTER 1 INTRODUCTION TO DATA SCIENCE
1.1 Why Data Science?
1.2 What Is Data Science?
1.3 The Data Science Methodology
1.4 Data Science Tasks
1.4.1 Description
1.4.2 Estimation
1.4.3 Classification
1.4.4 Clustering
1.4.5 Prediction
1.4.6 Association
Exercises

CHAPTER 2 THE BASICS OF PYTHON AND R
2.1 Downloading Python
2.2 Basics of Coding in Python
2.2.1 Using Comments in Python
2.2.2 Executing Commands in Python
2.2.3 Importing Packages in Python
2.2.4 Getting Data into Python
2.2.5 Saving Output in Python
2.2.6 Accessing Records and Variables in Python
2.2.7 Setting Up Graphics in Python
2.3 Downloading R and RStudio
2.4 Basics of Coding in R
2.4.1 Using Comments in R
2.4.2 Executing Commands in R
2.4.3 Importing Packages in R
2.4.4 Getting Data into R
2.4.5 Saving Output in R
2.4.6 Accessing Records and Variables in R
References
Exercises

CHAPTER 3 DATA PREPARATION
3.1 The Bank Marketing Data Set
3.2 The Problem Understanding Phase
3.2.1 Clearly Enunciate the Project Objectives
3.2.2 Translate These Objectives into a Data Science Problem
3.3 Data Preparation Phase
3.4 Adding an Index Field
3.4.1 How to Add an Index Field Using Python
3.4.2 How to Add an Index Field Using R
3.5 Changing Misleading Field Values
3.5.1 How to Change Misleading Field Values Using Python
3.5.2 How to Change Misleading Field Values Using R
3.6 Reexpression of Categorical Data as Numeric
3.6.1 How to Reexpress Categorical Field Values Using Python
3.6.2 How to Reexpress Categorical Field Values Using R
3.7 Standardizing the Numeric Fields
3.7.1 How to Standardize Numeric Fields Using Python
3.7.2 How to Standardize Numeric Fields Using R
3.8 Identifying Outliers
3.8.1 How to Identify Outliers Using Python
3.8.2 How to Identify Outliers Using R
References
Exercises

CHAPTER 4 EXPLORATORY DATA ANALYSIS
4.1 EDA Versus HT
4.2 Bar Graphs with Response Overlay
4.2.1 How to Construct a Bar Graph with Overlay Using Python
4.2.2 How to Construct a Bar Graph with Overlay Using R
4.3 Contingency Tables
4.3.1 How to Construct Contingency Tables Using Python
4.3.2 How to Construct Contingency Tables Using R
4.4 Histograms with Response Overlay
4.4.1 How to Construct Histograms with Overlay Using Python
4.4.2 How to Construct Histograms with Overlay Using R
4.5 Binning Based on Predictive Value
4.5.1 How to Perform Binning Based on Predictive Value Using Python
4.5.2 How to Perform Binning Based on Predictive Value Using R
References
Exercises

CHAPTER 5 PREPARING TO MODEL THE DATA
5.1 The Story So Far
5.2 Partitioning the Data
5.2.1 How to Partition the Data in Python
5.2.2 How to Partition the Data in R
5.3 Validating Your Partition
5.4 Balancing the Training Data Set
5.4.1 How to Balance the Training Data Set in Python
5.4.2 How to Balance the Training Data Set in R
5.5 Establishing Baseline Model Performance
References
Exercises

CHAPTER 6 DECISION TREES
6.1 Introduction to Decision Trees
6.2 Classification and Regression Trees
6.2.1 How to Build CART Decision Trees Using Python
6.2.2 How to Build CART Decision Trees Using R
6.3 The C5.0 Algorithm for Building Decision Trees
6.3.1 How to Build C5.0 Decision Trees Using Python
6.3.2 How to Build C5.0 Decision Trees Using R
6.4 Random Forests
6.4.1 How to Build Random Forests in Python
6.4.2 How to Build Random Forests in R
References
Exercises

CHAPTER 7 MODEL EVALUATION
7.1 Introduction to Model Evaluation
7.2 Classification Evaluation Measures
7.3 Sensitivity and Specificity
7.4 Precision, Recall, and Fβ Scores
7.5 Method for Model Evaluation
7.6 An Application of Model Evaluation
7.6.1 How to Perform Model Evaluation Using R
7.7 Accounting for Unequal Error Costs
7.7.1 Accounting for Unequal Error Costs Using R
7.8 Comparing Models with and Without Unequal Error Costs
7.9 Data-Driven Error Costs
Exercises

CHAPTER 8 NAÏVE BAYES CLASSIFICATION
8.1 Introduction to Naïve Bayes
8.2 Bayes Theorem
8.3 Maximum a Posteriori Hypothesis
8.4 Class Conditional Independence
8.5 Application of Naïve Bayes Classification
8.5.1 Naïve Bayes in Python
8.5.2 Naïve Bayes in R
References
Exercises

CHAPTER 9 NEURAL NETWORKS
9.1 Introduction to Neural Networks
9.2 The Neural Network Structure
9.3 Connection Weights and the Combination Function
9.4 The Sigmoid Activation Function
9.5 Backpropagation
9.6 An Application of a Neural Network Model
9.7 Interpreting the Weights in a Neural Network Model
9.8 How to Use Neural Networks in R
9.9 How to Use Neural Networks in Python
References
Exercises

CHAPTER 10 CLUSTERING
10.1 What Is Clustering?
10.2 Introduction to the k-Means Clustering Algorithm
10.3 An Application of k-Means Clustering
10.4 Cluster Validation
10.5 How to Perform k-Means Clustering Using Python
10.5.1 k-Means Python Example Using Sklearn
10.6 How to Perform k-Means Clustering Using R
Exercises

CHAPTER 11 REGRESSION MODELING
11.1 The Estimation Task
11.2 Descriptive Regression Modeling
11.3 An Application of Multiple Regression Modeling
11.4 How to Perform Multiple Regression Modeling Using Python
11.5 How to Perform Multiple Regression Modeling Using Sklearn Python
11.6 How to Perform Multiple Regression Modeling Using R
11.7 Model Evaluation for Estimation
11.7.1 How to Perform Stepwise Regression Using Python
11.7.2 How to Perform Estimation Model Evaluation Using Python
11.7.3 How to Perform Estimation Model Evaluation Using R
11.8 Stepwise Regression
11.8.1 How to Perform Stepwise Regression Using R
11.9 Baseline Models for Regression
References
Exercises

CHAPTER 12 DIMENSION REDUCTION
12.1 The Need for Dimension Reduction
12.2 Multicollinearity
12.3 Identifying Multicollinearity Using Variance Inflation Factors
12.3.1 How to Identify Multicollinearity Using Python
12.3.2 How to Identify Multicollinearity in R
12.4 Principal Components Analysis
12.5 An Application of Principal Components Analysis
12.6 How Many Components Should We Extract?
12.6.1 The Eigenvalue Criterion
12.6.2 The Proportion of Variance Explained Criterion
12.7 Performing PCA with k = 4
12.8 Validation of the Principal Components
12.9 How to Perform Principal Components Analysis Using Python
12.10 How to Perform Principal Components Analysis Using R
12.11 When Is Multicollinearity Not a Problem?
References
Exercises

CHAPTER 13 GENERALIZED LINEAR MODELS
13.1 An Overview of General Linear Models
13.2 Linear Regression As a General Linear Model
13.3 Logistic Regression As a General Linear Model
13.4 An Application of Logistic Regression Modeling
13.4.1 How to Perform Logistic Regression Using Python
13.4.2 How to Perform Logistic Regression Using R
13.5 Poisson Regression
13.6 An Application of Poisson Regression Modeling
13.6.1 How to Perform Poisson Regression Using Python
13.6.2 How to Perform Poisson Regression Using R
Reference
Exercises

CHAPTER 14 ASSOCIATION RULES
14.1 Introduction to Association Rules
14.2 A Simple Example of Association Rule Mining
14.3 Support, Confidence, and Lift
14.4 Mining Association Rules
14.4.1 How to Mine Association Rules Using R
14.5 Confirming Our Metrics
14.6 The Confidence Difference Criterion
14.6.1 How to Apply the Confidence Difference Criterion Using R
14.7 The Confidence Quotient Criterion
14.7.1 How to Apply the Confidence Quotient Criterion Using R
Valediction
References
Exercises

APPENDIX DATA SUMMARIZATION AND VISUALIZATION
Part 1 Summarization 1: Building Blocks of Data Analysis
Part 2 Visualization: Graphs and Tables for Summarizing and Organizing Data
A.1 Categorical Variables
A.2 Quantitative Variables
Part 3 Summarization 2: Measures of Center, Variability, and Position
Part 4 Summarization and Visualization of Bivariate Relationships

INDEX
Statement of responsibility Chantal D. Larose
Chantal D. Larose earned her PhD in Statistics from the University of Connecticut in 2015, focusing her dissertation on Model-Based Clustering of Incomplete Data. As an Assistant Professor of Decision Science at SUNY New Paltz, she played a pivotal role in developing the Bachelor of Science in Business Analytics program. Currently, she serves as an Assistant Professor of Statistics and Data Science at Eastern Connecticut State University, contributing to the design of the Mathematical Sciences Department’s data science curriculum.

Daniel T. Larose
Daniel T. Larose completed his PhD in Statistics from the University of Connecticut in 1996, with his dissertation titled Bayesian Approaches to Meta-Analysis. A Professor of Statistics and Data Science at Central Connecticut State University, he pioneered the world’s first online Master of Science in Data Mining in 2001. As the author or coauthor of 12 textbooks, Daniel also directs the online Master of Data Science program at CCSU and operates a consulting business.

Shaukat Ali Shahee
Shaukat Ali Shahee received his PhD from the SJM School of Management at the Indian Institute of Technology Bombay, where his research addressed challenges in analyzing imbalanced data with diverse intrinsic characteristics. His work has been featured in renowned journals such as International Journal of Artificial Intelligence and Soft Computing, Applied Intelligence, and Data Mining and Knowledge Discovery, as well as in the prestigious Advances in Data Mining book series. With 5.5 years of industry experience, he has served as a Quantitative Research Analyst at AlphaCrest Capital Management, a Deputy Manager at Bank of Maharashtra, and a Research Engineer at the IIT Bombay CSE department.
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element The basics of Python and R
9 (RLIN) 254222
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Exploratory data analysis
9 (RLIN) 254223
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Naive Bayes classification
9 (RLIN) 254224
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name Larose, Daniel T.
9 (RLIN) 254225
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name Shahee, Shaukat Ali.
9 (RLIN) 254226
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Book
Edition 1st
Call number prefix 005.76 LARC
Holdings
Source of classification or shelving scheme Dewey Decimal Classification
Collection code MCA
Home library St Aloysius Institute of Management & Information Technology
Current library St Aloysius Institute of Management & Information Technology
Shelving location Data Science
Date acquired 02/03/2026
Source of acquisition KL Book House
Cost, normal purchase price 959.00
Inventory number Bill no:1288; Bill dt:2026-01-23
Full call number 005.76 LARC
Barcode MCA17363
Date last seen 05/23/2026
Cost, replacement price 719.25
Price effective from 02/10/2026
Koha item type Book