Exploratory Data Analysis with Python

(EDA-PYTHON.AJ1) / ISBN: 978-1-64459-298-4

This course includes
Lessons
TestPrep
Hands-On Labs
AI Tutor (Add-on)

Lessons

13+ Lessons | 47+ Exercises | 63+ Quizzes | 80+ Flashcards | 80+ Glossary terms

TestPrep

35+ Pre-Assessment Questions | 35+ Post-Assessment Questions

Hands-On Labs

77+ LiveLabs | 13+ Video tutorials | 20+ Minutes

Here's what you will learn

Lesson 1: Preface

  • Who this course is for
  • What this course covers
  • To get the most out of this course
  • Conventions used

Lesson 2: Exploratory Data Analysis Fundamentals

  • Understanding data science
  • The significance of EDA
  • Making sense of data
  • Comparing EDA with classical and Bayesian analysis
  • Software tools available for EDA
  • Getting started with EDA
  • Summary
  • Further reading

Lesson 3: Visual Aids for EDA

  • Technical requirements
  • Line chart
  • Bar charts
  • Scatter plot
  • Area plot and stacked plot
  • Pie chart
  • Table chart
  • Polar chart
  • Histogram
  • Lollipop chart
  • Choosing the best chart
  • Other libraries to explore
  • Summary
  • Further reading

Lesson 4: Activity: EDA with Personal Email

  • Technical requirements
  • Loading the dataset
  • Data transformation
  • Data analysis
  • Summary
  • Further reading

Lesson 5: Data Transformation

  • Technical requirements
  • Background
  • Merging database-style dataframes
  • Transformation techniques
  • Benefits of data transformation
  • Summary
  • Further reading

Lesson 6: Descriptive Statistics

  • Technical requirements
  • Understanding statistics
  • Measures of central tendency
  • Measures of dispersion
  • Summary
  • Further reading

Lesson 7: Grouping Datasets

  • Technical requirements
  • Understanding groupby()
  • Groupby mechanics
  • Data aggregation
  • Pivot tables and cross-tabulations
  • Summary
  • Further reading

Lesson 8: Correlation

  • Technical requirements
  • Introducing correlation
  • Types of analysis
  • Discussing multivariate analysis using the Titanic dataset
  • Outlining Simpson's paradox
  • Correlation does not imply causation
  • Summary
  • Further reading

Lesson 9: Activity: Time Series Analysis

  • Technical requirements
  • Understanding the time series dataset
  • TSA with Open Power System Data
  • Summary
  • Further reading

Lesson 10: Hypothesis Testing and Regression

  • Hypothesis testing
  • p-hacking
  • Understanding regression
  • Model development and evaluation
  • Summary
  • Further reading

Lesson 11: Model Development and Evaluation

  • Technical requirements
  • Types of machine learning
  • Understanding supervised learning
  • Understanding unsupervised learning
  • Understanding reinforcement learning
  • Unified machine learning workflow
  • Summary
  • Further reading

Lesson 12: Activity: EDA on Wine Quality Data Analysis

  • Technical requirements
  • Disclosing the wine quality dataset
  • Analyzing red wine
  • Analyzing white wine
  • Model development and evaluation
  • Summary
  • Further reading

Appendix

  • String manipulation
  • Using pandas vectorized string functions
  • Using regular expressions
  • Further reading

Hands-On Lab Activities

Exploratory Data Analysis Fundamentals

  • Styling a Dataframe
  • Applying Function to a Dataframe
  • Slicing and Subsetting
  • Dividing NumPy Arrays
  • Inspecting NumPy Arrays
  • Defining NumPy Arrays
  • Selecting Rows
  • Reading Data from a CSV File
  • Creating a Dataframe

Visual Aids for EDA

  • Creating a Line Chart
  • Creating a Bar Chart
  • Creating a Scatter Plot
  • Creating a Bubble Chart
  • Creating an Area Plot
  • Creating a Pie Chart
  • Creating a Table Chart
  • Creating a Polar Chart
  • Adding the Best-Fit Line for the Normal Distribution
  • Creating a Histogram
  • Creating a Lollipop Chart

Activity: EDA with Personal Email

  • Performing EDA with Email Data
  • Extracting Email Using Regex
  • Converting a Field to datetime
  • Removing NaN Values
  • Dropping a Column

Data Transformation

  • Stacking a Dataframe
  • Concatenating Dataframes
  • Analyzing Dataframes
  • Combining Dataframes
  • Merging on Index
  • Permuting a Dataframe
  • Removing Duplicate Data
  • Replacing Values
  • Interpolating Missing Values
  • Backward and Forward Filling
  • Handling NaN Values
  • Counting Missing Values
  • Renaming Axis Indexes
  • Binning
  • Detecting Outliers

Descriptive Statistics

  • Generating a Binomial Distribution Plot
  • Generating an Exponential Distribution Plot
  • Generating a Normal Distribution Plot
  • Generating a Uniform Distribution Plot
  • Using Statistical Functions
  • Calculating Standard Deviation
  • Finding Skewness and Kurtosis
  • Creating a Box Plot
  • Calculating Inter-Quartile Range

Grouping Datasets

  • Finding Maximum Value for Each Group
  • Grouping a Dataset
  • Filtering Data
  • Applying Aggregation Functions
  • Creating a Pivot Table
  • Creating a Cross-Tabulation Table

Correlation

  • Calculating Correlation Coefficient

Activity: Time Series Analysis

  • Sampling the Data
  • Resampling the Data
  • Changing the Index of a Dataframe

Hypothesis Testing and Regression

  • Performing Z-Test
  • Calculating the P-Value
  • Performing T-Test
  • Scoring the Model
  • Understanding the Linear Regression Model

Model Development and Evaluation

  • Using TfidfVectorizer

Activity: EDA on Wine Quality Data Analysis

  • Plotting a Heatmap
  • Visualizing the Data in 3D Form

Appendix

  • Accessing Characters
  • String Slicing
  • Updating a String
  • Escape Sequencing
  • Formatting Strings
  • Displaying Last 10 Items from a Dataframe
  • Using String Functions with a Dataframe
  • Finding Words from a String
  • Counting Full Stops using Regex
  • Matching Characters