Instructor-led, 3 days
Course Description
This 3-day Data Science course introduces professionals and aspiring analysts to the essential tools and techniques used in modern data workflows. Participants will learn the complete data science process, from data collection and cleaning to exploration, modeling, and communication. Using Python, Jupyter notebooks, and visualization tools like Tableau, attendees will gain hands-on experience working with real-world datasets.
Key Takeaways
- Gain insight into the data science lifecycle and how to solve data-driven problems
- Work directly with Python and leading data science libraries
- Learn techniques for data wrangling, transformation, and preparation
- Create compelling visualizations using Tableau, Matplotlib, and Seaborn
- Build and evaluate basic machine learning models
- Communicate results with clear visual output and structured reports
Prerequisites
- Basic knowledge of statistics and mathematics
- Prior exposure to programming concepts, preferably in Python
- General interest in data analysis and practical problem solving
Module 1: Foundations of Data Science and Python
- Introduction to Data Science: Lifecycle and Applications
- Setting Up the Environment: Jupyter, Anaconda, IDEs
- Python Essentials: Data types, loops, functions, and control flow
- Working with Data: NumPy and Pandas fundamentals
- Loading and Cleaning Datasets
- Hands-On Lab: Read, inspect, and clean real-world datasets
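As a rough illustration of the read, inspect, and clean steps in this module, the sketch below uses Pandas on a small inline CSV. The dataset, its column names (`city`, `temp_c`, `humidity`), and the choice to fill missing values with the median are assumptions made for this example, not the course's actual lab material.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real-world file on disk.
raw_csv = io.StringIO(
    "city,temp_c,humidity\n"
    "Oslo,4.5,80\n"
    "Oslo,4.5,80\n"      # exact duplicate row
    "Lima,,75\n"         # missing temperature
    " Cairo ,30.1,20\n"  # stray whitespace around the city name
)

# Read: load the CSV into a DataFrame.
df = pd.read_csv(raw_csv)

# Inspect: size, column types, and missing values per column.
print(df.shape)
print(df.dtypes)
print(df.isna().sum())

# Clean: trim whitespace, drop duplicate rows, fill missing temperatures.
df["city"] = df["city"].str.strip()
df = df.drop_duplicates()
df["temp_c"] = df["temp_c"].fillna(df["temp_c"].median())
df = df.reset_index(drop=True)

print(df)
```

Filling with the median is only one option; dropping incomplete rows with `dropna()` may be more appropriate when missing values are rare.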
Module 2: Exploratory Data Analysis and Visualization with Tableau and Python
- Introduction to Exploratory Data Analysis (EDA)
- Summary Statistics and Pattern Detection
- Data Visualization with Tableau, Matplotlib, and Seaborn
- Identifying Correlations, Outliers, and Distributions
- Data Transformation Techniques
- Hands-On Lab: Perform EDA and create visualizations using Tableau and Python
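The EDA topics above (summary statistics, correlations, outliers) might look like the following in Pandas. This is a minimal sketch on made-up data: the column names `hours_studied` and `exam_score` are hypothetical, and the 1.5 × IQR rule is one common outlier convention rather than the only valid one.

```python
import pandas as pd

# Hypothetical sample data; the last score is deliberately unusual.
df = pd.DataFrame(
    {
        "hours_studied": [1, 2, 3, 4, 5, 6, 7, 8],
        "exam_score": [52, 55, 61, 64, 70, 73, 79, 20],
    }
)

# Summary statistics: count, mean, std, min, quartiles, max.
summary = df.describe()
print(summary)

# Pearson correlation between the two columns.
corr = df["hours_studied"].corr(df["exam_score"])
print(f"correlation: {corr:.3f}")

# Outlier detection with the 1.5 * IQR rule.
q1 = df["exam_score"].quantile(0.25)
q3 = df["exam_score"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["exam_score"] < lower) | (df["exam_score"] > upper)]
print(outliers)
```

A single extreme value can noticeably weaken a correlation, which is one reason to check for outliers before interpreting it.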
Module 3: Introduction to Machine Learning
- Overview of Supervised and Unsupervised Learning
- Machine Learning Workflow: Training, Testing, Evaluating
- Building Models with Scikit-learn (e.g., Linear Regression, Decision Trees, KNN)
- Evaluation Metrics: Accuracy, Precision, Recall, F1 Score
- Overfitting and Cross-validation Techniques
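To make the evaluation metrics listed above concrete, the sketch below computes accuracy, precision, recall, and F1 score by hand for a binary classifier. The labels and predictions are invented for illustration. In practice Scikit-learn provides equivalent functions, and these four metrics apply to classification models rather than regression models.

```python
# Hypothetical true labels and model predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]

pairs = list(zip(y_true, y_pred))

# Confusion-matrix counts.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct
precision = tp / (tp + fp)          # share of predicted positives that are correct
recall = tp / (tp + fn)             # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```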