liapsps/data-science-python-libraries-guide

Data Science Libraries in Python

This repository is written in English to reach a wider audience.

Welcome to the Data Science Libraries in Python repository! This project is a didactic resource for exploring and understanding the essential Python libraries used in Data Science. Here, you'll find useful information to help you master these tools.

Objectives

  • Learn the purpose of each library: Understand what each library is used for and its importance in the data science workflow.
  • Practice with code examples: Explore clear, didactic examples using fictitious datasets.
  • Access curated resources: Find links to documentation, books, and courses for deeper learning.

Repository Structure

The repository is organized into folders by theme to make navigation intuitive:

├── README.md
└── Data Science Workflow/
    ├── Data Manipulation/
    │   ├── numpy_basics.ipynb
    │   └── pandas_data_cleaning.ipynb
    ├── Data Visualization/
    │   ├── matplotlib_basics.ipynb
    │   ├── seaborn_heatmaps.ipynb
    │   └── plotly_interactive_charts.ipynb
    └── Machine Learning/
        ├── sklearn_regression.ipynb
        ├── sklearn_classification.ipynb
        └── xgboost_basics.ipynb
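
If you clone the repository, a short script can list the notebooks by theme. The sketch below is only an illustration and is not part of the repository: the helper name `list_notebooks` is hypothetical, and it assumes the layout shown above (theme folders directly under `Data Science Workflow/`, each holding `.ipynb` files).

```python
from pathlib import Path


def list_notebooks(root):
    """Map each theme folder under `root` to its sorted notebook file names.

    Hypothetical helper: assumes notebooks sit exactly one level below `root`,
    as in the tree shown in this README.
    """
    themes = {}
    # "*/*.ipynb" matches <theme folder>/<notebook>.ipynb and nothing deeper.
    for notebook in sorted(Path(root).glob("*/*.ipynb")):
        themes.setdefault(notebook.parent.name, []).append(notebook.name)
    return themes
```

Calling `list_notebooks("Data Science Workflow")` from the repository root should return a dictionary keyed by theme folder name. Folders without notebooks are simply absent from the result.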

Each folder contains Jupyter notebooks that:

  • Introduce the library: Highlight its main features and applications.
  • Provide code examples: Demonstrate common tasks and workflows.
  • Include comments: Explain each step of the code for better understanding.

Themes Covered

  1. Data Manipulation

    • numpy: Numerical computing with multi-dimensional arrays.
    • pandas: Data manipulation and analysis with DataFrames.
  2. Data Visualization

    • matplotlib: Creating static, animated, and interactive plots.
    • seaborn: Statistical data visualization built on Matplotlib.
    • plotly: Interactive visualizations and dashboards.
  3. Machine Learning

    • scikit-learn: Essential tools for machine learning (classification, regression, clustering, etc.).
    • xgboost: Gradient boosting for structured data.
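
As a small taste of the Data Manipulation theme, the sketch below combines pandas and NumPy on a fictitious dataset. The city names, temperature values, and variable names are invented for illustration and are not taken from the notebooks. Filling the gap with the column mean is just one simple cleaning strategy among many.

```python
import numpy as np
import pandas as pd

# Fictitious dataset (invented for this example) with one missing reading.
raw = pd.DataFrame(
    {
        "city": ["Lisbon", "Porto", "Lisbon", "Faro", "Porto"],
        "temperature_c": [21.5, np.nan, 23.0, 25.5, 19.0],
    }
)

# pandas: fill the missing value with the column mean, then average per city.
mean_temp = raw["temperature_c"].mean()  # NaN values are skipped by default
clean = raw.fillna({"temperature_c": mean_temp})
per_city = clean.groupby("city")["temperature_c"].mean()

# NumPy: vectorised arithmetic on the underlying array (Celsius to Fahrenheit).
fahrenheit = clean["temperature_c"].to_numpy() * 9 / 5 + 32

print(per_city)
print(fahrenheit.round(1))
```

The visualization and machine learning libraries listed above typically take a cleaned DataFrame or array like `clean` or `fahrenheit` as their input.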

Resources


Roadmap

This is a dynamic project with ongoing updates. Here's the plan:

  1. Initial Setup

    • ✅ Create folders and templates for each theme.
    • 🔄 Add basic examples for NumPy and Pandas. (In Progress ⬅️)
  2. Expand Visualization Examples

    • Add advanced plots in Seaborn.
    • Create interactive dashboards with Plotly.
  3. Machine Learning Use Cases

    • Include real-world scenarios for regression and classification.
    • Add examples with XGBoost.
  4. Polishing and Documentation

    • Refine code comments.
    • Add Markdown explanations for workflows.

Happy learning! 🚀
