Python For Data Science

A Python tutorial for (beginning) data scientists with a non-computer-science background.

Introduction

Let's take a look at the following snippet of Pandas code:
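The snippet itself is not reproduced in this text. A minimal stand-in of the same flavor might look like the sketch below; the data, the column names, and the thresholds are invented for illustration and are not taken from the tutorial's notebooks.

```python
import pandas as pd

# Hypothetical example data (not from the tutorial).
df = pd.DataFrame({
    "city": ["Utrecht", "Delft", "Leiden", "Gouda"],
    "population": [360_000, 104_000, 125_000, 74_000],
    "area_km2": [99.2, 24.1, 23.3, 18.1],
})

# loc[rows, columns]: the row part is a boolean mask built from df itself,
# the column part is a list of labels.
large = df.loc[df["population"] > 100_000, ["city", "population"]]

# A lambda passed to assign() or loc[] is called with the DataFrame at that
# point in the chain, so `d` below already contains the new "density" column.
dense = (
    df.assign(density=lambda d: d["population"] / d["area_km2"])
      .loc[lambda d: d["density"] > 4_200, "city"]
      .tolist()
)
```

Here `df` in the first expression refers to the original frame, while `d` inside the lambdas refers to whatever intermediate frame the chain has produced so far.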

If this makes you curious about:

how loc[] works
how it processes these expressions inside the square brackets
how these mystical lambda expressions work
what part of your input data is pointed to by these df references,

then this course is right for you!

Instead of offering a comprehensive guided tour of the Pandas or Scikit-Learn APIs, this tutorial aims to help you understand how these tools work by explaining the fundamental programming constructs they are built from. This will allow you to:

effectively learn parts of the APIs that are relevant to you without supervision
write correct code with fewer bugs by knowing what you're doing
write readable and testable code, so your colleagues know what you're doing
create your own tools

Prerequisites (Skills, Knowledge)

To fully enjoy this tutorial, you are expected to:

Be able to clone this repository using any git client
Have some programming experience
Have some conceptual understanding of basic data analysis such as filtering/selection, grouping by, aggregation, transformation
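As a rough reminder of what these data analysis terms mean in practice, here is a small sketch in Pandas. The `sales` table and its columns are made up for this example.

```python
import pandas as pd

# Hypothetical example data.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "units": [10, 30, 20, 40],
})

# Filtering/selection: keep only the rows that satisfy a condition.
big_orders = sales[sales["units"] >= 20]

# Grouping by + aggregation: one summary value per group.
units_per_region = sales.groupby("region")["units"].sum()

# Transformation: one new value per row, here each row's share
# of the total units in its own region.
region_totals = sales.groupby("region")["units"].transform("sum")
sales["share"] = sales["units"] / region_totals
```

If these four operations look familiar as concepts, even if the syntax does not, you have the background this tutorial assumes.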

While the tutorial starts with an introduction to the most basic Python concepts, the pace will be too fast for people with no coding experience whatsoever.

Prerequisites (Technical)

To run this tutorial, you need at least:

Any computing environment with Python >= 3.7 installed and available
Recent Pandas (>= 1.0) and NumPy packages available globally or in a virtual environment
JupyterLab (or the classic Jupyter Notebook) installed globally or in a virtual environment
Any git client that allows you to clone this repository
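One way to check the Python-side requirements above is a short script like the following sketch. The helper names `version_tuple` and `missing_prerequisites` are hypothetical, the version parsing is deliberately rough, and the importable module name `jupyterlab` is an assumption (the classic Notebook installs as `notebook` instead).

```python
import importlib
import importlib.util
import sys


def version_tuple(text):
    """Rough parse: turn '1.5.3' into (1, 5); non-numeric parts are skipped."""
    return tuple(int(part) for part in text.split(".")[:2] if part.isdigit())


def missing_prerequisites(packages=("pandas", "numpy", "jupyterlab")):
    """Return a list of requirements that do not appear to be met."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python >= 3.7")
    for name in packages:
        # find_spec returns None when the package cannot be imported.
        if importlib.util.find_spec(name) is None:
            problems.append(name)
    if "pandas" in packages and "pandas" not in problems:
        pandas = importlib.import_module("pandas")
        if version_tuple(pandas.__version__) < (1, 0):
            problems.append("pandas >= 1.0")
    return problems
```

Running `print(missing_prerequisites() or "all prerequisites found")` inside the environment you plan to use for the tutorial lists anything that still needs installing.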

If you have trouble meeting these prerequisites, this article at Real Python can be helpful. I suggest using either a pyenv+pipenv-based environment or a conda-based environment, and not mixing the two approaches.

To use this tutorial, clone this repository into a directory of your choice and run jupyter lab or jupyter notebook from that directory.

Repository contents

appendices
data
images
solutions
.gitignore
00_introduction.ipynb
01_python_basics.ipynb
02_object_oriented_programming.ipynb
03_functional_programming.ipynb
04_best_practices.ipynb
LICENSE
Pipfile
Pipfile.lock
README.md
requirements.txt

jsamoocha/python-for-datascience
