Wednesday 7 August 2024

Complete Data Science Bootcamp : Step By Step Hands-On Labs



Data science involves the use of techniques from statistics, computer science, and domain-specific knowledge to analyze and interpret data in order to make predictions, discover patterns, and gain insights. It is used in a wide range of industries, including healthcare, finance, marketing, and manufacturing, to make data-driven decisions and improve business outcomes.

This blog post supports both self-paced learning and team learning. The course includes many hands-on labs.
Here’s a quick sneak peek at how to start learning data science as a beginner through hands-on practice.

Module 1: Python for Data Science

1) Environment Setup: Install Jupyter Notebooks

There are two ways to install Jupyter Notebook.

1. Using the pip command
We can use pip to install Jupyter Notebook using the following command:

$ pip install jupyter

2. Anaconda
We can also use Anaconda, which is a Python data science distribution. Anaconda comes with its own package manager, conda, which we can use to install Jupyter Notebook.

2) Try Jupyter Notebook: Hello World!

We can print output in a Jupyter Notebook with Python's print() function, for example print("Hello World!").

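The original screenshot is not available, so here is a minimal sketch of what a first notebook cell could look like. The variable name message is only chosen for illustration.

```python
# A first notebook cell: store a greeting and print it.
message = "Hello World!"
print(message)
```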

3) Working with Variables

Python has no command for declaring a variable. A variable is created the moment you first assign a value to it.

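As a rough illustration of creating variables by assignment, the sketch below uses made-up names and values (x, name); they are assumptions, not taken from the original lab.

```python
# Variables are created on first assignment; no declaration keyword is needed.
x = 5           # an integer
name = "John"   # a string

# The same name can later be bound to a value of a different type.
x = "five"

print(x)
print(name)
```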

Python supports the usual logical conditions from mathematics:

  • Equals: a == b
  • Not Equals: a != b
  • Less than: a < b
  • Less than or equal to: a <= b
  • Greater than: a > b
  • Greater than or equal to: a >= b

These conditions can be used in several ways, most commonly in “if statements” and loops.
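To see what the conditions listed above evaluate to, here is a small sketch. It assumes a = 33 and b = 200, and the dictionary name comparisons is only for illustration.

```python
a = 33
b = 200

# Each comparison evaluates to a boolean (True or False).
comparisons = {
    "a == b": a == b,
    "a != b": a != b,
    "a < b": a < b,
    "a <= b": a <= b,
    "a > b": a > b,
    "a >= b": a >= b,
}

for expression, value in comparisons.items():
    print(expression, "->", value)
```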

4) Understand the if statement

An “if statement” is written by using the if keyword.

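The screenshot for this step is missing; the sketch below is a reconstruction from the surrounding description (a is 33, b is 200), not the original code.

```python
a = 33
b = 200

# The indented block runs only when the condition is true.
if b > a:
    print("b is greater than a")
```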

In this example, we use two variables, a and b, as part of the if statement to test whether b is greater than a. As a is 33 and b is 200, we know that 200 is greater than 33, and so we print “b is greater than a” to the screen.

5) Understand the for loop statement

A for loop is used for iterating over an iterable such as a list, a tuple, a dictionary, a set, or a string. It is less like the for keyword in other programming languages and works more like an iterator method found in other object-oriented programming languages. With the for loop we can execute a set of statements once for each item in a list, tuple, set, etc.
Print each fruit in a fruit list:

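A minimal sketch of looping over a fruit list could look like this. The fruit names are assumptions chosen for illustration.

```python
fruits = ["apple", "banana", "cherry"]

# The loop variable takes each item of the list in turn.
for fruit in fruits:
    print(fruit)
```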

The for loop does not require an indexing variable to be set beforehand.

6) Understand the while loop statement

With the while loop we can execute a set of statements as long as a condition is true.

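Here is a small sketch of a while loop that starts with i set to 1. The upper bound of 6 is an assumption for illustration.

```python
i = 1

# Print i as long as it is less than 6.
while i < 6:
    print(i)
    i += 1  # without this increment the loop would never end
```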

Note: Remember to increment i, or else the loop will continue forever.
The while loop requires its relevant variables to be ready beforehand. In this example, we need to define an indexing variable, i, which we set to 1.

Module 2: Operators and Keywords

1) Create & Work with Lists

Lists are one of 4 built-in data types in Python used to store collections of data. Lists are used to store multiple items in a single variable.
Lists are created using square brackets []:

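As an illustration, the sketch below creates a list and then reads, changes, and extends it. The item values are assumptions.

```python
# A list keeps its order and may contain duplicates.
fruits = ["apple", "banana", "cherry", "apple"]

print(fruits[0])          # access the first item by index

fruits[1] = "blueberry"   # lists are changeable
fruits.append("orange")   # add an item at the end

print(fruits)
print(len(fruits))
```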

List items are ordered, changeable, and allow duplicate values. List items are indexed: the first item has index [0], the second item has index [1], and so on.

2) Working with Tuples

Tuples are used to store multiple items in a single variable. A tuple is a collection that is ordered and unchangeable.
Tuples are written with round brackets ().

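A short sketch of a tuple, with illustrative values, shows that items can be read by index but not reassigned.

```python
fruits = ("apple", "banana", "cherry", "apple")

print(fruits[1])               # tuples are ordered and indexed
print(fruits.count("apple"))   # duplicates are allowed

# Tuples are unchangeable: assigning to an item raises a TypeError.
try:
    fruits[0] = "kiwi"
except TypeError as error:
    print("Cannot modify a tuple:", error)
```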

Tuple items allow duplicate values.

3) Sets & Exercises

Sets are used to store multiple items in a single variable. A set is a collection that is both unordered and unindexed.
Sets are written with curly brackets {}.

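The following sketch, using assumed values, shows how a set drops duplicates and how items are added or removed.

```python
# The duplicate "apple" is stored only once.
fruits = {"apple", "banana", "cherry", "apple"}

print(len(fruits))
print("banana" in fruits)   # membership test

fruits.add("orange")        # add a new item
fruits.discard("apple")     # remove an item if it is present

print(sorted(fruits))       # sorted() gives a predictable order for display
```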

Set items are unordered and do not allow duplicate values. An existing item cannot be changed in place, but items can be added to or removed from a set.

4) Create & Understand Dictionaries

Dictionaries are used to store data values in key: value pairs. A dictionary is a collection that is ordered (it preserves insertion order as of Python 3.7), changeable, and does not allow duplicate keys.
Dictionaries are written with curly brackets, and have keys and values:

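A minimal dictionary sketch might look like this. The keys and values are assumptions for illustration.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

print(car["brand"])    # look up a value by its key

car["year"] = 2020     # change an existing value
car["color"] = "red"   # add a new key: value pair

print(car)
```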

Dictionary items are presented in key: value pairs, and can be referred to by using the key name.

Module 3: NumPy & Pandas

1) Create & work with NumPy Arrays

NumPy is a Python library used for working with arrays. The array object in NumPy is called ndarray, and we can create one by using the array() function.

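As a sketch, assuming NumPy is installed and using made-up values, creating arrays with array() could look like this:

```python
import numpy as np

# Create a one-dimensional ndarray from a Python list.
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))

# A nested list produces a two-dimensional array.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape)

# Arithmetic is applied element by element.
doubled = arr * 2
print(doubled)
```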

2) Create Pandas Dataframe

A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array, or a table with rows and columns.
Create a simple Pandas DataFrame:

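Here is one possible sketch, assuming Pandas is installed. The column names and values are invented for illustration.

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}

df = pd.DataFrame(data)
print(df)
```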

3) Pandas DataFrame: Load CSV Files

A simple way to store big data sets is to use CSV files (comma-separated values files). CSV files contain plain text and are a well-known format that most tools, including Pandas, can read. In our examples, we will be using a CSV file called ‘data.csv’.

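The contents of data.csv are not shown in this post, so the sketch below uses a small in-memory stand-in with assumed columns so that it runs anywhere. With a real file you would pass its path, as noted in the comment.

```python
import io

import pandas as pd

# Stand-in for the contents of 'data.csv' (columns and values are assumptions).
csv_text = (
    "Duration,Pulse,Calories\n"
    "60,110,409.1\n"
    "60,117,479.0\n"
    "45,109,282.4\n"
)

# With a file on disk this would be: df = pd.read_csv("data.csv")
df = pd.read_csv(io.StringIO(csv_text))

print(df.to_string())
```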

Tip: Use to_string() to print the entire DataFrame.

Module 4: Functions, Classes & OOP

1) Working with User-defined Methods

A function is a block of code that only runs when it is called. You can pass data, known as parameters, into a function. A function can return data as a result.
In Python a function is defined using the def keyword:

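A minimal sketch of the def keyword follows. The function names my_function and square are hypothetical examples.

```python
# A function without parameters.
def my_function():
    print("Hello from a function")


# A function that takes a parameter and returns a result.
def square(number):
    return number * number


my_function()
result = square(4)
print(result)
```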

2) Working with Inbuilt Methods

Inbuilt functions are functions that are already predefined. You just call the function without having to write it yourself. Python has many predefined functions; here we pick two of them to illustrate the idea.

  • abs(): Returns the absolute value of the given number. For a complex number, it returns the magnitude.


  • chr(): Returns the character that corresponds to a given integer code, such as an ASCII value (more generally, a Unicode code point).

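The two built-in functions above can be tried out with a short sketch like this one. The input values are chosen only for illustration.

```python
# abs() returns the absolute value of a number.
distance = abs(-7.25)
print(distance)

# For a complex number, abs() returns its magnitude.
magnitude = abs(3 + 4j)
print(magnitude)

# chr() maps an integer code to its character.
letter = chr(97)
print(letter)
```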

There are many more built-in functions.

3) Implementing User-defined Functions (Create, Call)

User-defined functions are functions that you write yourself to organize the code in your program. Once you define a function, you can call it in the same way as the built-in functions.

 

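As a sketch of creating and then calling a user-defined function, consider the hypothetical rectangle_area function below. The name and arguments are assumptions.

```python
def rectangle_area(width, height):
    """Return the area of a rectangle."""
    return width * height


# Call the function with positional arguments...
area = rectangle_area(3, 4)

# ...or with keyword arguments, which gives the same result.
same_area = rectangle_area(width=3, height=4)

print(area, same_area)
```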

To call a function, use the function name followed by parentheses.

4) Implementing Inbuilt Functions

Here we will look at one of the important inbuilt functions that we will use frequently.

The min() function returns the lowest of its arguments, or the item with the lowest value in an iterable. If the values are strings, an alphabetical comparison is done.
Return the item in a tuple with the lowest value:

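A small sketch of min(), using assumed values, covers a tuple, separate arguments, and strings.

```python
# Lowest item in a tuple.
numbers = (15, 4, 23, 8)
lowest = min(numbers)
print(lowest)

# Lowest of several separate arguments.
lowest_of_args = min(10, 3, 7)
print(lowest_of_args)

# Strings are compared alphabetically.
first_word = min("pear", "apple", "mango")
print(first_word)
```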

5) Create Classes & Objects in Python

A class is like an object constructor, or a “blueprint” for creating objects. To create a class, use the keyword class.
Create a class named MyClass, with a property named x:

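One way this class could be written is sketched below. The value 5 for x is an assumption.

```python
class MyClass:
    # x is a class attribute shared by all objects of MyClass.
    x = 5


print(MyClass.x)
```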

Now we can use the class named MyClass to create objects.
Create an object named p1, and print the value of x:

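A sketch of creating the object, with the class repeated so the snippet runs on its own (x = 5 is an assumed value):

```python
class MyClass:
    x = 5


# Calling the class creates a new object (instance).
p1 = MyClass()
print(p1.x)
```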

6) Understand the Inheritance Concept

Inheritance allows us to define a class that inherits all the methods and properties from another class. The parent class is the class being inherited from, also called base class. A child class is a class that inherits from another class, also called a derived class. Any class can be a parent class, so the syntax is the same as creating any other class.
Create a class named Person, with first name and last name properties, and a print name method:

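The Person class could be sketched as follows. The attribute names firstname and lastname, the method name printname, and the sample values are assumptions.

```python
class Person:
    def __init__(self, fname, lname):
        # Store the names on the new object.
        self.firstname = fname
        self.lastname = lname

    def printname(self):
        print(self.firstname, self.lastname)


x = Person("John", "Doe")
x.printname()
```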

To create a class that inherits the functionality from another class, send the parent class as a parameter when creating the child class.
Create a class named Student, which will inherit the properties and methods from the Person class:

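A minimal sketch of the child class follows. Person is repeated here, under the same assumed names, so that the snippet runs on its own.

```python
class Person:
    def __init__(self, fname, lname):
        self.firstname = fname
        self.lastname = lname

    def printname(self):
        print(self.firstname, self.lastname)


# The parent class goes in parentheses after the child class name.
class Student(Person):
    pass


print(issubclass(Student, Person))
```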

Note: Use the pass keyword when you do not want to add any other properties or methods to the class.

Now the Student class has the same properties and methods as the Person class.
Use the Student class to create an object, and then execute the print name method:

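Putting it together, here is a self-contained sketch. The names and the sample student are assumptions.

```python
class Person:
    def __init__(self, fname, lname):
        self.firstname = fname
        self.lastname = lname

    def printname(self):
        print(self.firstname, self.lastname)


class Student(Person):
    pass


# Student reuses the __init__ and printname methods defined in Person.
student = Student("Mike", "Olsen")
student.printname()
```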

Module 5: Essential Data Science Libraries

There are many libraries available for data science, but some of the most essential libraries include:

  1. NumPy: This library provides support for large, multi-dimensional arrays and matrices of numerical data, as well as a large collection of mathematical functions to operate on these arrays.
  2. Pandas: This library provides fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It is a must-have for data cleaning, transformation, and manipulation.
  3. Matplotlib: This library is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK.
  4. Scikit-learn: This library is a machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, and k-means.
  5. TensorFlow or PyTorch: These are open-source machine learning libraries that allow you to build and train neural networks. TensorFlow is developed by Google, and PyTorch was originally developed by Facebook (now Meta).
  6. Seaborn: This library is a data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
  7. Plotly: This charting library comes with over 40 chart types, including 3D charts, statistical graphs, and SVG maps. Its chart types include:

    • Scatter Plots
    • Line Graphs
    • Linear Graphs
    • Multiple Lines
    • Bar Charts
    • Horizontal Bar Charts
    • Pie Charts
    • Donut Charts
    • Plotting Equations

These libraries are widely used in the data science community and provide a solid foundation for many data science tasks. However, depending on the specific problem or task, other libraries may also be useful, such as NLTK for natural language processing or StatsModels for statistical modeling.
