Simple Python Guide For Beginners

Introduction to the Pandas Library

This section introduces Pandas, a Python library for data manipulation and analysis built around labeled data structures. It covers the Pandas data structures, data handling and cleaning, and basic data analysis.

Pandas Data Structures

Pandas provides two primary data structures: Series and DataFrame.

Series

A Series is a one-dimensional labeled array that can hold any data type. It is similar to a column in a spreadsheet or a single column of a SQL table.

import pandas as pd
# Create a Series
s = pd.Series([3, 1, 5, 2, 4])
print(s)
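
The Series above gets a default integer index (0 through 4). To make the "labeled" part more concrete, here is a minimal sketch that passes explicit labels instead. The label names and the variable `scores` are made up for illustration.

```python
import pandas as pd

# Hypothetical labels chosen for illustration
scores = pd.Series([3, 1, 5, 2, 4], index=['a', 'b', 'c', 'd', 'e'])

# Access a single value by its label
print(scores['c'])

# Select several labels at once; the result is another Series
subset = scores[['a', 'e']]
print(subset)

# Arithmetic is applied element-wise and the labels are kept
doubled = scores * 2
print(doubled)
```

Labels make it possible to refer to values by name rather than by position, which is what distinguishes a Series from a plain list.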

DataFrame

A DataFrame is a two-dimensional labeled data structure with columns of potentially different data types. It is similar to a spreadsheet or a SQL table.

import pandas as pd
# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Sydney']}
df = pd.DataFrame(data)
print(df)
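
Because each column can have its own data type, it is often useful to inspect a DataFrame's structure right after creating it. A short sketch, reusing the same sample data (the variable `ages` is illustrative):

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Sydney']}
df = pd.DataFrame(data)

# Shape as a (rows, columns) tuple
print(df.shape)

# Data type of each column
print(df.dtypes)

# Selecting a single column returns a Series
ages = df['Age']
print(type(ages))
```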

Data Handling and Cleaning

Pandas provides numerous functions and methods for handling and cleaning data, such as selecting and filtering data, handling missing values, and transforming data.

Selecting and Filtering Data

You can select the rows of a DataFrame that satisfy a condition using boolean indexing, and select rows or columns by their labels using label-based indexing with `.loc`.

import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Sydney']}
df = pd.DataFrame(data)
# Select rows with Age greater than 30
filtered_df = df[df['Age'] > 30]
print(filtered_df)
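
The example above uses boolean indexing. Label-based indexing with `.loc`, and combining two conditions with `&`, can be sketched as follows. The variable names are illustrative.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Sydney']}
df = pd.DataFrame(data)

# Label-based selection: the row labeled 1, columns 'Name' and 'City'
row = df.loc[1, ['Name', 'City']]
print(row)

# .loc also accepts a boolean mask together with a column label
names_over_30 = df.loc[df['Age'] > 30, 'Name']
print(names_over_30)

# Combine conditions with & (and) or | (or); wrap each condition in parentheses
mask = (df['Age'] > 25) & (df['City'] != 'Paris')
print(df[mask])
```

The parentheses around each condition are required because `&` binds more tightly than comparison operators in Python.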

Handling Missing Values

Pandas provides functions to handle missing values, such as `isnull()`, `fillna()`, and `dropna()`. These functions allow you to identify, replace, or remove missing values in your data.

import pandas as pd
import numpy as np
data = {'Name': ['Alice', 'Bob', np.nan, 'David'],
        'Age': [25, 30, np.nan, 40],
        'City': ['New York', 'London', 'Paris', np.nan]}
df = pd.DataFrame(data)
# Check for missing values
print(df.isnull())
# Fill missing values with a specific value
# Note: this also places the string in the numeric 'Age' column
df_filled = df.fillna('Unknown')
print(df_filled)
# Drop rows with missing values
df_dropped = df.dropna()
print(df_dropped)
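
Filling every column with the same value is not always appropriate, since a string placeholder in a numeric column leaves that column with mixed types. One possible alternative, shown here as a sketch rather than a recommendation, is to count the missing values per column and then fill each column with its own replacement. Using the mean for `Age` is an assumption made for illustration.

```python
import pandas as pd
import numpy as np

data = {'Name': ['Alice', 'Bob', np.nan, 'David'],
        'Age': [25, 30, np.nan, 40],
        'City': ['New York', 'London', 'Paris', np.nan]}
df = pd.DataFrame(data)

# Count the missing values in each column
missing_counts = df.isnull().sum()
print(missing_counts)

# Fill each column with its own replacement value
# (the mean age is an illustrative choice, not a general rule)
df_filled = df.fillna({'Name': 'Unknown',
                       'Age': df['Age'].mean(),
                       'City': 'Unknown'})
print(df_filled)

# Drop a row only when 'Age' is missing
df_age_known = df.dropna(subset=['Age'])
print(df_age_known)
```

Passing a dictionary to `fillna()` keeps `Age` numeric, and the `subset` argument of `dropna()` avoids discarding rows that are missing only a less important field.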

Basic Data Analysis with Pandas

Pandas provides a wide range of functions for basic data analysis, such as descriptive statistics, grouping and aggregating data, and merging datasets.
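
The three operations named above can be sketched as follows. The sample data is adjusted so that one city appears twice, and the `countries` lookup table with its `Country` column is hypothetical, added only so there is something to merge.

```python
import pandas as pd

# Sample data; 'London' appears twice so that grouping is meaningful
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'City': ['New York', 'London', 'Paris', 'London']})

# Descriptive statistics (count, mean, std, min, quartiles, max) for 'Age'
print(df['Age'].describe())
mean_age = df['Age'].mean()

# Group rows by city and compute the average age in each group
age_by_city = df.groupby('City')['Age'].mean()
print(age_by_city)

# Hypothetical lookup table used to demonstrate merging
countries = pd.DataFrame({'City': ['New York', 'London', 'Paris'],
                          'Country': ['USA', 'UK', 'France']})

# Left merge: keep every row of df and add the matching 'Country'
merged = df.merge(countries, on='City', how='left')
print(merged)
```

A left merge is used here so that every row of `df` is kept even if a city were missing from the lookup table; other values of `how` (such as `'inner'`) would behave differently in that case.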

