4. Data Visualization using Matplotlib and Seaborn

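Seaborn is built on Matplotlib and provides a higher-level interface with attractive default styles. Its plotting functions accept Pandas DataFrames directly, so a polished chart usually takes less code than the equivalent raw Matplotlib calls. The examples in this section assume the usual aliases: plt for matplotlib.pyplot, sns for seaborn, np for numpy, and df for a DataFrame with the columns clicks, group, converted, and age.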
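To illustrate what working directly with a DataFrame looks like, here is a minimal sketch. The data values are made up for illustration, and the column names age, clicks, and group are assumptions borrowed from the examples in this section. Passing the frame through `data=` lets Seaborn look up columns by name and label the axes from them.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive windows
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical A/B test data; values and column names are assumptions.
df = pd.DataFrame({
    "age": [21, 25, 34, 42, 29, 51],
    "clicks": [7, 5, 4, 2, 6, 1],
    "group": ["A", "B", "A", "B", "A", "B"],
})

fig, ax = plt.subplots(figsize=(8, 5))
# Columns are referenced by name; Seaborn uses those names as axis labels
# and colors the points by the values in the "group" column.
sns.scatterplot(data=df, x="age", y="clicks", hue="group", ax=ax)
ax.set_title("Clicks vs Age")
```

In a notebook or interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the figure.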

| Graph Type | What It Does | Syntax Example |
| --- | --- | --- |
| Set Figure Size | Defines the plot size as (width, height) in inches. | `plt.figure(figsize=(8, 5))` |
| Histogram (User Click Distribution) | Shows how often different click counts occur among users. | `sns.histplot(df["clicks"], bins=10, kde=True, color="skyblue")` |
| Box Plot (Click Outliers & Spread) | Shows the median, quartiles, whisker range, and outliers of user clicks. | `sns.boxplot(x=df["clicks"], color="orange")` |
| Bar Chart (A/B Group Click Comparison) | Compares average clicks between Group A and Group B. | `sns.barplot(x=df["group"], y=df["clicks"], estimator=np.mean, palette="viridis")` |
| Pie Chart (Conversion Rate Breakdown) | Shows the percentage of users who converted vs. didn't convert. The counts are sorted by value so they line up with the label order. | `plt.pie(df["converted"].value_counts().sort_index(), labels=["Not Converted", "Converted"], autopct="%1.1f%%", colors=["red", "green"])` |
| Scatter Plot (Clicks vs Age) | Shows whether clicks vary with user age, with points colored by group. | `sns.scatterplot(x=df["age"], y=df["clicks"], hue=df["group"], palette="coolwarm")` |
| X-Axis Label | Adds a label for the X-axis (data categories). | `plt.xlabel("Number of Clicks")` |
| Y-Axis Label | Adds a label for the Y-axis (frequency or count). | `plt.ylabel("Frequency")` |
| Title for the Graph | Adds a title for better understanding. | `plt.title("User Engagement Distribution")` |
| Adjust Y-Axis Ticks to Whole Numbers | Places one tick at every whole number from 0 to len(df), so frequencies are shown as whole numbers. Practical only for small datasets. | `plt.yticks(range(0, len(df)+1, 1))` |
| Show the Final Graph | Displays the plot after all modifications; call it last. | `plt.show()` |

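Histogram template (replace data, number_of_bins, and color_name with your own values):

import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(8, 5))
sns.histplot(data, bins=number_of_bins, kde=True, color="color_name")
plt.title("Title of the Histogram")
plt.xlabel("X-axis Label (e.g., Age, Clicks, Sales)")
plt.ylabel("Frequency")
plt.show()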
