
linux day #6

2025-12-28 16:19:51

Full Ubuntu Version Upgrade (Release Upgrade)

1. What Is a “Full Software Upgrade”?

So far, we have used commands like:

  • apt update
  • apt upgrade
  • apt dist-upgrade
  • apt full-upgrade

These commands upgrade packages only, but they do NOT upgrade the Ubuntu OS version itself.

Ubuntu releases a new OS version every 6 months.
A release upgrade means upgrading from one Ubuntu version to another (for example: 22.04 → 22.10 → 23.04).

2. Important Difference: Package Upgrade vs Release Upgrade

| Type | What it upgrades | Ubuntu version changes? |
| --- | --- | --- |
| apt upgrade | Installed packages | ❌ No |
| apt dist-upgrade | Packages + dependencies | ❌ No |
| do-release-upgrade | Entire OS version | ✅ Yes |

3. Critical Things to Check Before Upgrading

A release upgrade is risky, especially on servers. Before upgrading, always check the following.

3.1 Full Backup (Mandatory)

  • Always have a verified backup
  • Backup must be accessible even if the system fails to boot
  • For cloud servers: snapshot + off-server backup

If the system becomes unbootable, the backup is your only recovery.

3.2 Disk Space

You need several GB of free space.

Check disk usage:

df -h

Example:

  • 20% used
  • 80% free → safe for upgrade

3.3 Time for Troubleshooting

  • Expect problems in ~20% of upgrades
  • Always plan several hours for fixing issues
  • Never upgrade a critical system without a downtime window

3.4 Wait After Release (Best Practice)

  • Wait 1–2 weeks after a new Ubuntu release
  • Early bugs get fixed quickly
  • Ubuntu may even delay server upgrades until a stable point release

3.5 Third-Party Repositories

  • Check if all external repos support the new Ubuntu version
  • Unsupported repos cause:

    • dependency conflicts
    • broken packages
    • failed upgrades

This is a major risk factor.

3.6 Bootable Recovery Media (Desktop Systems)

  • Prepare a bootable Ubuntu USB
  • Make sure BIOS allows USB boot
  • Know disk encryption passwords

This allows you to recover data if the OS fails.

4. LTS vs Non-LTS (Very Important)

Check your current version:

lsb_release -a

Example:

Ubuntu 22.04 LTS

What does LTS mean?

  • Long Term Support
  • 5 years of security updates
  • Recommended for:

    • servers
    • production systems

Non-LTS versions:

  • Supported for 9 months only
  • Require frequent upgrades
  • Not recommended for servers

⚠️ Upgrading from LTS → non-LTS means losing long-term support

5. When Should You Upgrade?

| System Type | Recommendation |
| --- | --- |
| Production server | Stay on LTS |
| Business-critical system | Stay on LTS |
| Workstation / testing | Optional |
| Learning / demo | Fine |

Always ask:

“What problem does this upgrade solve for me?”

6. Upgrade Preparation Steps

Step 1: Fully Update Current System

sudo apt update
sudo apt full-upgrade

This ensures:

  • latest bug fixes
  • clean dependency state

Step 2: Reboot (If Kernel Updated)

sudo reboot

Ensures new kernel is active.

Step 3: Install Upgrade Tool

sudo apt install update-manager-core

This provides:

  • do-release-upgrade

7. Running the Release Upgrade

Step 1: Start Upgrade

sudo do-release-upgrade

If no new LTS is available, you may see:

No new release found
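
Tip: you can check whether a new release is available without actually starting the upgrade, using the check-only flag:

sudo do-release-upgrade -c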

Step 2: Allow Non-LTS Upgrades (If Needed)

Edit:

sudo nano /etc/update-manager/release-upgrades

Change:

Prompt=lts

to:

Prompt=normal

Save and exit.
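
If you prefer a non-interactive one-liner (for example in a provisioning script), the same edit can be made with sed — a minimal sketch:

sudo sed -i 's/^Prompt=lts/Prompt=normal/' /etc/update-manager/release-upgrades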

Step 3: Run Upgrade Again

sudo do-release-upgrade

Ubuntu will:

  • check system
  • detect SSH connection
  • warn about risks
  • download ~1–2 GB
  • ask configuration questions

8. During the Upgrade

Configuration Prompts

Examples:

  • Keyboard layout
  • Console character set

Default choices are usually safe.

Obsolete Packages

You may be asked:

Remove obsolete packages?
  • Usually safe
  • Review list if system is critical

Configuration File Conflicts

Example:

Configuration file '/etc/crontab' has been modified

Options:

  • Install maintainer version
  • Keep your version
  • View differences

Use D to inspect differences before deciding.

9. Kernel Errors (High Risk)

Kernel issues are critical because:

  • kernel loads first at boot
  • failure = system won’t start

Causes:

  • unusual CPU architecture (ARM)
  • custom kernels
  • incompatible drivers

10. Final Reboot

sudo reboot

After reboot:

lsb_release -a

If successful, version is upgraded.

11. Real-World Outcome (Important Lesson)

In this case:

  • Kernel upgrade failed
  • System became unbootable

This is realistic and valuable:

  • Upgrades can fail
  • Backups matter
  • Recovery skills are required

Troubleshooting an Unbootable Ubuntu System (Real Incident Walkthrough)

1. Why This Failure Is a Good Thing

This is actually a perfect real-world example.

In real DevOps work:

  • Systems do break
  • Upgrades do fail
  • You rarely get a clean, predictable error

This is far more valuable than a “happy-path” demo.

2. Initial Situation: System Does Not Boot

Symptoms:

  • System powers on
  • Boot messages appear
  • Kernel starts loading
  • System hangs and never completes boot

This tells us:

  • Hardware is OK
  • Bootloader likely works
  • Failure happens during kernel boot

3. First Rule of Incident Response: Stay Calm

Before touching anything:

  • Accept the system is down
  • Inform stakeholders if needed
  • Stop rushing
  • Think logically

Stress causes bad decisions.
Calm fixes systems.

4. Isolating the Failure: Bootloader vs Kernel

Observations:

  • Bootloader menu appears
  • “Booting Linux kernel” message appears
  • No userspace logs appear

Conclusion:
👉 Kernel is failing, not the bootloader.

5. Best-Case Scenario: GRUB Menu Available

Because GRUB was enabled earlier, we could:

  1. Open Advanced options for Ubuntu
  2. Select an older kernel
  3. Boot successfully

Result:

  • System boots
  • Login works
  • Problem is confirmed: new kernel is broken

This immediately isolates the issue.

6. Worst-Case Scenario: GRUB Menu NOT Available

If GRUB was hidden (default on many systems):

  • You cannot select older kernels
  • System appears completely dead

Solution:

👉 Boot from a Live Linux system

7. Booting from a Live Linux System

What is a Live System?

  • Linux runs from USB/DVD
  • No changes written to disk
  • Full access to tools and terminal

Options:

  • Physical machine → USB or DVD
  • Virtual machine → attach ISO
  • Cloud server → provider “rescue mode”

8. Choosing the Correct Live Image

Important rules:

  • Desktop images usually include live mode
  • Server images often install immediately
  • Architecture must match your CPU

Special case (ARM systems):

  • ARM64 images are harder to find
  • Daily builds may be required
  • Older live images may work better

9. Booting the Live System

Steps:

  1. Attach ISO
  2. Set boot order (USB/DVD first)
  3. Restart system
  4. Choose “Try Ubuntu”

If errors appear:

  • Wait a few minutes
  • Many hardware warnings are harmless

10. First Priority: Data Access & Backup

Once live system is running:

  • Your installed system disk is mounted automatically
  • You can browse:

    • /home
    • /var/www
    • /var/lib/mysql
    • application data

👉 Even if repair fails, your data is safe

This alone is a major win.

11. Verifying File System Health

Before touching boot components, rule out disk corruption.

Read-only check (recommended) — first make sure the partition is not mounted, then use -n so fsck only reports problems and never writes:

sudo umount /dev/sda2   # only if it was auto-mounted
sudo fsck -n /dev/sda2

Result:

  • No errors → filesystem is healthy
  • Problem is not disk-related

12. Accessing the Installed System via chroot

We need to work inside the broken system.

Step 1: Open terminal inside mounted system

Right-click → Open in Terminal

Step 2: Change root

sudo chroot .

What this does:

  • Does not boot the system
  • Redirects / to the installed OS
  • Commands now act as if system were running

This is critical for recovery.

13. Why Things Still Don’t Work Yet

Inside chroot, commands like:

update-grub

may fail with:

No such device

Why?

  • /dev, /proc, /sys are kernel-managed
  • They are missing inside chroot

14. Fixing Missing System Mounts (Critical Step)

We must bind the system directories managed by the live kernel into the installed system. Run these from the live system, outside the chroot, using the path where the installed root is mounted (shown here as /mnt — substitute your actual mount point):

Bind mounts:

sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys

Then re-enter the chroot (step 12).

Now the installed system can:

  • See disks
  • Detect kernels
  • Update bootloader properly
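
Putting steps 12–14 together, here is a minimal sketch of the whole rescue sequence, assuming the installed root partition is /dev/sda2 and we mount it at /mnt (adjust both to your system):

sudo mount /dev/sda2 /mnt          # mount the installed root filesystem
sudo mount --bind /dev /mnt/dev    # expose the live kernel's device nodes
sudo mount --bind /proc /mnt/proc  # process and kernel state
sudo mount --bind /sys /mnt/sys    # hardware / sysfs view
sudo chroot /mnt                   # enter the installed system
update-grub                        # now sees disks and kernels correctly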

15. Rebuilding GRUB

Now run:

update-grub

This time:

  • Kernel entries are detected
  • Boot menu is regenerated correctly

16. Making the Working Kernel the Default

Step 1: Inspect GRUB menu entries

cat /boot/grub/grub.cfg

Find the exact menu entry of the working kernel.
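
A quicker way than reading the whole file is to list only the entry titles (the submenu is the "Advanced options" line):

grep "menuentry '" /boot/grub/grub.cfg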

Step 2: Set default kernel

Edit:

nano /etc/default/grub

Set:

GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.19.x"

(Exact text must match grub.cfg)

Step 3: Apply changes

update-grub

17. Reboot and Verify

Exit chroot:

exit

Restart system:

reboot

Result:

  • GRUB automatically selects working kernel
  • System boots normally
  • Login successful

18. Important Follow-Up (Next Lecture)

We are not done yet.

Next steps:

  • Prevent working kernel from being removed
  • Remove broken kernel safely
  • Lock kernel packages
  • Avoid repeat failure

👉 This is mandatory in production

Stabilizing the System After Recovery (Kernel Safety & Cleanup)

Our system is booting again, but recovery is not finished yet.

A recovered system is still fragile unless we prevent the same failure from happening again.

1. The Risk After Recovery

Right now:

  • The system boots only because an older kernel exists
  • If that kernel is removed → system becomes unbootable again

Common danger:

sudo apt autoremove

This command may silently remove old kernels if they are marked as auto-installed.

We must protect the working kernel.

2. Identify the Active (Working) Kernel

Check the running kernel:

uname -r

Example:

5.19.0-xx-generic

Kernel files live in:

/boot

Important files:

  • vmlinuz-<version> → kernel
  • initrd.img-<version> → initial RAM disk

These files must not disappear.

3. Find Which Package Owns the Kernel File

Linux package manager knows which package created each file.

Check kernel ownership:

dpkg -S /boot/vmlinuz-5.19.0-xx-generic

Output example:

linux-image-5.19.0-xx-generic: /boot/vmlinuz-5.19.0-xx-generic

This tells us:
👉 The kernel comes from this package

4. Mark the Working Kernel as Manually Installed

This is the most important protection step.

sudo apt install linux-image-5.19.0-xx-generic

Why this works:

  • Even if already installed, APT marks it as manually installed
  • autoremove will never delete it

APT logic:

  • Auto-installed → removable
  • Manually installed → protected
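
You can verify or set this mark explicitly with apt-mark, which does the same thing without reinstalling anything:

apt-mark showmanual | grep linux-image                 # list manually installed kernels
sudo apt-mark manual linux-image-5.19.0-xx-generic     # equivalent to the apt install trick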

5. Why initrd.img Does Not Have a Package

You may notice:

dpkg -S /boot/initrd.img-5.19.0-xx-generic

returns nothing.

That is normal.

Reason:

  • initrd.img is generated dynamically
  • Created by post-install scripts of the kernel package

Verify:

sudo dpkg-reconfigure linux-image-5.19.0-xx-generic

You will see:

  • initramfs regeneration

As long as the kernel package stays installed → initrd stays too.

6. Removing the Broken Kernel (Optional but Recommended)

If a newer kernel breaks boot, remove it.

Step 1: Identify broken kernel package

dpkg -S /boot/vmlinuz-6.x.x-generic

Step 2: Remove dependent headers first

sudo apt remove linux-headers-generic

Step 3: Remove the broken kernel

sudo apt remove linux-image-6.x.x-generic

Why headers first?

  • Meta-packages depend on latest kernel
  • Removing headers breaks that dependency safely

⚠️ Do this only if you are sure the kernel is broken

7. Why linux-headers-generic Exists

This package:

  • Does not contain code
  • Always depends on the latest kernel

Installing it later:

sudo apt install linux-headers-generic

Will:

  • Pull the newest kernel again

Since the newest kernel caused failure:
👉 Do not reinstall it yet
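
For extra insurance while the broken release is current, you can put the kernel meta-packages on hold so no routine upgrade pulls the newest kernel back in (remember to unhold once a fixed version exists):

sudo apt-mark hold linux-image-generic linux-headers-generic
# later, when a fixed kernel is released:
sudo apt-mark unhold linux-image-generic linux-headers-generic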

8. Reboot and Verify Stability

Always test after kernel changes.

Reboot:

sudo reboot

Test:

  • Default boot entry
  • Advanced options → working kernel

If both work:
✅ System is stable again

9. Optional Cleanup: Reset GRUB Default

Now that broken kernel is gone:

  • First GRUB entry boots correctly
  • You may reset default behavior if desired

This is optional and not urgent.

10. Operational Best Practices (Real DevOps Advice)

A. Practice Recovery on Purpose

Create test failures:

  • Delete a boot file
  • Break GRUB config
  • Recover via live system

Practice makes incident response fast.

B. Servers Without Physical Access

In real servers:

  • Use provider rescue mode
  • SSH into recovery system
  • Use chroot only (no GUI)

Same logic — just CLI only.

C. Always Back Up Before Fixing

Even during rescue:

  • Copy /home
  • Copy /var
  • Copy application data

Never trust recovery until data is safe.

11. Common Causes of Boot Failures

| Category | Examples |
| --- | --- |
| Kernel | incompatible kernel update |
| Bootloader | broken GRUB config |
| Filesystem | disk corruption |
| Packages | broken third-party drivers |
| Hardware | disk failure, overheating |
| Security | firewall blocks SSH |
| Mounts | /etc/fstab errors |

Not all require live systems — boot failures do.

Cron Jobs in Linux — Concepts, Configuration, and Real Usage

1. Heads-Up: There Is More Than One Cron Implementation

Before working with cron, you need to know one important thing:

👉 Cron is not one single program.

Historically, multiple cron implementations evolved independently.
They all look similar, but they may differ slightly in:

  • features
  • defaults
  • supported syntax
  • email behavior

The concepts are the same, but details may vary.

2. What Is Cron?

Cron Daemon

  • Cron is a background service (daemon)
  • It wakes up every minute
  • Checks whether any scheduled jobs must run
  • Executes commands at predefined times

The name comes from Chronos, the Greek word for time.

3. Where Cron Jobs Are Stored

Cron reads multiple locations.

3.1 User-Specific Cron Jobs (Most Common)

Stored internally in:

/var/spool/cron/crontabs/
  • One file per user
  • Never edit these files directly
  • Permissions are intentionally restrictive

Correct way to manage them:

crontab -e

3.2 System-Wide Cron Jobs

Stored in:

/etc/crontab

Characteristics:

  • Editable directly
  • Must be owned by root
  • Must not be writable by group or others

Used mainly for system-level tasks.

3.3 /etc/cron.d (Debian / Ubuntu)

  • Directory containing cron job files
  • Often used by third-party software
  • You normally do not place your own jobs here
  • Cron loads every file in this directory

This is Debian/Ubuntu-specific behavior.

4. Editing a User Crontab

Open your crontab:

crontab -e
  • First time: you may be asked which editor to use
  • Editor choice is stored

Temporarily choose an editor:

EDITOR=vim crontab -e

or

EDITOR=nano crontab -e

View your crontab:

crontab -l

This is the only safe way to read it without root access.

5. Why You Should Never Edit Cron Files Directly

Even your own crontab:

/var/spool/cron/crontabs/<username>
  • Has strict permissions
  • Editing directly may:

    • break cron
    • corrupt format
    • change ownership

👉 Always use crontab -e

6. Crontab File Structure

A crontab has two parts:

  1. Optional environment variables
  2. One or more cron job definitions

7. Environment Variables in Crontab

These apply only to cron jobs, not your shell.

Common ones:

SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Why this matters:

  • Cron uses a minimal PATH
  • Many commands fail without a full PATH
  • Default shell may not be Bash

⚠️ Not all cron implementations support this
(works on Ubuntu/Debian)

8. Cron Job Syntax (Core Knowledge)

General Format:

MINUTE HOUR DAY MONTH DAY_OF_WEEK COMMAND

| Field | Range |
| --- | --- |
| Minute | 0–59 |
| Hour | 0–23 |
| Day | 1–31 |
| Month | 1–12 |
| Day of week | 0–7 (Sun=0 or 7) |

Example: Every day at 03:05

5 3 * * * command

9. Wildcards (*)

* means all possible values.

Example: every minute

* * * * * command

10. Redirecting Output (Very Important)

Cron runs without a terminal.

If you don’t redirect output:

  • Output may be emailed
  • Or silently discarded
  • Or logged elsewhere

Example:

* * * * * ping -c 1 google.com >> ~/ping.log
  • >> appends
  • Prevents overwriting
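
Note that >> captures only stdout; error messages go to stderr and would still be mailed or discarded. To capture both streams (as later examples in this post do), add 2>&1:

* * * * * ping -c 1 google.com >> ~/ping.log 2>&1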

11. Testing a Cron Job

After saving your crontab:

  • Wait one minute
  • Check output file
cat ~/ping.log

If output appears → cron works.

12. Limiting Execution Frequency

Every hour at minute 0

0 * * * * command

Every 5 minutes

*/5 * * * * command

Runs at:

00, 05, 10, 15, 20, ...

Specific minutes

0,15,30,45 * * * * command

Hour range (08:00–20:00)

0 8-20 * * * command

Both ends included.

Every 2 hours

0 */2 * * * command

⚠️ Be careful not to write * */2 * * * — with * in the minute field, the job runs every minute during each matching hour. Pin the minute to 0, as above.

Every 2 hours starting at 01:00

0 1-23/2 * * * command

13. Day of Week Filtering

Day of week acts as a filter.

Example: every Monday at midnight

0 0 * * 1 command

Values:

  • 0 or 7 = Sunday
  • 1 = Monday
  • 6 = Saturday

14. Combining Fields Carefully (Common Pitfall)

This:

* */2 * * * command

Means:

  • Every minute
  • During every second hour

Result:

  • Runs 60 times per active hour
  • Silent for the next hour

Often not what you want.

15. Practical Example

SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

0 * * * * ping -c 1 google.com >> ~/ping_hourly.log

Runs:

  • Once per hour
  • Clean output
  • Predictable behavior

16. Cron Implementations You Should Know

16.1 Vixie Cron

  • Default on Ubuntu/Debian
  • Package name: cron
  • Most common reference behavior

16.2 Anacron

  • Handles missed jobs
  • Runs jobs after system was offline
  • Used for:

    • daily
    • weekly
    • monthly tasks

Ubuntu:

  • Separate package

CentOS:

  • Integrated

16.3 Cronie (CentOS / RHEL / Fedora)

  • Fork of Vixie Cron
  • Includes Anacron
  • Same syntax
  • Slightly different defaults

17. Why This Matters in Real Life

Cron is used for:

  • backups
  • log rotation
  • monitoring
  • cleanup tasks
  • report generation
  • automation glue

Understanding:

  • timing
  • output
  • environment
  • implementation differences

is mandatory for DevOps and SysAdmins.

Cron Output, Email Notifications, and flock (Ubuntu)

Important
This lecture is Ubuntu-specific.
CentOS behaves differently and is covered in the next lecture.

1. What Happens If We Do NOT Redirect Cron Output?

So far, we always redirected output:

>> file.log

But what if:

  • we don’t redirect stdout, or
  • the command writes to stderr, or
  • the command fails?

Answer:

👉 Cron tries to send the output by email to the job’s owner.

2. Default Cron Mail Behavior on Ubuntu

  • Output is emailed to the local user
  • Email delivery requires a Mail Transfer Agent (MTA)
  • Ubuntu does not install one by default

So initially:

  • cron tries to send mail
  • mail fails silently
  • output is discarded

3. Demonstration: Cron Job With Output (No Redirect)

Edit crontab:

crontab -e

Add:

* * * * * ping -c 3 google.com

This runs every minute and produces output. (The -c 3 matters: without a count, ping never exits, so the job never finishes and a new ping piles up every minute.)

Wait one minute.

4. Checking Cron Logs (systemd)

On Ubuntu, cron logs are handled by systemd.

Follow cron logs:

journalctl -u cron -f

You will see:

(CRON) info (No MTA installed, discarding output)

So:

  • cron executed the job
  • output existed
  • but email delivery failed

5. Installing Mail Support on Ubuntu

Cron does not send mail itself.
It delegates email delivery to an MTA.

Install mail support:

sudo apt install mailutils

This installs:

  • mail command
  • Postfix (mail transfer agent)

6. Postfix Configuration (Initial)

During install, choose:

General type of mail configuration: Local only

This means:

  • Mail is delivered only to local users
  • No internet delivery yet

Finish installation.

7. Where Local Cron Emails Are Stored

Local emails are stored as plain text files:

/var/mail/<username>

Example:

sudo cat /var/mail/youruser

You will see:

  • email headers
  • cron output body
  • one email per execution

👉 This is not internet email
👉 This is local system mail

8. Sending Cron Output to an External Email Address

Cron supports the MAILTO variable.

Edit crontab:

crontab -e

Add at the top (replace the placeholder with your real address):

MAILTO=you@example.com

Now cron will attempt to send output externally.

9. Why External Email Initially Fails

Even with MAILTO, emails may not arrive because:

  • Postfix is set to local only
  • External delivery is disabled

10. Reconfiguring Postfix for Internet Mail

Reconfigure postfix:

sudo dpkg-reconfigure postfix

Choose:

Internet Site

Accept defaults for:

  • system mail name
  • mailbox size
  • delivery method

This allows:

  • outbound email
  • internet mail delivery

11. Why Emails Go to Spam (This Is Normal)

Your VM:

  • has no valid domain
  • no SPF / DKIM
  • unknown sender reputation

Result:

  • Gmail usually accepts the mail
  • but places it in Spam

This is expected.

For production servers:

  • proper DNS
  • proper mail relay
  • trusted domain

12. How to Stop Cron Email Spam

Option 1: Redirect output

* * * * * command >> file.log 2>&1

Option 2: Send output to /dev/null

* * * * * command > /dev/null 2>&1

Option 3: Comment out job

# * * * * * command

13. Why flock Is Important (Real Production Problem)

Cron does not prevent overlap.

If a job runs every minute:

  • previous run may still be active
  • next run starts anyway
  • leads to:

    • database overload
    • duplicate jobs
    • race conditions

14. What flock Does

flock:

  • locks a file
  • only one process can hold the lock
  • others wait or exit

This allows mutual exclusion.

15. Simple flock Example

Terminal 1:

flock /tmp/test.lock ping google.com

Terminal 2:

flock /tmp/test.lock ping google.com

Result:

  • second command waits
  • runs only after first finishes

16. Non-Blocking flock (Cron-Safe)

Use:

flock -n /tmp/test.lock -c "command"

Behavior:

  • if the lock is already held → exit immediately
  • no overlap
  • but the skipped run exits with a non-zero code, which cron treats as an error (fixed below)
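
You can observe this yourself — hold the lock in one terminal, then try a non-blocking run in a second one and print the exit code:

flock /tmp/test.lock sleep 60                          # terminal 1: holds the lock for a minute
flock -n /tmp/test.lock -c "echo got it"; echo $?      # terminal 2: prints 1 while the lock is held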

17. Why Exit Code Matters

Cron logic:

  • exit 0 = success
  • exit !=0 = error → email

Using:

flock -n file -c "cmd" || true

Ensures:

  • cron sees success
  • no spam
  • no duplicate execution

18. Real Production Cron Example

0 */2 * * * /usr/bin/flock -n /tmp/app.lock \
/usr/bin/php /var/www/app/artisan schedule:run

What this does:

  • runs every 2 hours
  • prevents overlap
  • skips execution if still running
  • safe for databases

19. Why Full Paths Are Used

Cron has a minimal PATH.

Best practice:

  • always use absolute paths

Find executable path:

which flock
which php

20. Why This Matters in Real Systems

Without flock:

  • multiple cron runs overlap
  • jobs collide
  • data corruption happens

With flock:

  • single execution guaranteed
  • predictable behavior
  • safe automation

System-Wide Cron Jobs and Anacron (Ubuntu)

Important

  • This lecture is Ubuntu-specific
  • CentOS / RHEL handle Anacron differently (covered separately)
  • Concepts are shared, implementations differ

1. User Cron Jobs vs System-Wide Cron Jobs

So far, we worked with user cron jobs:

crontab -e

Key points:

  • Affects only the current user
  • Stored internally in /var/spool/cron/crontabs/
  • Managed only via crontab command

Even this:

sudo crontab -e

still creates a user cron job — this time for the root user.

2. System-Wide Cron Jobs (/etc/crontab)

Ubuntu also supports system-wide cron jobs.

Location

/etc/crontab

Key differences

  • Regular text file
  • Edited directly (no crontab -e)
  • Owned by root
  • Writable only by root
  • Ignored if permissions are unsafe

3. Why System-Wide Cron Is Safe

Security model:

  • If someone can write /etc/crontab, they already have root
  • Root can already do anything
  • Cron ignores the file if permissions are wrong

So this is not a security risk.

4. System-Wide Cron Syntax

Unlike user crontabs, one extra field exists.

Format

MIN HOUR DAY MONTH DOW USER COMMAND

Example:

* * * * * alice echo "----" >> /home/alice/test.txt

This means:

  • Runs every minute
  • Executed as user alice
  • Writes to Alice’s home directory

5. Example: System-Wide Cron Job

Edit the file:

sudo nano /etc/crontab

Add:

* * * * * alice cd /home/alice && echo "----" >> test.txt

After one minute:

ls /home/alice

Result:

  • test.txt exists
  • File owned by alice
  • Command executed with Alice’s permissions

6. When to Use System-Wide Cron

Use user crontab when:

  • Job is personal
  • No root privileges needed
  • User manages their own tasks

Use system-wide cron when:

  • Job must run as a specific service user
  • Example users:

    • www-data
    • postgres
    • mysql
  • You don’t want to log in as that user

Example:

*/5 * * * * www-data php /var/www/app/artisan cleanup

7. Introducing Anacron (Why Cron Is Not Enough)

Regular cron jobs:

  • Run only if the system is running
  • Miss execution if the system is off
  • Ignore battery state

Anacron solves this.

8. What Is Anacron?

Anacron:

  • Executes jobs eventually
  • Designed for:

    • laptops
    • desktops
    • non-24/7 systems
  • Handles:

    • missed executions
    • delayed execution after reboot
    • power-state awareness

Typical use cases:

  • log cleanup
  • cache cleanup
  • maintenance tasks

9. When to Use Anacron

Use Anacron when:

  • Exact execution time does not matter
  • Task must run at least once
  • Delay is acceptable

Use cron when:

  • Exact timing matters
  • Task must run on schedule
  • Servers are always online

10. Anacron Job Directories (Ubuntu)

Ubuntu integrates Anacron via folders:

/etc/cron.daily/
/etc/cron.weekly/
/etc/cron.monthly/

How it works:

  • Place an executable file in the folder
  • Anacron executes it automatically

11. Filename Restrictions (Important)

Allowed characters only:

  • letters (A–Z, a–z)
  • digits (0–9)
  • underscore _
  • dash -

Invalid filenames are ignored.

12. Example: Daily Anacron Job

List daily jobs:

ls /etc/cron.daily

Example:

/etc/cron.daily/apache2

Open it:

sudo nano /etc/cron.daily/apache2

You’ll see a shell script:

  • executed once per day
  • used for maintenance

13. How Anacron Is Configured

Configuration file:

/etc/anacrontab

14. Anacrontab Syntax

Format:

PERIOD DELAY JOB-ID COMMAND

Example:

1   5   cron.daily   run-parts /etc/cron.daily

Meaning:

  • Period: every 1 day
  • Delay: wait 5 minutes after boot
  • ID: unique identifier
  • Command: execute all scripts in folder

15. Default Ubuntu Anacron Jobs

1        5   cron.daily   run-parts --report /etc/cron.daily
7        10  cron.weekly  run-parts --report /etc/cron.weekly
@monthly 15  cron.monthly run-parts --report /etc/cron.monthly

This enables:

  • /etc/cron.daily
  • /etc/cron.weekly
  • /etc/cron.monthly

16. Why /etc/cron.hourly Exists

Check /etc/crontab:

17 * * * * root cd / && run-parts --report /etc/cron.hourly

This is:

  • normal cron, not Anacron
  • runs hourly
  • no power-state awareness

If the system is down → job is skipped.

17. Fallback Logic in /etc/crontab

You may see lines like:

25 6 * * * root test -x /usr/sbin/anacron || run-parts /etc/cron.daily

Meaning:

  • If Anacron exists → do nothing
  • If Anacron is missing → fallback to cron

This guarantees:

  • daily jobs still run
  • even without Anacron

18. Battery-Aware Execution (How It Works)

Check Anacron’s systemd unit:

systemctl cat anacron.service

You will see:

ConditionACPower=true

Meaning:

  • Anacron runs only when plugged in

19. Overriding Battery Behavior (Optional)

To override:

sudo systemctl edit anacron.service

Add:

[Unit]
ConditionACPower=

Now:

  • Anacron runs even on battery

20. Best Practices for Cron & Anacron

Scheduling

  • Avoid peak traffic hours
  • Distribute heavy jobs
  • Be timezone-aware

Logging & Monitoring

  • Always log output
  • Review logs regularly
  • Monitor after updates

Security

  • Run jobs with least privilege
  • Avoid root unless required
  • Secure scripts and dependencies
  • Never store secrets in crontab

Permissions

  • Cron-created files inherit user ownership
  • Match cron user with application user
  • Avoid permission mismatches

Testing

  • Test commands manually
  • Use absolute paths
  • Verify PATH differences
  • Monitor first executions

21. Cron Implementation Differences

Be aware:

  • Ubuntu / Debian → Vixie cron
  • CentOS / RHEL → Cronie
  • Features differ slightly
  • Environment variables may not work everywhere

When in doubt:

  • use shell scripts
  • use absolute paths
  • avoid assumptions

22. Final Takeaways

  • User cron ≠ system cron
  • /etc/crontab lets you specify which user a job runs as
  • Anacron handles missed jobs
  • Ubuntu integrates Anacron via cron folders
  • Power state matters on laptops
  • Cron needs planning, logging, and discipline

What Is the Internet? (Big Picture)


Definition

The Internet is a network of networks.

  • It is made of interconnected nodes (computers, routers, servers)
  • These nodes form a mesh, not a single direct line
  • Any node can communicate with almost any other node
  • No dedicated end-to-end connection is required

This design makes the Internet:

  • Scalable
  • Fault-tolerant
  • Efficient

Why the Internet Works Without Dedicated Connections

Imagine:

  • You are in Europe
  • You connect to a server in Australia

You do not have a physical cable to Australia.

Instead:

  • Data is split into small packets
  • Each packet is routed independently
  • Routers choose the best available path at that moment
  • If a link is congested or broken, traffic is rerouted automatically

This is called packet switching.

Important consequences:

  • Packets may take different paths
  • Paths can change dynamically
  • Packet order is not guaranteed (higher layers fix this)

Visualization: How Data Reaches Google


Example Flow

  1. Your computer
  2. Home router
  3. ISP router
  4. Multiple intermediate routers (hops)
  5. Destination server (e.g., Google)

Each router:

  • Looks only at the destination address
  • Forwards the packet to the next best hop
  • Does not know the full path

Routers are often called hops because packets “hop” through them.

What Must Exist for Internet Communication to Work

To send data to google.com, several things must happen:

  1. Name resolution
   • Convert google.com → IP address
   • This is done by DNS
  2. Local delivery
   • Your computer must send data to the local router
   • Happens inside your local network
  3. Inter-network routing
   • Data must cross multiple networks
   • Handled by the IP protocol
  4. Reliability
   • Lost packets must be detected and retransmitted
   • Done by TCP
  5. Application communication
   • Web, SSH, email, etc.
   • Done by protocols like HTTP, HTTPS, SSH

This layered approach is intentional.

Working Bottom-Up (How This Chapter Is Structured)

We will study networking from the ground up:

  1. How data is placed on the wire or Wi-Fi
  2. How packets move inside a local network
  3. How packets move between networks (Internet)
  4. How reliability is guaranteed
  5. How applications use the network

This matches how real networking works.

Tool 1: The ip Command (Linux Networking)

What Is ip?

The ip command is the modern Linux networking tool.

It replaces the old net-tools commands:

  • ifconfig
  • route
  • netstat (its modern replacement, ss, also comes from the iproute2 suite)

It is:

  • More powerful
  • More accurate
  • Actively maintained

Showing Network Interfaces

ip address show

Output shows:

  • Network interfaces
  • IP addresses
  • Interface state
  • MAC addresses

Example interfaces:

  • lo → loopback (localhost)
  • eth0, ens33, wlp0s20f3 → physical or virtual NICs
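
For a compact one-line-per-interface overview, ip also has a brief mode:

ip -brief address show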

Legacy Tool (Still Exists)

ifconfig -a
  • Older
  • Still available on some systems
  • Internally uses older kernel interfaces

In this course, we use ip.

macOS Users: Using ip via Homebrew

macOS does not ship with ip.

Options:

  • Use ifconfig (native)
  • Install an ip wrapper

Install via Homebrew

brew install iproute2mac

After installation:

ip address show

Notes:

  • Output may differ slightly
  • Not all features are supported
  • Good enough for learning

Tool 2: Wireshark (Traffic Analysis)


What Is Wireshark?

Wireshark is a graphical packet analyzer.

It allows you to:

  • Capture live network traffic
  • Inspect packets layer by layer
  • Visualize real network behavior

This is critical for understanding, not just memorizing.

⚠️ Legal & Ethical Warning (Very Important)

Wireshark can:

  • Capture private data
  • Capture MAC addresses
  • Capture credentials (if unencrypted)

Rules:

  • Capture only traffic you own or are allowed to analyze
  • Laws vary by country
  • In some regions, MAC addresses are personal data

This lecture is for:

  • Learning
  • Teaching
  • Ethical debugging

This is not legal advice.

Installing Wireshark (Ubuntu)

sudo apt install wireshark

Start Wireshark with required privileges:

sudo wireshark

Capturing Traffic

Steps:

  1. Select network interface
  2. Start capture
  3. Generate traffic (open a website)
  4. Stop capture
  5. Analyze packets

Example:

  • Open google.com
  • Stop capture
  • Filter by protocol:
  http

Note:

  • Modern sites use HTTPS
  • Payload is encrypted
  • Metadata is still visible

Why Wireshark Matters

Wireshark shows:

  • Frames
  • Packets
  • Headers
  • Protocol layers

Right now this looks overwhelming — that’s expected.

By the end of this chapter:

  • Every section will make sense
  • Every field will have meaning

Introducing the OSI Model


What Is the OSI Model?

The OSI (Open Systems Interconnection) model is a conceptual framework.

Purpose:

  • Standardize network communication
  • Enable interoperability
  • Provide a shared troubleshooting language

Developed:

  • Concept in the 1970s
  • Formalized in the 1980s

Why the OSI Model Exists

Before standards:

  • Vendors used incompatible protocols
  • Networks could not interoperate

Today:

  • Any phone works on any Wi-Fi
  • Any laptop works on any router
  • Any OS can talk to any server

That did not happen by accident.

The 7 OSI Layers (Bottom → Top)

| Layer | Name | Purpose |
| --- | --- | --- |
| 1 | Physical | Bits on wire (cables, signals) |
| 2 | Data Link | Local delivery (MAC, Ethernet) |
| 3 | Network | Routing between networks (IP) |
| 4 | Transport | Reliability, ordering (TCP/UDP) |
| 5 | Session | Session management |
| 6 | Presentation | Encryption, compression |
| 7 | Application | HTTP, SSH, FTP, SMTP |

Key Layer Intuition

  • Layer 1: Is the cable plugged in?
  • Layer 2: Can I talk to my router?
  • Layer 3: Can packets reach the destination?
  • Layer 4: Are packets reliable?
  • Layer 7: Does the application work?

This is how real troubleshooting is done.

Why the OSI Model Is Useful for You

1. Modularity

Each layer can evolve independently.

Example:

  • TCP improvements do not break Ethernet
  • HTTPS encryption does not affect routing

2. Interoperability

Devices from different vendors work together.

3. Troubleshooting Framework

You can say:

  • “Layer 1 issue” → cable
  • “Layer 3 issue” → routing
  • “Layer 7 issue” → application

This saves hours in production.

OSI Layer 1 – The Physical Layer


What Is the Physical Layer?

The physical layer (Layer 1) is the foundation of networking.

It is responsible for physically transmitting bits from one device to another.

This includes:

  • Ethernet cables (copper)
  • Fiber-optic cables
  • Wi-Fi radio signals
  • Electrical voltages and light pulses

At this layer, there is no concept of IP addresses, packets, or routing—only raw bits.

Responsibilities of Layer 1

Layer 1 handles:

  • Physical media

    • Copper, fiber, wireless
  • Signal transmission

    • Electrical voltage
    • Light pulses
    • Radio waves
  • Bit encoding

    • Converting 0s and 1s into signals
  • Timing & synchronization

  • Collision avoidance (basic mechanisms)

  • Basic error detection

    • e.g., parity bits

Important detail:

  • Signals are encoded so the average voltage is zero
  • This prevents electrical potential buildup between devices

Examples of Physical Layer Failures

Common Layer 1 problems:

  • Cable unplugged
  • Broken cable
  • Power missing
  • Faulty network card
  • Electromagnetic interference
  • Hardware malfunction

If Layer 1 fails, nothing above it can work.

Layer 1 Hardware Examples

  • Ethernet cables
  • Fiber cables
  • Wi-Fi antennas
  • Physical splitters (old Ethernet hubs)
  • Network Interface Cards (NICs)

Old Ethernet splitters literally connected wires together.
All devices shared the same electrical medium.

Influencing Layer 1 via Software

You cannot unplug a cable with software.

But you can:

  • Enable or disable a network interface

This effectively shuts down Layer 1 from the OS perspective.

Enabling / Disabling a Network Interface (Linux)

Step 1: Identify interfaces

ip addr show

Example interface names:

  • enp0s5 (modern, predictable)
  • eth0 (older style)
  • wlan0 (Wi-Fi)

Modern names are stable and tied to hardware location.

Step 2: Disable interface

sudo ip link set dev enp0s5 down

Result:

  • Interface exists
  • State becomes DOWN
  • No traffic flows

⚠️ Warning
If this interface is your SSH connection → you will disconnect immediately.

Step 3: Enable interface

sudo ip link set dev enp0s5 up

Connectivity returns.

Real Example: Remote Device Risk

On systems like:

  • Raspberry Pi
  • Remote servers
  • Cloud VMs

If you disable:

  • wlan0 (Wi-Fi)
  • eth0 (Ethernet)

Your remote session will drop.

Recovery requires:

  • Reboot
  • Physical access
  • Console access

This is a classic Layer 1 outage.

OSI Layer 2 – The Data Link Layer


What Is Layer 2?

The Data Link Layer (Layer 2) handles local communication inside one network.

Key responsibilities:

  • Frame delivery
  • MAC addressing
  • Error detection (local)
  • Collision reduction
  • Traffic isolation

Layer 2 does not route between networks.

Typical Layer 2 Hardware

  • Switch
  • Bridge
  • Wireless Access Point (WAP)

Note:

  • A switch is hardware
  • A bridge is usually software
  • Functionally, they are similar

Switch vs Wireless Router (Important Distinction)

  • Switch / Access Point

    • Layer 2 only
    • No routing
  • Router

    • Layer 3 (and above)
    • Connects different networks

A Wi-Fi access point is essentially:

A Layer-2 switch with radio antennas

Why We Need Switches

Old Method: Shared Wire (Hub / Splitter)

  • All devices share one cable
  • Every frame reaches every device
  • Devices discard frames not meant for them
  • Collisions occur if multiple devices talk

Works for:

  • Few machines

Fails for:

  • Many machines
  • High traffic

How a Switch Solves This

A switch learns MAC addresses.

  • Each device has its own cable
  • Switch remembers:

    • MAC → port mapping
  • Frames are forwarded only where needed

Example:

  • PC A sends frame to PC B
  • Switch forwards frame only to PC B’s port
  • Other ports remain silent

Parallel Communication with a Switch

With a switch:

  • Multiple devices can transmit simultaneously
  • No shared collision domain
  • Massive performance improvement

This is why switches replaced hubs.

Transparency of a Switch (Very Important)

From the computer’s perspective:

  • It does not know a switch exists
  • It behaves as if:

    • It is directly connected to other devices

The switch is completely transparent.

This fact is critical when later learning about:

  • Routers
  • Network segmentation
  • Subnets

What Layer 2 Can and Cannot Do

Can do:

  • Send frames inside the same network
  • Reduce collisions
  • Isolate traffic

Cannot do:

  • Route between networks
  • Reach the Internet
  • Understand IP addresses

That is Layer 3.

Layer 1 vs Layer 2 Summary

| Layer | Purpose | Example |
| --- | --- | --- |
| Layer 1 | Physical transmission | Cable, Wi-Fi |
| Layer 2 | Local delivery | Switch, MAC |

OSI Layer 3 – The Network Layer (IP, Routing, Subnets)


Why Do We Need the Network Layer?

On Layer 2 (Data Link), we learned:

  • Frames are sent from one network card to another
  • Communication is limited to the local network
  • Switches are transparent
  • MAC addresses are local only

➡️ Problem
If two computers are not in the same network, Layer 2 is not enough.

That is why we need Layer 3 – the Network Layer.

What Changes on Layer 3?

| Layer | Unit | Address Type | Scope |
| --- | --- | --- | --- |
| Layer 2 | Frame | MAC address | Local network only |
| Layer 3 | Packet | IP address | Across networks |

Key idea

  • Frames cannot be routed
  • Packets can be routed

Routing = forwarding data between networks

Packet Encapsulation (Very Important Concept)

When sending data:

  1. Application creates data
  2. Layer 3 wraps it into an IP packet
  3. Layer 2 wraps the packet into an Ethernet frame
  4. Frame is sent on the wire

At every router:

  • Frame is removed
  • Packet is inspected
  • Packet is wrapped into a new frame
  • Sent to the next hop

This happens extremely fast in hardware.

What Is a Network?

A network is a group of interconnected devices that can communicate.

Important network types

  • LAN (Local Area Network) Home, office, data center
  • WAN (Wide Area Network) Internet, country, continent

➡️ The Internet is a WAN made of many LANs

Routers and Gateways


A router:

  • Connects networks
  • Operates on Layer 3
  • Forwards packets

A default gateway:

  • The router your computer sends packets to
  • Used when destination is outside your local network

Inspecting Network Configuration (Linux)

Show IP address

ip addr show

You will see:

  • Interface name
  • IP address
  • Subnet mask (CIDR notation)

Example:

192.168.1.23/24

Show routing table

ip route show

Example:

default via 192.168.1.1 dev enp0s5

Meaning:

  • Anything not local → send to 192.168.1.1
  • That IP is your router / gateway

Local vs Internet Traffic (Wireshark Proof)


Case 1: Ping Google (Internet)

  • IP packet destination = Google IP
  • Ethernet frame destination = router MAC
  • Router forwards packet

Case 2: Ping local machine

  • IP packet destination = local IP
  • Ethernet frame destination = target MAC
  • Router not involved (acts only as switch if Wi-Fi)

➡️ Same IP protocol
➡️ Different Layer-2 destination

Why Frames Are Addressed Differently

| Destination | Ethernet Frame Goes To |
| --- | --- |
| Same network | Target device MAC |
| Different network | Router MAC |

This decision is made using the subnet mask.

Subnets – Networks Inside Networks


What Is a Subnet?

A subnet is a logical subdivision of a network.

Used to:

  • Reduce broadcast traffic
  • Improve performance
  • Scale large networks
  • Control routing

The Problem Subnets Solve

Your computer must answer:

Is the destination IP local or remote?

If local → send frame directly
If remote → send frame to gateway

Subnet Mask (Core Concept)

Example:

IP address:     192.168.1.10
Subnet mask:    255.255.255.0
CIDR:           /24

Subnet mask:

  • Defines network part
  • Defines host part

Binary AND Logic (Conceptual)

  • Subnet mask has:

    • 1 = network bits
    • 0 = host bits

Logical AND:

  • Keep bits where mask = 1
  • Zero out bits where mask = 0

If:

(network part of source)
==
(network part of destination)

➡️ Same subnet

Otherwise ➡️ send to gateway

Computers do this instantly in hardware.
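
A worked example: 192.168.1.10/24 talking to 192.168.1.99. Both addresses AND-ed with the mask give the same network address, so the destination is local:

192.168.1.10    = 11000000.10101000.00000001.00001010
255.255.255.0   = 11111111.11111111.11111111.00000000
AND             = 11000000.10101000.00000001.00000000  → 192.168.1.0

192.168.1.99    = 11000000.10101000.00000001.01100011
AND (same mask) = 11000000.10101000.00000001.00000000  → 192.168.1.0 (same subnet)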

CIDR Notation (Short Form)

Instead of:

255.255.255.0

We write:

/24

Why?

  • The first 24 bits of the mask are 1s (the network part)
  • The remaining 8 bits are 0s (the host part)

Examples:

| CIDR | Hosts (usable) |
| --- | --- |
| /24 | 254 |
| /23 | 510 |
| /22 | 1022 |
| /30 | 2 (point-to-point) |

Reserved Addresses in a Subnet

For /24:

  • .0 → network address
  • .255 → broadcast address
  • .1 – .254 → usable hosts

Inspecting Subnet Mask on Linux

ip addr show

Example output:

inet 192.168.1.23/24

This tells you:

  • Your IP
  • Your subnet size
  • How routing decisions are made

Key Mental Model (Very Important)

  • IP packet → logical destination
  • Ethernet frame → physical next hop
  • Subnet mask → decision maker
  • Router → network boundary

How Does a Computer Know Where to Send a Frame?

(ARP, IP ↔ MAC Resolution, Routes, DHCP)


The Core Question

We know IP packets contain destination IPs
We know Ethernet frames need destination MACs

How does the system know which MAC address to use?

That is the job of ARP.

Packet vs Frame (Quick Reminder)

| Layer | Unit | Address Used |
| --- | --- | --- |
| Layer 3 | Packet | IP address |
| Layer 2 | Frame | MAC address |

To send any IP packet, the system must:

  1. Decide where the packet should go
  2. Resolve which MAC address to send the frame to

ARP – Address Resolution Protocol

ARP answers one question only:

“Which MAC address owns this IP address?”

ARP works only inside the local network.

What Happens When You Ping a Local Machine


Step-by-step

  1. You run: ping 192.168.1.50
  2. System checks the subnet mask → destination is local
  3. System sends an ARP request (broadcast): Who has 192.168.1.50?
  4. All devices receive it
  5. Only the owner replies: 192.168.1.50 is at AA:BB:CC:DD:EE:FF
  6. The MAC address is cached
  7. The Ethernet frame is sent directly to that MAC
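
You can inspect the resulting cache of IP → MAC mappings at any time:

ip neigh show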

What Happens When You Ping the Internet


Step-by-step

  1. You run: ping google.com
  2. DNS resolves the IP (e.g. 142.250.x.x)
  3. Subnet mask check → destination is NOT local
  4. System needs the gateway MAC
  5. ARP request: Who has 192.168.1.1?
  6. Router replies with its MAC
  7. Ethernet frame destination = router MAC
  8. Router forwards the packet

➡️ IP destination stays Google
➡️ MAC destination is router

Why ARP Is Always Happening

ARP traffic is normal and frequent:

  • Devices announce themselves
  • Devices verify IP conflicts
  • Routers refresh mappings

In Wireshark:

ARP Who has 192.168.1.X?
ARP 192.168.1.X is at …

This is normal network noise, not a problem.

Changing IP Addresses Manually (Linux)

Show current IPs

ip addr show

Add a secondary IP

sudo ip addr add 192.168.1.232/24 dev enp0s5

Now the interface has two IPs.

✔️ Other devices in the same subnet can reach it
❌ Devices outside the subnet will not

Remove the IP

sudo ip addr del 192.168.1.232/24 dev enp0s5

Why Adding “Any IP” Doesn’t Work

You can add:

8.8.8.8/24

But:

  • Other machines see it as Internet IP
  • Frames go to the router
  • Router does NOT send traffic back to your PC

➡️ IP must match subnet logic, not just syntax.

Routing Table – How the OS Decides Paths


Show routes

ip route show

Typical output:

192.168.1.0/24 dev enp0s5
default via 192.168.1.1 dev enp0s5

Meaning:

  • Local network → direct
  • Everything else → router

Ask the OS how it would reach an IP

ip route get 8.8.8.8

or

ip route get 192.168.1.50

You will see:

  • Interface used
  • Gateway (if any)

Manually Adding Routes (Advanced)

Example: Send traffic to wrong gateway

sudo ip route add 9.9.9.9/32 via 192.168.1.100 dev enp0s5

Result:

  • Packet sent to wrong device
  • Device does not forward
  • Traffic fails

Remove it:

sudo ip route del 9.9.9.9/32

Why Manual Routes Matter (Corporate Networks)


Example:

  • Subnet A: 192.168.1.0/24
  • Subnet B: 192.168.2.0/24
  • Router in between

Without route:

  • Subnet A cannot reach Subnet B

With route:

sudo ip route add 192.168.2.0/24 via 192.168.1.5 dev enp0s5

Now traffic flows.

DHCP – How Devices Get IPs Automatically


What Is DHCP?

Dynamic Host Configuration Protocol

Automatically assigns:

  • IP address
  • Subnet mask
  • Gateway
  • DNS servers
  • Lease time

Usually runs on the router.

DHCP 4-Step Process (DORA)

  1. Discover (broadcast)
  2. Offer
  3. Request
  4. Acknowledge

All initial messages are broadcast.

Seeing DHCP in Wireshark

Filter:

dhcp

You’ll see:

  • Client MAC
  • Offered IP
  • Lease duration
  • Gateway
  • DNS servers

This is your entire network configuration being delivered.

Debugging DHCP (systemd-networkd)

View logs

journalctl -u systemd-networkd

Look for:

  • DHCP lease acquired
  • DHCP lease lost
  • Link up/down events

If:

  • Cable is plugged
  • Wi-Fi is connected
  • But no IP

➡️ It’s almost always DHCP

Final Mental Model (Very Important)

To send data:

  1. Subnet mask decides:
   • Local → direct MAC
   • Remote → gateway MAC
  2. ARP resolves the MAC
  3. Ethernet frame is built
  4. Router forwards if needed
  5. Routing table controls decisions
  6. DHCP provides configuration

NetworkManager vs systemd-networkd (DHCP Logs)


Why Different Linux Systems Use Different Network Tools

Not all Linux distributions manage networking the same way.

  • Ubuntu Server → usually uses systemd-networkd
  • CentOS / RHEL and most desktop distributions → usually use NetworkManager

Both tools:

  • Configure interfaces
  • Run a DHCP client
  • Assign IP addresses
  • Manage routes

They just do it differently.

Why NetworkManager Exists

NetworkManager is a more integrated solution.

It supports:

  • DHCP
  • DNS integration
  • Wi-Fi
  • VPNs
  • Mobile connections

With systemd:

  • Networking is split across multiple components

    • systemd-networkd
    • systemd-resolved
    • others

➡️ Recommendation
Use the default tool provided by your distribution unless you have a strong reason to change it.

Inspecting DHCP Logs with NetworkManager

On systems using NetworkManager (e.g. CentOS):

sudo journalctl -u NetworkManager --boot

What you’ll see:

  • Service startup
  • Interface detection
  • DHCP requests
  • DHCP lease assignment
  • Lease renewals

Understanding the Logs

Key events you’ll notice:

  • Interface appears
  • DHCP client starts
  • IP address is assigned
  • Subnet mask, gateway, DNS received
  • Lease renewals every few minutes

Lease Renewal Explained

DHCP leases expire unless renewed.

The client periodically tells the router:

“I’m still here. Please keep my IP.”

This prevents:

  • IP conflicts
  • Stale reservations

Key Takeaway

Regardless of tool:

  • A DHCP client must exist
  • It must talk to a DHCP server
  • Logs tell you why networking works or fails

If:

  • Cable is connected
  • Wi-Fi is up
  • But no IP

➡️ Check DHCP logs first

Ping – The ICMP Diagnostic Tool


What Is Ping?

ping is a Layer-3 diagnostic tool.

It uses ICMP (Internet Control Message Protocol).

What ping does:

  1. Sends ICMP Echo Request
  2. Waits for ICMP Echo Reply
  3. Measures round-trip time

Important Warning About Ping

If ping fails, it does NOT always mean:

  • The host is down

It may mean:

  • ICMP blocked by firewall
  • ICMP disabled on destination
  • ICMP filtered in between

Ping tests reachability, not availability.

Seeing Ping in Wireshark

Filter:

icmp

You will see:

  • Echo request
  • Echo reply
  • Sequence numbers
  • Identifiers

Round-trip time (RTT):

  • Gives latency estimate
  • Wi-Fi fluctuates more than Ethernet

When Ping Is Useful

Ping helps answer:

  • Can I reach the host?
  • Is the network slow?
  • Is there packet loss?

Ping does not:

  • Test application health
  • Guarantee connectivity beyond ICMP

Traceroute – Tracing the Path


What Traceroute Does

Traceroute shows:

  • Each router (hop) between source and destination
  • Latency per hop
  • Where delays appear

Command:

traceroute google.com

Output:

  • Hop number
  • Router IP or hostname
  • Three RTT measurements

Why Three Measurements?

Network latency fluctuates.

Traceroute sends multiple probes to:

  • Detect instability
  • Avoid false conclusions

Interpreting Traceroute Output

Typical observations:

  • First hop = your gateway
  • ISP routers next
  • Backbone routers later
  • Destination last

* * * means:

  • Router didn’t reply
  • ICMP blocked
  • Still forwarding traffic

Long-Distance Latency Example

When tracing overseas destinations:

  • Sudden jump (e.g. 30 ms → 250 ms)
  • Caused by:

    • Physical distance
    • Speed of light
    • Undersea fiber cables

This is why:

  • Global services deploy servers near users

How Traceroute Actually Works (TTL Explained)


The TTL Field

Every IP packet has:

TTL – Time To Live

Purpose:

  • Prevent infinite routing loops

Traceroute Algorithm (Simplified)

  1. Send packet with TTL = 1
  2. First router:
   • Decrements TTL → 0
   • Drops packet
   • Sends ICMP Time Exceeded
  3. Record router IP
  4. Increase TTL to 2
  5. Repeat until destination reached

➡️ Each step discovers one hop

Why ICMP Appears in Captures

When a router replies:

  • It embeds the original IP packet
  • Inside an ICMP message
  • Inside a new IP packet
  • Inside a new Ethernet frame

That’s why Wireshark still matches filters.

Why Traceroute Isn’t a Single Packet

Each hop is:

  • A separate probe
  • Sent independently

Traceroute assumes:

  • Routing path remains stable

Real-World Insight: Traceroute Reveals Topology

Traceroute can show:

  • Multiple routers in your home network
  • ISP router + your own router
  • Hidden subnets

Example:

  • ISP router (mandatory)
  • Personal router behind it
  • Double NAT
  • Multiple internal subnets

Traceroute exposes this.

Final Layer-3 Summary

You now understand:

  • DHCP (automatic configuration)
  • ARP (IP → MAC resolution)
  • Routing tables
  • Default gateways
  • Ping (ICMP reachability)
  • Traceroute (path discovery)
  • TTL mechanics
  • Multi-subnet routing

At this point, you have solid Layer-3 knowledge, exactly what:

  • DevOps engineers
  • Cloud engineers
  • SREs

must understand deeply.

Transport Layer (Layer 4) — Why We Need It


So far, Layer 3 (IP) solved routing across networks.
But IP alone has serious limitations:

Problems at Layer 3

  • Packets can be lost
  • Packets can be dropped (very common)
  • Packets can arrive out of order
  • No retransmission
  • No flow control
  • No congestion control

Routers intentionally drop packets when overloaded — this is normal and expected.

➡️ Layer 4 exists to handle these problems

UDP vs TCP — Two Different Philosophies


UDP — “Send and Forget”

UDP (User Datagram Protocol):

  • No retransmission
  • No ordering
  • No congestion control
  • No connection setup

Why Use UDP?

Because sometimes retransmission is worse than packet loss.

Examples:

  • Video calls
  • Live streaming
  • Online gaming
  • DNS
  • NTP (time sync)

If a video frame arrives late → it’s already useless
➡️ Better to drop it and move on

Applications using UDP usually:

  • Send extra data
  • Use error correction
  • Handle loss themselves

TCP — Reliable Data Stream

TCP (Transmission Control Protocol) provides:

  • Reliable delivery
  • Ordered data
  • Retransmission
  • Flow control
  • Congestion control

Applications see TCP as a continuous stream, not packets.

What TCP Manages for You

  • Lost packets → retransmitted
  • Out-of-order packets → reordered
  • Receiver overload → sender slows down
  • Network congestion → speed reduced automatically

➡️ Applications don’t need to care about packet loss.

TCP Internals (High-Level, Practical View)


Each TCP segment contains:

  • Source port
  • Destination port
  • Sequence number
  • Acknowledgment number
  • Flags (SYN, ACK, FIN, RST)
  • Checksum
  • Payload (data)

Sequence Numbers

Used to:

  • Order packets
  • Detect missing data
  • Acknowledge received bytes

TCP does not count packets — it counts bytes.

TCP Three-Way Handshake (Connection Setup)


Before data transfer, TCP builds a connection:

Step 1 — SYN

Client → Server

  • SYN flag set
  • Initial Sequence Number (ISN)

Step 2 — SYN-ACK

Server → Client

  • SYN + ACK flags
  • Server’s ISN
  • Acknowledges client’s ISN

Step 3 — ACK

Client → Server

  • ACK flag
  • Acknowledges server’s ISN

➡️ Connection is now established

After this:

  • Data can flow both ways
  • Every byte is acknowledged

Seeing the Handshake in Wireshark

When using tools like wget or a browser:

  • You will see:

    • SYN
    • SYN-ACK
    • ACK
  • Followed by normal data packets

This knowledge is critical for:

  • Debugging
  • Firewall troubleshooting
  • Port scanning (next topic)
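
A handy Wireshark display filter for this: matching on the SYN flag shows both the SYN and the SYN-ACK of every handshake:

tcp.flags.syn == 1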

Ports — How Applications Are Identified


Ports are Layer 4 identifiers.

  • Range: 0 – 65535
  • TCP and UDP have separate port spaces

A connection is uniquely identified by:

Source IP + Source Port + Destination IP + Destination Port

Port Categories

1. Well-Known Ports (0–1023)

Listening on these ports requires root privileges (on Linux).

Examples:

  • 80 → HTTP
  • 443 → HTTPS
  • 22 → SSH
  • 21 → FTP
  • 25 → SMTP

2. Registered Ports (1024–49151)

Assigned to common services.

Examples:

  • 3306 → MySQL
  • 5432 → PostgreSQL
  • 5900 → VNC

3. Dynamic / Ephemeral Ports (49152–65535)

Used by clients:

  • Randomly chosen
  • Temporary
  • No special privileges required

Source Port vs Destination Port

Example:

  • Client opens random high port (e.g. 46062)
  • Server listens on well-known port (e.g. 80)

Server response:

  • Source port = 80
  • Destination port = 46062

This allows thousands of simultaneous connections.
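
A small Python sketch makes both halves visible: the OS picks the ephemeral source port, while the destination port is the well-known one we asked for (example.com:80 is an arbitrary reachable server):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("example.com", 80))

# Ephemeral port chosen by the OS, e.g. ('10.0.2.15', 46062)
print("local  (source)     :", sock.getsockname())
# Well-known port we targeted, e.g. ('93.184.216.34', 80)
print("remote (destination):", sock.getpeername())
sock.close()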

Common TCP and UDP Ports (Overview)


Common TCP Ports

  • 80 → HTTP
  • 443 → HTTPS
  • 22 → SSH
  • 21 → FTP
  • 25 → SMTP
  • 110 → POP3
  • 143 → IMAP

Common UDP Ports

  • 53 → DNS
  • 67 / 68 → DHCP
  • 123 → NTP
  • 161 / 162 → SNMP
  • 69 → TFTP
  • 5004 / 5005 → RTP (audio/video)

Why UDP here?

  • Low latency
  • No retransmission delays

Port Scanning — Understanding Nmap


What Is Port Scanning?

Port scanning tries to:

  • Connect to many ports
  • Observe responses
  • Determine which services are reachable

Possible Responses

  • SYN-ACK → Port open
  • RST → Port closed
  • No response → Port filtered (firewall)

Legal & Ethical Warning (Important)

  • Port scanning is a reconnaissance technique
  • Often used by attackers
  • Also used by defenders

⚠️ Only scan systems you own or are authorized to scan

Laws vary by country — never assume legality

Nmap Basics

Install:

sudo apt install nmap
# or
sudo dnf install nmap

Basic scan:

sudo nmap localhost

Scans:

  • Top 1000 TCP ports

Scan Specific Ports

sudo nmap -p 22,80,443 192.168.1.10

Scan All Ports

sudo nmap -p- 192.168.1.10

Scan a Network Range

sudo nmap 192.168.1.1-100

Useful for:

  • Inventory
  • Firewall validation
  • Security hardening

Practical Security Use Case

Port scanning helps you:

  • Detect unnecessary services
  • Close unused ports
  • Reduce attack surface

Example:

  • MySQL open on all interfaces
  • Not needed externally
  • Disable service or firewall it

sudo systemctl stop mysql
sudo systemctl disable mysql

Re-scan:

sudo nmap localhost

➡️ Security improved

Why This Matters for DevOps & Cloud

You now understand:

  • TCP vs UDP tradeoffs
  • Ports & services
  • Connection establishment
  • How attackers discover services
  • How defenders harden systems

This knowledge is mandatory for:

  • Firewalls
  • Kubernetes networking
  • Load balancers
  • Cloud security groups
  • Incident response

Advanced Nmap Scan Types — Why Scan Type Matters


Not all port scans behave the same way.
Scan type directly affects:

  • Speed
  • Detectability
  • Logging on the target
  • Legal and operational risk

This is why an Nmap introduction is incomplete without scan types.

1. TCP SYN Scan (-sS) — Stealth Scan

What It Does

  • Sends SYN packet only
  • Waits for response
  • Does NOT complete the handshake

Responses

| Response | Meaning |
|----------|---------|
| SYN-ACK | Port open |
| RST | Port closed |
| No reply | Port filtered (firewall) |

Why It’s Fast

  • Only one packet per port
  • No full TCP connection
  • Minimal resource usage

Requirements

  • Root privileges (raw sockets)

sudo nmap -sS localhost

Logging Behavior

  • Often not logged
  • No established connection
  • Lower detection probability

➡️ Default and preferred scan if available

2. TCP Connect Scan (-sT) — Full Connection Scan


When Is It Used?

  • When SYN scan is not possible

    • No root access
    • IPv6 scanning
    • Restricted environments

What It Does

  • Performs full TCP handshake

    • SYN → SYN-ACK → ACK
  • Uses OS networking stack

nmap -sT localhost

Downsides

  • Slower (extra packets)
  • Uses OS resources
  • Almost always logged
  • Can trigger alerts or IDS
  • May stress poorly written services

➡️ High visibility scan — use carefully
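
For intuition, here is a minimal connect-scan sketch using Python's standard library. It does essentially what -sT does: one full handshake attempt per port through the OS socket API. The target and port list are placeholders; scan only hosts you are authorized to scan.

import socket

target = "127.0.0.1"
for port in (22, 80, 443, 2222):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(1.0)
    # connect_ex() returns 0 on success (handshake completed -> port open)
    state = "open" if s.connect_ex((target, port)) == 0 else "closed/filtered"
    print(f"{target}:{port} -> {state}")
    s.close()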

3. UDP Scan (-sU) — Slow but Necessary


Why UDP Is Hard to Scan

  • No handshake
  • No ACKs
  • Packet loss is normal

How Nmap Interprets Responses

| Response | Meaning |
|----------|---------|
| UDP reply | Port open |
| ICMP error | Port closed |
| No reply | Open or filtered |

sudo nmap -sU localhost

Important Notes

  • Extremely slow
  • Requires retries
  • Often inconclusive

But still critical because:

  • DNS
  • DHCP
  • NTP
  • SNMP
  • RTP

➡️ TCP scans alone are not enough

Why Nmap Matters for Firewalls

Nmap answers:

  • What services are exposed?
  • Which ports must be blocked?
  • Did my firewall work?

Security hardening workflow:

  1. Scan
  2. Identify unnecessary services
  3. Stop or firewall them
  4. Re-scan to verify

Network Address Translation (NAT)


Why NAT Exists

  • IPv4 address shortage
  • Many internal devices → one public IP

Internal IPs (Private)

  • 192.168.0.0/16
  • 10.0.0.0/8
  • 172.16.0.0/12

Not routable on the Internet.

How NAT Works (Outbound)

  1. Internal device sends a packet
  2. Router:
    • Rewrites source IP
    • Often rewrites source port
  3. Router remembers the mapping
  4. Reply arrives
  5. Router reverses the translation

➡️ Router maintains a NAT table
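
As a toy model (not real router code), the NAT table boils down to a mapping from the rewritten public source port back to the internal host and port:

# Toy NAT table: public source port -> (internal IP, internal port)
nat_table = {}

def outbound(src_ip, src_port, public_port):
    # Router rewrites the source and remembers the mapping
    nat_table[public_port] = (src_ip, src_port)

def inbound(public_port):
    # Reply arrives: reverse the translation, or drop if unknown
    return nat_table.get(public_port, "drop (no mapping)")

outbound("192.168.1.50", 51000, 40001)
print(inbound(40001))   # ('192.168.1.50', 51000) -> forwarded back inside
print(inbound(40002))   # drop (no mapping) -> why unsolicited inbound fails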

Why Inbound Traffic Fails by Default

Incoming traffic:

  • Router has no idea where to send it
  • Packet is dropped

➡️ NAT works outbound only

Port Forwarding — Allowing Inbound Access


Example:

  • External: <public IP>:80
  • Internal: 192.168.1.50:8080

Router rewrites:

  • Destination IP
  • Destination port

DHCP Reservation — Critical Step

Problem:

  • Internal IPs change

Solution:

  • Bind MAC → IP
  • Prevent forwarding breakage

Always reserve IPs for:

  • Servers
  • NAS
  • Home labs

Dynamic Public IP Problem

Most ISPs:

  • Assign dynamic IPs
  • Change periodically

Solution:

  • Dynamic DNS (DDNS)

Router updates DNS record automatically:

myhome.ddns-provider.com → current public IP

⚠️ Not production-grade

  • DNS propagation delays
  • ISP NAT (CGNAT) may block inbound access entirely

OSI Layer 5 — Session Layer


Purpose

  • Establish
  • Maintain
  • Terminate sessions

Adds:

  • State
  • Authentication
  • Session tracking

Examples

  • Network File Systems
  • Remote Procedure Calls (RPC)
  • Session-aware protocols

Modern reality:

  • Often implemented inside applications
  • Layers 5–7 are frequently merged

OSI Layer 6 — Presentation Layer


Responsibilities

  • Data format
  • Encoding
  • Encryption
  • Compression

Common Functions

Encoding

  • ASCII
  • UTF-8 / Unicode

Encryption

  • SSL / TLS
  • HTTPS

Compression

  • gzip
  • deflate
  • brotli

MIME — Real Example

Emails require:

  • Character encoding
  • Attachments
  • HTML + plain text
  • Metadata

MIME defines:

  • How data is represented
  • Not how it’s transported

➡️ Transport (SMTP) ≠ Representation (MIME)

Modern Reality of OSI Layers

Important truth:

  • OSI is a conceptual model
  • Real protocols blur boundaries

Example:

  • HTTP/3 (QUIC)

    • Uses UDP
    • Implements encryption
    • Handles congestion control
    • Manages sessions internally

➡️ One protocol can span multiple OSI layers

Final Takeaways

You now understand:

  • Why scan type matters in Nmap
  • SYN vs Connect vs UDP scans
  • NAT behavior and limitations
  • Port forwarding & DHCP reservations
  • Why inbound traffic fails by default
  • How higher OSI layers overlap in reality

This knowledge is essential for:

  • Firewall configuration
  • Cloud networking
  • Kubernetes ingress
  • Security audits
  • Incident response

OSI Layer 7 — Application Layer


The application layer (Layer 7) consists of protocols used by applications, not the applications themselves.

Important distinction

  • Firefox / Chrome / Outlook → applications (software)
  • HTTP, HTTPS, IMAP, SSH → application-layer protocols

Example:

  • Firefox uses HTTPS
  • Thunderbird uses IMAP
  • Terminal uses SSH

According to the OSI model, the protocol is Layer 7, not the program you click.

Common Layer 7 Protocols

| Protocol | Purpose |
|----------|---------|
| HTTP / HTTPS | Web access |
| IMAP | Access emails on server |
| POP3 | Download emails |
| SMTP | Send emails |
| SSH | Remote shell, file transfer |
| FTP / SFTP | File transfer |
| DNS | Name resolution |
| Custom APIs | REST, gRPC, proprietary |

Layer 7 protocols depend on all lower layers:

  • Reliable transport (TCP/UDP)
  • Routing (IP)
  • Switching (Ethernet)
  • Physical transmission (bits)

DNS — Domain Name System (Layer 7)


DNS is an application-layer protocol that converts human-readable names into IP addresses.

Example:

google.com → 142.250.x.x

Why DNS Exists

Humans remember names better than IP addresses.
Computers require IP addresses to communicate.

DNS bridges that gap.

DNS Resolution Flow (Step by Step)

1. Browser Cache

The browser checks:

  • “Have I resolved this domain recently?”

If yes → use cached IP.

2. Operating System Cache

If browser cache misses:

  • Browser asks the OS
  • OS checks its DNS cache

If found → return IP.

3. DNS Resolver (ISP or Custom)

If OS cache misses:

  • OS queries a DNS resolver
  • Usually provided by ISP or configured manually (e.g. 8.8.8.8)

Resolvers also cache results heavily.

Full Recursive Resolution (If Not Cached)


4. Root Name Servers

  • Resolver queries one of 13 root servers (A–M)
  • Root servers do not know google.com
  • They know who manages .com

Response:

Ask the .com TLD servers

5. TLD Name Servers (.com)

  • Resolver queries .com TLD servers
  • TLD servers respond:
Ask Google’s authoritative name servers

6. Authoritative Name Servers

  • Resolver queries ns1.google.com
  • Gets final DNS records:
google.com → IP addresses

7. Response Propagation

  • Resolver → OS
  • OS → Browser
  • Browser connects to IP

✅ DNS resolution complete.
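
From a program's point of view, this whole hierarchy hides behind one resolver call. A quick sketch with Python's standard library (google.com and port 443 are arbitrary choices):

import socket

# getaddrinfo() asks the OS resolver, which walks the chain above:
# caches, /etc/hosts, then the configured DNS servers
for family, _, _, _, sockaddr in socket.getaddrinfo(
        "google.com", 443, proto=socket.IPPROTO_TCP):
    print(sockaddr[0])   # an A (IPv4) or AAAA (IPv6) address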

Common DNS Record Types


| Record | Purpose |
|--------|---------|
| A | Domain → IPv4 |
| AAAA | Domain → IPv6 |
| CNAME | Alias to another domain |
| MX | Mail servers |
| NS | Authoritative name servers |
| TXT | Verification, SPF, DKIM, metadata |

Viewing DNS Records

host -a google.com

Example output:

  • IPv4 + IPv6 addresses
  • Name servers
  • MX records
  • TXT verification records

Why IPs Change?

  • Load balancing
  • High availability
  • Different users → different servers

Manual DNS Resolution (Bonus — Deep Understanding)


Using dig shows exactly what DNS is doing.

Step 1 — Query Root Server

dig @a.root-servers.net com NS

Returns:

  • List of .com TLD servers

Step 2 — Query TLD Server

dig @a.gtld-servers.net google.com NS

Returns:

  • Google’s authoritative name servers

Step 3 — Query Authoritative Server

dig @ns1.google.com google.com A

Returns:

  • Final IP addresses

This demonstrates DNS hierarchy and delegation clearly.

DNS Security Problems

DNS was designed before modern security threats.

Common Risks

  • DNS spoofing
  • Cache poisoning
  • Man-in-the-middle
  • ISP or government manipulation

Why HTTPS Matters


Even if DNS is spoofed:

  • HTTPS verifies server identity
  • Invalid certificate → browser warning

This mitigates DNS attacks, though it does not fix DNS itself.

DNSSEC (Mention Only)

  • Cryptographic DNS signatures
  • Protects integrity of DNS data
  • Complex, not universally deployed

/etc/hosts — Manual DNS Override


File:

/etc/hosts

Format:

IP_ADDRESS   hostname

Example:

127.0.0.1   myproject.local

Why Use /etc/hosts

  • Local development
  • Testing
  • Offline resolution
  • Temporary overrides

⚠️ Overrides DNS entirely

Example

sudo nano /etc/hosts

Add:

127.0.0.1   myproject.local

Test:

ping myproject.local

Overriding Public Domains (Not Recommended)

127.0.0.1 google.com

Result:

  • Browser connects to localhost
  • HTTPS fails (certificate mismatch)

Useful only for testing or demos.

DNS Cache Issues & Fixes

Sometimes /etc/hosts changes don’t apply immediately.

Reason:

  • Local DNS caching

Identify Local DNS Resolver

sudo lsof -i :53

Common:

  • systemd-resolved
  • dnsmasq

Flush DNS Cache (systemd)

sudo resolvectl flush-caches

Verify:

resolvectl statistics

Restart dnsmasq (if used)

sudo systemctl restart dnsmasq

Final Takeaways

You now understand:

  • OSI Layer 7 vs real applications
  • DNS resolution hierarchy
  • DNS record types
  • Manual DNS resolution with dig
  • DNS security weaknesses
  • HTTPS mitigation
  • /etc/hosts overrides
  • DNS cache flushing

This knowledge is critical for:

  • DevOps debugging
  • Kubernetes services
  • Load balancers
  • Cloud networking
  • Incident response

Hostnames in a Local Network


A hostname is a human-readable name assigned to a computer on a network.

Why hostnames exist

  • Easier identification of devices (e.g. ubuntu, raspberrypi)
  • Used during DHCP negotiation
  • Displayed on routers (device lists)
  • Allows hostname-based access inside local networks

Viewing the Hostname (Linux)

hostname

Example output:

ubuntu

Some shells show it automatically in the prompt, but this is configurable.

Using Hostnames in a Local Network

From another machine:

ping ubuntu
ping ubuntu.local

If hostname resolution is configured correctly, the hostname resolves to an IP.

Changing the Hostname (Linux)

Step 1 — Edit /etc/hostname

sudo nano /etc/hostname

Example:

vm-ubuntu

Step 2 — Reboot (required)

sudo reboot

After reboot:

hostname

Output:

vm-ubuntu

Why /etc/hosts Must Also Be Updated


The hostname should resolve locally to the loopback interface.

Edit:

sudo nano /etc/hosts

Correct example:

127.0.1.1   vm-ubuntu
127.0.0.1   localhost

Why 127.0.1.1?

  • Used by Debian/Ubuntu for hostname binding
  • Avoids conflicts with localhost
  • Still loopback (local machine)

Best practice: always update /etc/hosts after hostname change

.local Hostnames and mDNS


What is .local?

.local is a reserved domain for multicast DNS (mDNS).

It ensures:

  • No internet DNS lookup
  • Local-network only resolution
  • Future-proof against new public TLDs

Why .local Is Required

❌ Bad:

server.london

✔ Good:

server.london.local

.local guarantees the name never escapes your LAN.

How mDNS Works (Conceptually)

  1. Device sends a multicast query:
   Who is raspberrypi.local?
  2. All devices receive it
  3. The correct host replies:
   I am raspberrypi.local → 192.168.1.29

No central DNS server required.

mDNS Implementations

| OS | Implementation |
|----|----------------|
| macOS | Bonjour / Zeroconf |
| Linux | Avahi |
| Windows | Partial support |
| Routers | Often integrated |

Linux Requirements (Important)

On some distributions (e.g. CentOS):

sudo dnf install nss-mdns
sudo reboot

Without this:

  • Others can resolve your host
  • Your system cannot resolve others

Capturing mDNS Traffic (Wireshark)


Filter:

mdns

You will see:

  • Multicast IPv6 packets
  • Query + response messages
  • Host announcing its IP

Best Practice for Local Networking

✔ Always use:

hostname.local

✔ Or use static IPs if stability is critical

✔ Avoid bare hostnames without .local

HTTP — How the Web Actually Works


HTTP basics

  • Runs on TCP
  • Text-based protocol
  • Request → Response model

Inspecting HTTP in Browser

  1. Right-click → Inspect
  2. Open Network tab
  3. Reload page
  4. Click request → Headers

Example request:

GET / HTTP/1.1
Host: www.google.com
User-Agent: Firefox
Accept: text/html

HTTP Response

HTTP/1.1 200 OK
Content-Type: text/html
Content-Encoding: br

Then:

  • HTML
  • CSS
  • JS
  • Images (separate requests)

Manual HTTP Using Telnet


Open TCP connection

telnet www.google.com 80

Send HTTP request

GET / HTTP/1.1
Host: www.google.com

(blank line required)

Result

  • Server replies with headers + HTML
  • Pure text over TCP
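
The same experiment scripted in Python (a sketch equivalent to the telnet session above) makes the point explicit:

import socket

sock = socket.create_connection(("www.google.com", 80))
# The request is literally text; the blank line (\r\n\r\n) ends the headers
sock.sendall(b"GET / HTTP/1.1\r\nHost: www.google.com\r\nConnection: close\r\n\r\n")
response = b""
while chunk := sock.recv(4096):
    response += chunk
sock.close()
print(response[:200].decode(errors="replace"))   # status line + first headers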

Why This Matters for DevOps

  • Test server behavior
  • Debug load balancers
  • Validate HTTP compliance
  • Fuzz malformed requests safely

Example malformed request:

HELLO WORLD HTTP/9.9

Expected result:

400 Bad Request

A good server never crashes.

IPv4 vs IPv6 (Practical View)


IPv4

  • 32-bit
  • ~4.3 billion addresses
  • NAT required
  • Still dominant

Example:

192.168.1.10

IPv6

  • 128-bit
  • 3.4 × 10³⁸ addresses
  • No NAT required
  • Hierarchical routing
  • Better scalability

Example:

2001:db8::1

IPv6 Address Shortening

Full:

2001:0db8:0000:0000:0000:0000:0000:0001

Shortened:

2001:db8::1

Rules:

  • Remove leading zeros
  • :: only once
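
Python's standard ipaddress module applies these rules for you, which makes a handy sanity check:

import ipaddress

full = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")
print(full.compressed)   # 2001:db8::1  (leading zeros dropped, one ::)
print(full.exploded)     # full form restored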

Why IPv6 Is Better

✔ No NAT
✔ Easier routing
✔ Every device gets public IP
✔ Firewalls replace NAT for security

Why IPv4 Still Matters

  • Many ISPs still IPv4-only
  • Legacy systems
  • IPv6 transition is slow

Dual Stack Is the Correct Strategy

✔ IPv4 + IPv6 enabled
✔ Servers reachable on IPv4
✔ IPv6 preferred internally
✔ Fallback always works

DevOps Recommendation

| Scenario | Recommendation |
|----------|----------------|
| Internal networks | Dual stack |
| Public servers | IPv4 mandatory |
| Future-proofing | Add IPv6 |
| Debugging | Test both stacks |

Wireshark IPv6 View

You’ll see:

  • Longer IP headers
  • ICMPv6
  • mDNS heavily uses IPv6

IPv6 ≠ exotic — it’s already active.

Key Takeaways

✔ Hostnames simplify local networking
✔ Always update /etc/hosts
✔ Use .local for LAN resolution
✔ mDNS uses multicast (not DNS)
✔ HTTP is plain text over TCP
✔ Telnet is a powerful debug tool
✔ IPv6 removes NAT limitations
✔ IPv4 still required today

SSH — Secure Shell (Concepts & Real Usage)


What is SSH?

SSH (Secure Shell) is a cryptographic network protocol used to securely access and manage remote systems over a network.

SSH provides:

  • Confidentiality (encryption)
  • Integrity (tamper detection)
  • Authentication (verifying identities)

SSH is one of the most important tools for Linux, DevOps, Cloud, and Security engineers.

Common SSH Use Cases

  1. Remote shell access
    • Execute commands on a remote server
    • Administer systems without physical access
  2. Secure file transfer
    • scp (Secure Copy)
    • sftp (SSH File Transfer Protocol)
  3. Tunneling / Port forwarding
    • Securely forward other protocols through SSH

In this course, we focus on shell access, file transfer, and security basics.

SSH Architecture


SSH always consists of two components:

1. SSH Server (sshd)

  • Runs on the remote machine
  • Listens for incoming connections
  • Usually installed on servers

2. SSH Client (ssh)

  • Runs on your local machine
  • Used to connect to the SSH server
  • Preinstalled on Linux, macOS, Windows

Real-World Context

  • Cloud servers do not have monitors
  • You never “log in physically”
  • SSH is the primary control channel

Everything you practice here applies directly to:

  • AWS EC2
  • Azure VMs
  • Google Cloud
  • Data center servers
  • Raspberry Pi devices

Network Setup Options for SSH Practice

You need two machines that can reach each other.

Method 1 — Host → Virtual Machine (Recommended)


  • VM uses Bridged Adapter
  • VM becomes a real device on your LAN
  • Host connects directly to VM via SSH

Pros

  • Simple
  • Realistic
  • Easy debugging

Cons

  • May be blocked on corporate networks

Method 2 — VM → VM (Always Works)


  • Two VMs inside a NAT Network
  • VMs can reach each other
  • No dependency on host or corporate LAN rules

Pros

  • Works everywhere
  • Fully isolated

Cons

  • Slightly less realistic than bridged mode

VirtualBox NAT Network Setup (Reliable)

Key steps:

  1. Power off VM
  2. Clone VM
  3. Generate new MAC addresses
  4. Create NAT Network
  5. Attach both VMs to that network
  6. Boot both machines
  7. Verify connectivity

Verify IP addresses

ip addr show

Verify connectivity

ping <other_vm_ip>
ping ubuntu.local

If ping works → SSH will work.

Bridged Networking (Host → VM)


What Bridged Mode Does

  • VM shares physical NIC (Ethernet/Wi-Fi)
  • VM gets real IP from your router
  • Appears as a separate device on LAN

After enabling bridged mode

ip addr show

You should see:

192.168.x.x

Test from host

ping 192.168.x.x
ping ubuntu.local

Installing SSH Server (Ubuntu)


On the machine you want to control:

sudo apt update
sudo apt install openssh-server

Verify service:

systemctl status ssh

SSH server starts automatically.

Connecting with SSH

Basic Syntax

ssh username@host

Example (hypothetical user and host):

ssh john@192.168.1.20

If the username is omitted:

ssh host

SSH uses your local username by default.

First Connection Warning (Fingerprint)

You may see:

The authenticity of host cannot be established.

This is normal.

Type:

yes

This stores the server’s host key fingerprint.

We will cover this security mechanism in detail later.

Successful SSH Session

Once connected:

  • Your terminal controls the remote machine
  • Commands behave exactly like local shell
  • exit closes the connection

SSH Security: Essential Practices


SSH is encrypted, but exposure still matters.

1. Use Strong Passwords

  • Long
  • Unique
  • Mixed characters
  • Avoid dictionary words

Bad:

sanfrancisco

Good:

A9$eP7!xQm

2. Protect Active Sessions

  • Never leave SSH sessions unattended
  • Lock screen or disconnect
  • Anyone with your open terminal has server access

3. Change Default SSH Port

Why?

  • Port 22 is scanned constantly
  • Automated bots attempt brute force logins
  • Log files become noisy
  • Changing ports reduces noise (not absolute security)

Change SSH Port (Ubuntu)

Edit config:

sudo nano /etc/ssh/sshd_config

Change:

Port 22

To:

Port 2222

Save file.

Validate & Restart SSH

Always validate before restart:

sudo sshd -t

If no output → config is valid.

Restart service:

sudo systemctl restart ssh

Existing sessions remain active.

Connect Using New Port

ssh -p 2222 user@host

Example:

ssh -p 2222 john@192.168.1.20

Important Port Warning

Some networks block uncommon ports:

  • Coffee shops
  • Corporate Wi-Fi
  • Public hotspots

Solutions

  • Choose another port
  • Use VPN
  • Use SSH over port 443 if required

SSH Logs & Monitoring


Ubuntu / Debian

/var/log/auth.log

CentOS / RHEL

/var/log/secure

View SSH activity:

grep sshd /var/log/auth.log

You will see:

  • Successful logins
  • Failed password attempts
  • Source IPs
  • Target usernames

Changing the SSH port makes real attacks visible, not buried in noise.

Why This Matters in Production

SSH is:

  • Your primary control channel
  • Your highest-risk exposed service
  • The first target of attackers

Understanding SSH deeply is non-negotiable for:

  • DevOps
  • Cloud Engineers
  • SRE
  • Security Engineers

Restrict SSH Access to Specific Users


By default:

All local users with passwords can SSH

This is not ideal.

Allow Only Specific Users

In sshd_config:

AllowUsers yannis

Multiple users:

AllowUsers yannis deploy admin

Validate & restart:

sudo sshd -t
sudo systemctl restart sshd

⚠️ Lockout Warning (Very Important)

If you mistype the username:

  • SSH will reject everyone
  • If this is a remote server → you are locked out

How to Avoid Locking Yourself Out (CRITICAL)


Golden Rule

Always keep one SSH session open.

Why?

  • SSH sessions are independent processes
  • Existing sessions survive SSH restarts

Safe Workflow

  1. Open Terminal A
  2. Connect via SSH
  3. Make SSH changes
  4. Test from Terminal B
  5. If broken → fix using Terminal A
  6. Only close Terminal A when confirmed

Example: SSH stopped

sudo systemctl stop sshd

  • Existing session → still alive
  • New connections → rejected

You can fix it:

sudo systemctl start sshd

This saves you from a rescue-mode recovery.

SSH Key Authentication (Passwordless & Secure)


Passwords are:

  • Guessable
  • Brute-forceable
  • Inconvenient for automation

SSH keys solve all of this.

How SSH Keys Work (Concept)

  • Private key → stays on your machine
  • Public key → copied to server
  • Server verifies identity without passwords
  • Private key is never transmitted

Generate SSH Key (Client Machine)

ssh-keygen -t rsa -b 4096

Press Enter for defaults.

Files created:

~/.ssh/id_rsa       (PRIVATE – never share)
~/.ssh/id_rsa.pub   (PUBLIC – safe to share)

Copy Public Key to Server

ssh-copy-id -i ~/.ssh/id_rsa.pub -p 2222 user@server

Enter your password once.

Login Without Password

ssh -p 2222 user@server

✔ No password
✔ Secure
✔ Perfect for automation

Server-Side Key Storage

Location:

~/.ssh/authorized_keys

Permissions:

~/.ssh            → 700
authorized_keys   → 600

Each line = one allowed public key
Comments help identify owners.

Why SSH Keys Are Essential

  • Impossible to brute-force
  • Required for CI/CD
  • Required for automation
  • Required for production security

SSH keys are not optional in real environments.

Disable Password Authentication for SSH (Key-Only Login)


Why Disable Password Authentication?

Now that public/private key authentication is configured, allowing password login is unnecessary and risky.

Security Benefits

  1. Massive attack-surface reduction
    • Passwords can be brute-forced
    • SSH keys are hundreds of characters long, with far more entropy than any password
    • Practically impossible to guess
  2. Two-layer security
    • SSH key → login
    • Password → sudo (privilege escalation)
  3. Even if SSH access is compromised
    • Attacker still needs the user password
    • Root login is already disabled
    • Privilege escalation is blocked

How Authentication Works After This Change

Login

  • Uses private key
  • No password accepted

System changes

sudo <command>

  • Still requires the user password

This means:

SSH key ≠ root access

Verify Key-Based Login Works (Before Disabling Passwords)

From your client:

ssh -p 2222 user@server

You should log in without a password prompt.

⚠️ If this does not work, STOP. Do not continue.

Disable Password Authentication

Edit SSH server configuration:

sudo nano /etc/ssh/sshd_config

Set:

PasswordAuthentication no

(Optional but recommended)

PermitEmptyPasswords no

Validate & Apply Configuration

sudo sshd -t

No output = configuration is valid

Restart SSH:

sudo systemctl restart sshd

Test Enforcement (Important)

Switch to a user without SSH keys (or another local user):

ssh -p 2222 user@server

Expected result:

Permission denied (publickey).

✔ Password login is now fully disabled
✔ Only authorized SSH keys can log in

Critical Warnings (Very Important)

1. Other Users Will Be Locked Out

If teammates still use passwords:

  • They must add SSH keys
  • Otherwise access is lost

2. Losing Your Private Key = Lost Access

If your laptop is:

  • Lost
  • Damaged
  • Encrypted drive wiped

You cannot log in.

Best Practice

  • Create at least two SSH keys
  • Store on different devices
  • Add both public keys to authorized_keys

3. If Private Key Is Leaked

If someone gets your private key:

  • ALL servers using that key are compromised
  • You must:
  1. Remove public key from all servers
  2. Generate a new key pair
  3. Re-deploy keys everywhere

Prevent SSH Connection Drops (Keep-Alive)


The Problem

SSH connections may drop if:

  • No activity for a long time
  • NAT, firewall, or router times out
  • You take a break (lunch, meeting, coffee)

This is annoying and dangerous:

  • Lost working directory
  • Lost environment state
  • Possible lockout during SSH changes

The Solution: Keep-Alive Packets

SSH can send empty packets periodically to keep the connection alive.

Best practice:

Configure this on the client, not the server.

Configure SSH Keep-Alive (Client Side)

Edit user SSH config:

nano ~/.ssh/config

Add:

Host *
    ServerAliveInterval 60
    ServerAliveCountMax 3

Meaning

  • Every 60 seconds → send keep-alive packet
  • Allow 3 missed responses
  • Prevents idle disconnects

Secure the Config File

chmod 600 ~/.ssh/config

Result

  • SSH sessions stay alive for hours
  • No random disconnects
  • Safe during breaks
  • Extremely useful during server maintenance

As long as:

  • Internet does not drop completely
  • Laptop stays powered

Your SSH session remains active.

Why This Matters in Production

These features prevent:

  • Locking yourself out
  • Losing work mid-operation
  • SSH disconnects during critical changes

This is mandatory knowledge for:

  • DevOps Engineers
  • Cloud Engineers
  • SREs
  • Linux Administrators

SSH Fingerprints: Why They Are Critical for Security


What Is an SSH Fingerprint?

  • Every SSH server generates host keys when sshd is installed
  • A fingerprint is a cryptographic hash of that host key
  • It uniquely identifies that exact server

When you connect for the first time, SSH asks:

“Are you sure you want to continue connecting?”

Once accepted, the fingerprint is saved locally.

Where Fingerprints Are Stored (Client Side)

~/.ssh/known_hosts

This file maps:

hostname → fingerprint

From that moment on:

  • SSH expects the fingerprint to remain the same
  • Any change triggers a security warning

Why Fingerprint Warnings Must NEVER Be Ignored

Fingerprint Change = Red Flag

If SSH says:

WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!

Possible causes:

  1. DNS now resolves to a different server
  2. Server was reinstalled
  3. Man-in-the-middle attack

What Is a Man-in-the-Middle (MITM) Attack?


Instead of connecting directly:

You → Attacker → Real Server

The attacker:

  • Creates their own SSH host key
  • Forwards traffic to the real server
  • Can read passwords or commands

SSH fingerprints are what detect this attack.

Why Encryption Alone Is Not Enough

SSH traffic is encrypted, but:

  • Encryption only protects the transport
  • If the endpoint is wrong, encryption does not help

Fingerprint verification proves the server identity.

How to Manually Verify an SSH Fingerprint (Best Practice)

Step 1: Get the fingerprint from the server (trusted access)

On the server:

ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub

(Or RSA if used)

Step 2: Compare with client warning

SSH shows:

SHA256:QOG...ELkww

Both must match exactly.

Only then should you type:

yes

Important Reality Check

If SSH is your only access:

  • First connection always requires one trust decision
  • Best practice: trust once, verify manually, then never ignore warnings again

After first trust:

  • Any future warning = stop and investigate

SFTP: Secure File Transfer Over SSH


What Is SFTP?

  • SSH File Transfer Protocol
  • Built on top of SSH
  • Fully encrypted
  • Uses same authentication (password or SSH key)

SSH includes:

  • Shell access
  • SFTP access

(Some servers allow SFTP only, no shell)

GUI Access via SFTP (Linux)

In file manager:

sftp://user@hostname

  • Supports SSH keys automatically
  • Shows fingerprint warning on first connection
  • Permissions enforced by Linux users

CLI File Transfer Using SCP

Copy file from server → local

scp user@server:/home/user/file.txt .

Copy directory recursively

scp -r user@server:/home/user/folder .

Copy local → server

scp file.txt user@server:/home/user/

Specify SSH port

scp -P 2222 file.txt user@server:/home/user/

⚠️ SCP uses uppercase -P
⚠️ SSH uses lowercase -p

Using Cyberduck (Mac & Windows)


Cyberduck supports:

  • SFTP
  • SSH key authentication
  • Drag & drop
  • Permission management

Steps:

  1. Select SFTP
  2. Enter host, port, username
  3. Choose SSH private key (recommended)
  4. Verify fingerprint
  5. Connect

screen: Shared & Persistent Terminal Sessions


What Is screen?

  • Terminal multiplexer
  • Creates a virtual terminal
  • Multiple users can attach to the same session
  • Session survives SSH disconnects

Why DevOps Engineers Use screen

  • Collaborative debugging
  • Long-running processes
  • Server maintenance
  • Pair troubleshooting over SSH

Install screen

# Ubuntu
sudo apt install screen

# CentOS
sudo dnf install screen

Basic screen Workflow

Start a session

screen

Detach (leave it running)

Ctrl + A, then Ctrl + D

List sessions

screen -ls

Reattach

screen -x <session-id>

Sharing a Terminal with a Colleague

  1. You start screen
  2. Colleague SSHs into same server
  3. Colleague runs:
screen -x

Now:

  • Both see the same terminal
  • Both can type
  • Ideal for live collaboration

Exit vs Detach (Important)

| Action | Effect |
|--------|--------|
| exit | Terminates the session |
| Ctrl+A Ctrl+D | Detaches safely |

To fully stop screen:

exit
exit

(Exit shell → exit screen)

Why screen Belongs in Your Toolbox

  • No external software
  • Works over SSH
  • Extremely reliable
  • Used in real production environments

Summary

SSH Security

  • Fingerprints protect against MITM
  • Never ignore fingerprint warnings
  • Verify once, trust forever

File Transfer

  • SFTP = secure, encrypted, simple
  • SCP for CLI automation
  • GUI tools supported

Collaboration

  • screen enables shared terminals
  • Safe, fast, SSH-native

This completes a professional, real-world SSH workflow used daily by DevOps engineers.

My Data Science Journey: Restaurant Tips Analysis

2025-12-28 16:04:57

Project: Exploratory Data Analysis on Restaurant Tips Dataset
Duration: Full EDA Process
Dataset: 243 restaurant transactions, 7 variables
Status: ✅ COMPLETED

📊 PROJECT OVERVIEW

Dataset Information

  • Source: Restaurant tips dataset
  • Initial Size: 244 rows × 7 columns
  • Final Size: 243 rows × 7 columns (after cleaning)
  • Variables:
    • Numerical: total_bill, tip, size
    • Categorical: sex, smoker, day, time

Project Goal

Understand what factors influence tipping behavior in restaurants through comprehensive exploratory data analysis.

🧹 PHASE 1: DATA CLEANING (Investigation 1.3)

1.1 Missing Values Investigation

Hypothesis: "The Null Hypothesis" - Why might data be missing?

What I Did:

# Checked for missing values
data.isnull().sum()
data.isnull().any()
(data.isnull().sum() / len(data)) * 100  # Percentage

Results:

  • 0 missing values in all columns
  • This indicated excellent data collection quality
  • No imputation or removal needed

Learning Moment: Not all datasets have missing data, but always check!

1.2 Duplicate Detection

Hypothesis: Could identical transactions exist legitimately?

What I Did:

# Found duplicates
num_duplicates = data.duplicated().sum()
duplicates = data[data.duplicated(keep=False)]

# Removed them
data_clean = data.drop_duplicates()

Results:

  • Found 1 duplicate row
    • Bill: $13.00, Tip: $2.00, Female, Smoker, Thursday, Lunch, Party of 2
    • Row 198 and Row 202 were IDENTICAL
  • Decision: Removed as likely data entry error
  • Result: 244 rows → 243 rows

Key Insight: Identical transactions on same day/time are statistically improbable - likely errors.

1.3 Outlier Investigation

Hypothesis: "The Outlier Tribunal" - Are extreme values errors or legitimate?

What I Did:

# Created boxplots
plt.boxplot(data['tip'])
plt.boxplot(data['total_bill'])

# Calculated IQR boundaries
Q1 = data['tip'].quantile(0.25)
Q3 = data['tip'].quantile(0.75)
IQR = Q3 - Q1
upper_boundary = Q3 + (1.5 * IQR)

# Found outliers
outliers = data[data['tip'] > upper_boundary]

Mathematical Formula:

IQR = Q3 - Q1
Upper Boundary = Q3 + (1.5 × IQR)
Lower Boundary = Q1 - (1.5 × IQR)

For Tips:
Q1 = $2.00
Q3 = $3.56
IQR = $1.56
Upper Boundary = $5.90

Outliers Found:
| Bill | Tip | Tip % | Verdict |
|---------|--------|-------|-----------------|
| $50.81 | $10.00 | 19.7% | ✅ Legitimate |
| $48.33 | $9.00 | 18.6% | ✅ Legitimate |
| $39.42 | $7.58 | 19.2% | ✅ Legitimate |
| $48.27 | $6.73 | 13.9% | ✅ Legitimate |

Decision: Kept all outliers - they represent large parties with reasonable tip percentages

Key Insight: Outliers aren't always errors! Verify with context (tip percentage in this case).

🔬 PHASE 2: BIVARIATE ANALYSIS (Investigation 2.2)

Overview: Testing 7 Relationships

For each relationship, I followed the scientific method:

  1. Hypothesis - Make a prediction
  2. Visualization - Create appropriate chart
  3. Analysis - Interpret the pattern
  4. Conclusion - Accept or reject hypothesis

2.1 Relationship #1: Total Bill → Tip

My Hypothesis:

  • "As total_bill increases, tip will increase WEAKLY"
  • Reasoning: "Tip is 'keep the change' - not percentage based"
  • Confidence: MEDIUM
  • Expected: Weak/no relationship

What I Did:

plt.scatter(data_clean['total_bill'], data_clean['tip'])
plt.xlabel('Total Bill ($)')
plt.ylabel('Tip ($)')
plt.title('Relationship Between Total Bill and Tip Amount')
plt.show()

Results:

  • Pattern: Strong upward linear trend
  • Correlation: r = 0.67 (Strong positive)
  • Points tightly clustered around imaginary line

Hypothesis Verdict: ❌ REJECTED

What I Learned:

  • My hypothesis was WRONG - and that's okay!
  • Reality: People tip 15-20% of bill (percentage-based, not "keep change")
  • Mechanism: Bill × 15-20% = Tip (mathematical relationship)
  • Key insight: "Learning happens with mistakes" - being wrong is part of science!

Business Insight: Higher bills = higher tips. Restaurants should encourage higher spending.

2.2 Relationship #2: Party Size → Total Bill

My Hypothesis:

  • "As party size increases, total_bill increases STRONGLY"
  • Reasoning: "More people = more food (obvious!)"
  • Confidence: HIGH

What I Did:

plt.scatter(data_clean['size'], data_clean['total_bill'])

Results:

  • Pattern: Grouped upward trend (vertical columns)
  • Correlation: r = 0.60 (Medium-strong positive)
  • Party size = discrete (1,2,3,4,5,6), not continuous
  • Size 2 most common, with widest bill range ($10-$40)

Hypothesis Verdict: ✅ CONFIRMED (but weaker than expected)

Key Insight:

  • "Party size predicts bill, but doesn't determine it completely"
  • A couple can outspend a group of 4 depending on what they order
  • What people ORDER matters more than HOW MANY people

2.3 Relationship #3: Party Size → Tip

My Hypothesis:

  • "As party size increases, tip increases STRONGLY"
  • Reasoning: "More people → bigger bill → percentage-based tip → more tip"
  • Confidence: HIGH

Results:

  • Pattern: Upward trend from size 1-4, then FLATTENS at 5-6
  • Correlation: r = 0.49 (Medium-weak)
  • Non-linear relationship!

Hypothesis Verdict: ⚠️ PARTIALLY CORRECT

Surprising Discovery:

  • Tips increase up to party size 4
  • Tips PLATEAU at sizes 5-6 (don't increase further!)

Possible Explanations:

  1. Automatic gratuity - Restaurants add mandatory 15-18% for large parties
  2. Social loafing - "Someone else will tip well, so I don't need to"
  3. Different occasions - Large parties = kids/families (tip standard)
  4. Splitting complications - Harder to calculate when splitting 6 ways

Key Insight: Large parties tip differently than expected - real behavioral economics!

2.4 Relationship #4: Day of Week → Tip

My Hypothesis:

  • Highest: Sunday (weekend celebration mood)
  • Lowest: Wednesday (people just filling stomach)
  • Expected difference: MEDIUM
  • Confidence: MEDIUM

What I Did:

sns.boxplot(x='day', y='tip', data=data_clean,
            order=['Sun','Mon','Tue','Wed','Thur','Fri','Sat'])

Results:
| Day | Avg Tip | Verdict |
|-----------|---------|----------------|
| Saturday | $3.00 | 🏆 Highest |
| Sunday | $2.90 | High |
| Mon/Tue/Wed| $2.25 | 🔻 Lowest (tie)|

Hypothesis Verdict: ⚠️ PARTIALLY WRONG

What I Got Wrong:

  • Predicted Sunday highest → Actually Saturday highest
  • Predicted Wednesday lowest → Correct (tied with Mon/Tue)

Key Observations:

  • Saturday has most high-tip outliers (special occasions, date nights)
  • Sunday has LARGEST box (most variation) - diverse crowd
  • Weekdays cluster together (consistent lower tipping)

Key Insight:
"Sunday = diverse people = diverse tipping = large variation in tips"

2.5 Relationship #5: Time (Lunch vs Dinner) → Tip

My Hypothesis:

  • Dinner will have higher tips
  • Reasoning: "Night time = people more generous; lunch = people in rush"
  • Confidence: MEDIUM

Results:
| Time | Avg Tip | Difference |
|--------|---------|------------|
| Dinner | $3.00 | — |
| Lunch | $2.20 | -$0.80 |

Hypothesis Verdict: ✅ CONFIRMED!

Key Insight:

  • $0.80 difference - this is the BIGGEST categorical effect!
  • Time of day is the STRONGEST categorical predictor
  • Lunch customers are rushed, less satisfied with service
  • Dinner is relaxed, celebratory atmosphere

Business Recommendation: Prioritize dinner service quality!

2.6 Relationship #6: Sex (Male vs Female) → Tip

My Hypothesis:

  • Males will tip MORE
  • Reasoning: "Female waitresses + male customers trying to impress"
  • Confidence: MEDIUM

Results:
| Sex | Avg Tip | Difference |
|--------|---------|------------|
| Female | $3.20 | — |
| Male | $3.00 | -$0.20 |

Hypothesis Verdict: ❌ REJECTED!

What I Got Wrong:

  • Females actually tip SLIGHTLY more (or it's basically equal)
  • The difference is minimal ($0.20)
  • Sex is NOT a strong predictor

Key Insight: Gender stereotypes about tipping don't hold up in data!

2.7 Relationship #7: Smoker vs Non-Smoker → Tip

My Hypothesis:

  • Non-smokers will tip MORE
  • Reasoning: "Smokers save money for cigarettes instead of tipping"
  • Confidence: LOW

Results:
| Smoker Status | Avg Tip | Difference |
|---------------|---------|------------|
| Smokers | $3.00 | — |
| Non-smokers | $2.80 | -$0.20 |

Hypothesis Verdict: ❌ REJECTED!

Honest Reflection: "Cannot figure out why" - and that's okay!

Possible Explanations:

  • Smokers sit outside/at bar (different atmosphere?)
  • Correlation, not causation (maybe age/demographic differences)
  • Small difference ($0.20) might be random chance
  • Need more data to understand

Key Insight: Not every pattern has an obvious explanation - intellectual honesty matters!

📈 PHASE 3: CORRELATION ANALYSIS

What I Did:

# Correlation matrix
correlation_matrix = data_clean[['total_bill', 'tip', 'size']].corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')

Results:

| Pair | Correlation | Strength | Interpretation |
|------|-------------|----------|----------------|
| total_bill ↔ tip | 0.67 | Strong | 🏆 Strongest predictor |
| size ↔ total_bill | 0.60 | Medium-Strong | More people = more food |
| size ↔ tip | 0.49 | Medium-Weak | Non-linear (plateaus) |

Key Insight:
"Tip percentage is fixed as that of total bill" - this explains the 0.67 correlation perfectly!

🎨 PHASE 4: PAIRPLOT (Visual Summary)

What I Did:

sns.pairplot(data_clean, 
             vars=['total_bill', 'tip', 'size'],
             hue='time',  # Color by lunch/dinner
             diag_kind='hist')

Observations:

From diagonal (distributions):

  • Most common tip: ~$2-3
  • Most common bill: ~$15-20
  • Most common party size: 2 people

From scatter plots:

  • total_bill vs tip: Clear upward trend (confirms r=0.67)
  • size vs others: Grouped patterns (discrete variable)
  • Lunch (blue) vs Dinner (orange): Overlap mostly, but dinner shifts slightly higher

Overall Impression: Relationships are SOMEWHAT CLEAR - not perfect, but strong enough to be meaningful

🎯 KEY FINDINGS SUMMARY

Strongest Predictors (Ranked):

  1. Total Bill (r=0.67) 🥇

    • Explains ~45% of tip variation (r² = 0.67² = 0.45)
    • Clear linear relationship
    • Percentage-based tipping (15-20%)
  2. Time of Day ($0.80 difference) 🥈

    • Dinner tips $0.80 more than lunch
    • Strongest categorical effect
    • Reflects rushed vs relaxed dining
  3. Party Size (r=0.49 with tip) 🥉

    • Medium effect, but NON-LINEAR
    • Plateaus at size 5-6
    • Different behavior for large groups
  4. Day of Week ($0.75 difference)

    • Saturday highest ($3.00)
    • Weekdays lowest (~$2.25)
    • Weekend vs weekday effect
  5. Sex ($0.20 difference) - WEAK

    • Minimal difference
    • Nearly equal tipping
  6. Smoker Status ($0.20 difference) - WEAK

    • Minimal difference
    • Unexplained pattern

💡 BUSINESS RECOMMENDATIONS

Based on data analysis, restaurant owners should:

1. FOCUS ON INCREASING BILL AMOUNT 🎯

Why: Strongest correlation (0.67) - higher bills directly lead to higher tips

Actions:

  • Upsell appetizers, drinks, desserts
  • Create combo deals that increase bill
  • Train servers on suggestive selling
  • Offer premium menu items

Expected Impact: 10% increase in average bill → ~10% increase in tips

2. PRIORITIZE DINNER SERVICE 🌙

Why: Dinner tips $0.80 (36%) more than lunch

Actions:

  • Allocate best servers to dinner shift
  • Focus marketing on dinner hours
  • Create special dinner ambiance
  • Dinner-specific promotions

Expected Impact: Shift focus to higher-margin time period

3. OPTIMIZE FOR PARTY SIZES 2-4 👥

Why: These sizes have best tip-to-effort ratio

Actions:

  • Table arrangements favor 2-4 person parties
  • Special deals for couples/small groups
  • Don't overinvest in large party accommodations (tips plateau)

Expected Impact: Maximize tips per table/server time

4. WEEKEND FOCUS 📅

Why: Saturday/Sunday have higher tips

Actions:

  • Premium staffing on weekends
  • Weekend specials/events
  • Higher-end menu items on weekends

5. DON'T DISCRIMINATE BY SEX/SMOKER ⚖️

Why: These factors have minimal effect

Insight: Treat all customers equally - demographics don't significantly predict tipping

🧠 PERSONAL LEARNING JOURNEY

What I Learned About Data Science:

1. The Scientific Method Works!

  • Make hypothesis → Test → Analyze → Conclude
  • Being wrong is GOOD - that's how we learn!
  • Quote: "Learning happens with mistakes"

2. Hypotheses Can Be Wrong

My Wrong Predictions:

  • ❌ Thought tipping was "keep the change" → Actually percentage-based
  • ❌ Thought Sunday would have highest tips → Actually Saturday
  • ❌ Thought males tip more → Actually nearly equal
  • ❌ Thought smokers tip less → Actually slightly more

Lesson: Don't trust assumptions - test with data!

3. Correlation ≠ Causation

  • Smokers tip more, but WHY?
  • Could be confounding variables (age, location, etc.)
  • Need more data to understand mechanisms

4. Context Matters

  • Outliers aren't always errors
  • $10 tip on $50 bill = 20% (normal!)
  • Always calculate percentages/ratios for context

5. Data Quality First

  • Clean data = reliable analysis
  • Check for: missing values, duplicates, outliers
  • This dataset was excellent (0 missing!)

6. Visualization is Powerful

  • Scatter plots → see relationships
  • Box plots → compare groups
  • Correlation matrix → see everything at once
  • Pairplot → ultimate summary

7. Different Charts for Different Data

  • Numerical vs Numerical → Scatter plot
  • Categorical vs Numerical → Box plot / Bar chart
  • All at once → Pairplot, Correlation matrix

What I Learned About Python/Tools:

Python Libraries:

import pandas as pd           # Data manipulation
import numpy as np            # Numerical operations
import matplotlib.pyplot as plt  # Basic plotting
import seaborn as sns         # Statistical plotting

Key Functions Mastered:

Pandas:

data.head()              # First 5 rows
data.shape               # Dimensions
data.describe()          # Statistics
data.isnull().sum()      # Missing values
data.duplicated().sum()  # Duplicates
data.drop_duplicates()   # Remove duplicates
data['column']           # Select column
data[condition]          # Filter rows
data.groupby().mean()    # Group and aggregate
data.corr()              # Correlation matrix

Matplotlib:

plt.scatter(x, y)        # Scatter plot
plt.xlabel()             # X-axis label
plt.ylabel()             # Y-axis label
plt.title()              # Title
plt.grid()               # Grid lines
plt.show()               # Display plot

Seaborn:

sns.boxplot(x='category', y='number', data=df)
sns.heatmap(corr_matrix, annot=True)
sns.pairplot(df, vars=['col1','col2'], hue='category')

Important Concepts:

1. Order Matters!

# WRONG
outliers = data[data['tip'] > 5]
data['tip_pct'] = data['tip'] / data['total_bill']
print(outliers['tip_pct'])  # ERROR!

# RIGHT
data['tip_pct'] = data['tip'] / data['total_bill']
outliers = data[data['tip'] > 5]
print(outliers['tip_pct'])  # WORKS!

2. Matplotlib vs Seaborn:

  • matplotlib = Low-level, flexible, more code
  • seaborn = High-level, easy, pretty defaults
  • Use both together!

3. Data Types Matter:

  • float64 / int64 = Can do math
  • object = Text, can't do math

Skills I Developed:

Technical Skills:

  • Data cleaning and preprocessing
  • Exploratory data analysis
  • Statistical thinking
  • Data visualization
  • Python programming
  • Using Jupyter notebooks

Analytical Skills:

  • Hypothesis formation
  • Pattern recognition
  • Critical thinking
  • Drawing insights from data
  • Making business recommendations

Soft Skills:

  • Scientific method application
  • Intellectual honesty ("I don't know")
  • Learning from mistakes
  • Persistence through challenges
  • Clear communication of findings

📊 COMPLETE VISUALIZATIONS CREATED

  1. ✅ Scatter Plot: total_bill vs tip
  2. ✅ Scatter Plot: size vs total_bill
  3. ✅ Scatter Plot: size vs tip
  4. ✅ Box Plot: tip by day of week
  5. ✅ Box Plot: tip by time (lunch/dinner)
  6. ✅ Box Plot: tip by sex
  7. ✅ Box Plot: tip by smoker status
  8. ✅ Correlation Matrix Heatmap
  9. ✅ Pairplot (all relationships)

Total: 9 professional visualizations

🎓 FINAL REFLECTION

What Worked Well:

  • Systematic approach (hypothesis → test → analyze)
  • Using appropriate visualizations for each relationship
  • Being open to being wrong
  • Thorough data cleaning before analysis

What I'd Do Differently:

  • Could explore interaction effects (e.g., day × time)
  • Could calculate tip percentages earlier for context
  • Could test non-linear relationships more formally

Most Surprising Finding:

"Party size plateaus at 5-6 people!"

  • Expected linear relationship
  • Discovered real-world behavioral economics
  • Shows the value of looking at data, not just assumptions

Most Important Lesson:

"Total amount of spending is the determining factor"

  • Simple but powerful
  • Actionable for businesses
  • Data-driven decision making

End of Journey Summary

"The only real mistake is the one from which we learn nothing." - Henry Ford

"In God we trust. All others must bring data." - W. Edwards Deming

Understanding Cookies from the Ground Up: Part 1 - Fundamentals and the Critical Difference between 1st and 3rd Party Cookies

2025-12-28 16:03:03

When developing web applications, we often encounter challenges related to session management or tracking. Most of these issues trace back to a fundamental understanding of Cookies.

  • "Why aren't my cookies being sent as expected?"
  • "Why does a cookie persist even after I try to delete it?"
  • "Why does the behavior change depending on the browser?"

To solve these problems, it is essential to revisit the basics of how cookies work. In this series, I will organize the fundamentals of cookies, specifically tailored for engineers. In this first part, we will cover the definition of cookies and the crucial distinction between 1st Party and 3rd Party Cookies.

What is a Cookie?

A Cookie is a small piece of data stored in the user's browser.

Since HTTP is a "stateless" protocol—meaning each request is independent and the server doesn't remember previous interactions—cookies play a vital role in maintaining state.

Key Roles of Cookies:

  1. Session Management: Keeping users logged in and managing shopping carts.
  2. Personalization: Saving user settings like dark mode or language preferences.
  3. Tracking: Identifying users across different pages or visits.
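
On the wire, a cookie is just a header pair: the server sends Set-Cookie, and the browser returns the value in a Cookie header on later requests. A minimal sketch with Python's standard http.cookies module (the name, value, and domain are made up):

from http.cookies import SimpleCookie

# What a server response might set (illustrative values only)
cookie = SimpleCookie()
cookie["session_id"] = "abc123"
cookie["session_id"]["domain"] = "example.com"
cookie["session_id"]["path"] = "/"
cookie["session_id"]["secure"] = True
print(cookie.output())
# Set-Cookie: session_id=abc123; Domain=example.com; Path=/; Secure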

1st Party vs. 3rd Party Cookies

One of the most important concepts to understand in modern web development is the difference between these two types.

1. 1st Party Cookies

  • Definition: Cookies issued by the domain the user is currently visiting.
  • Example: If you are visiting example.com, any cookie issued by example.com is a 1st party cookie.
  • Characteristics: These are essential for core site functionality (e.g., login sessions) and face fewer restrictions from browsers.

2. 3rd Party Cookies

  • Definition: Cookies issued by a domain other than the one the user is currently visiting.
  • Example: While visiting example.com, an ad network script from ad-network.net issues its own cookie.
  • Characteristics: Traditionally used for cross-site tracking and advertising. However, they are now heavily restricted or blocked by most modern browsers (Safari, Firefox, and increasingly Chrome) due to privacy concerns.

Why the Distinction Matters

In the context of Risk-Based Authentication (RBA) or security design, understanding this difference is mandatory. You need to know:

  • Which browser is accessing the site?
  • Has this device been authenticated before?
  • Is the cookie consistent across the site?

To build robust systems, you should aim for designs that rely on secure 1st Party Cookies rather than unstable 3rd party ones.

Summary of Part 1

  • Cookies are small data fragments stored in the browser.
  • They provide "state" to the stateless HTTP protocol.
  • 1st Party Cookies are issued by the site itself (essential for UX).
  • 3rd Party Cookies are issued by external domains (heavily restricted).

Understanding these basics is the prerequisite for diving into implementation and browser-specific behaviors.

What's Next?

In Part 2, we will explore "Browser Cookie Behavior" in more detail, including:

  • Where exactly does the browser store cookies?
  • When are they sent?
  • Differences between Chrome, Safari, and Edge.

Why I Chose Voice Over Chat for AI Interviews (And Why It Almost Backfired)

2025-12-28 16:00:26

Most AI interview platforms are glorified chatbots with better questions. We built Squrrel to do something harder: have actual spoken conversations with candidates.

That decision nearly killed the product before launch.

The Obvious Choice That Wasn't Obvious

When I started building Squrrel, the safe play was text-based interviews. Lower latency, fewer technical headaches, easier to parse and analyze. Every AI product manager I talked to said the same thing: "Start with chat. Voice is a nightmare."

They were right about the nightmare part.

But I kept coming back to one fact: 78% of recruiting happens over the phone. Not email. Not Slack. Phone calls. Because hiring managers want to hear how candidates think on their feet, how they structure explanations, whether they can articulate complex ideas clearly.

A text-based interview platform would be easier to build and completely miss the point.

So we went with voice. And immediately discovered why everyone warned us against it.

The Technical Debt I Didn't See Coming

Speech recognition for interviews is different from speech recognition for everything else.

Siri and Alexa are optimized for short commands. Transcription tools like Otter are optimized for meetings with multiple speakers. We needed something that could handle:

  • 20-40 minute monologues about technical projects
  • Industry jargon that doesn't exist in standard training data ("Kubernetes," "PostgreSQL," "JWT authentication")
  • Non-native English speakers with varying accents
  • Candidates who talk fast when nervous or slow when thinking

Off-the-shelf speech-to-text models failed spectacularly. Our first pilot had a 23% word error rate on technical terms. A candidate said "I implemented Redis caching" and got transcribed as "I implemented ready's catching." Recruiters couldn't trust the output.

I spent three weeks fine-tuning Wav2Vec 2.0 on domain-specific data—transcripts from actual tech interviews, recordings of engineers explaining their work, podcasts about software development. Got the error rate down to 6% for technical vocabulary.

But here's what surprised me: the remaining errors weren't random. They clustered around moments of hesitation, filler words, and self-corrections—exactly the moments that reveal how someone thinks under pressure.

We almost removed those "errors" before realizing they were features, not bugs.

The Conversational AI Problem Nobody Talks About

Building an AI that can conduct a natural interview conversation is way harder than building one that asks scripted questions.

The models are good at turn-taking now—knowing when the candidate has finished speaking, when to probe deeper, when to move on. But they're terrible at knowing why to do those things.

Our first version would ask "Tell me about a time you faced a technical challenge" and then immediately jump to the next question, regardless of whether the candidate gave a three-sentence answer or a three-minute story. It felt robotic because it was robotic—no human interviewer would blow past a shallow answer without following up.

We had to build a layer that analyzes response depth and triggers follow-ups. Not just keyword matching—actual semantic understanding of whether the candidate addressed the question or danced around it.

This meant combining LLaMA 3.3 70B for conversation flow with TinyBERT for real-time classification. The large model decides what to ask, the small model decides if the answer was substantive enough to move forward. They run in parallel with about 800ms latency between candidate finishing and AI responding.

That 800ms pause? Candidates tell us it makes the conversation feel more natural. Humans don't respond instantly either.
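
To make that orchestration concrete, here's a minimal sketch of the two-model split. The function names (generateNextQuestion, isSubstantiveAnswer, generateFollowUp) are hypothetical stand-ins for calls to the large and small models, not our actual interfaces:

// Hypothetical orchestration sketch; names are illustrative stand-ins.
async function handleCandidateTurn(transcript, lastAnswer) {
  // Run both models concurrently: the small classifier scores the
  // last answer while the large model drafts the next question.
  const [substantive, nextQuestion] = await Promise.all([
    isSubstantiveAnswer(lastAnswer),     // TinyBERT-style classifier
    generateNextQuestion(transcript),    // LLaMA-style conversation model
  ]);

  // Shallow answer: probe deeper instead of moving on.
  if (!substantive) {
    return generateFollowUp(transcript, lastAnswer);
  }
  return nextQuestion;
}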

The Bias Problem That Wasn't a Bias Problem

Everyone asked about bias in AI hiring. "How do you prevent discrimination against protected classes?"

Honest answer? We can't. Not completely.

But we can be transparent about where bias enters the system and give recruiters tools to catch it.

Our approach:

  • Standardized questions - Every candidate gets asked the same core questions in the same order. This eliminates the biggest source of interviewer bias: one person getting softball questions while another gets grilled.

  • Anonymized analysis - The AI evaluation doesn't see candidate names, photos, or demographic data. It only sees the transcript and voice characteristics relevant to communication (clarity, pace, coherence—not accent or gender).

  • Bias audit logs - We track which candidates get follow-up questions and why. If the AI is consistently probing deeper with one demographic group, that pattern surfaces in our analytics (a hypothetical log entry is sketched after this list).

  • Human override - Recruiters see the full transcript alongside the AI summary. They can—and do—disagree with the AI's assessment.
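
For context, one of those audit-log entries might look like this. The shape below is purely hypothetical, not our actual schema:

// Hypothetical bias-audit log entry; field names are illustrative.
const auditEntry = {
  sessionId: "abc-123",
  questionId: "q-07",
  followUpTriggered: true,
  reason: "answer classified as shallow (score 0.31 < 0.50)",
  timestamp: "2025-12-28T14:02:11Z",
  // In this sketch, demographic joins happen downstream in
  // analytics, under access controls, never on the entry itself.
};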

The dirty secret of AI hiring tools is that removing human bias is impossible. What's possible is making bias visible and consistent. A human interviewer might grill technical candidates on Tuesdays because they're stressed, then lob softballs on Fridays when they're in a good mood. The AI applies the same standards at 2 PM and 2 AM.

That's not unbiased. It's consistently biased, which is actually useful if you know what you're looking for.

What Breaking Things Taught Me

When we started testing the system, the AI asked a great opening question, then froze for 14 seconds before asking it again. The candidate thought the system crashed and hung up.

The bug? Our conversation state management couldn't handle the candidate pausing to think. The silence triggered a "no response detected" error, which triggered a retry, which created a race condition.

Fixed it by adding a confidence threshold—the AI now distinguishes between "finished talking" silence and "still thinking" silence based on speech patterns in the previous 3 seconds. Not perfect, but it dropped the false-positive rate from 18% to 2%.
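
As a rough illustration, the classification logic looks something like this. This is a simplified sketch with made-up thresholds and features, not our production code:

// Simplified silence classifier; thresholds and features are illustrative.
function classifySilence(recentSpeech, silenceMs) {
  // recentSpeech: features extracted from the previous ~3 seconds
  const { endsWithFiller, fallingIntonation } = recentSpeech;

  // Falling intonation with no trailing filler word usually means
  // the candidate has finished the thought.
  let finishedScore = fallingIntonation ? 0.6 : 0.2;
  finishedScore += endsWithFiller ? -0.3 : 0.2;

  // Long silence after a "finished"-sounding phrase: safe to respond.
  if (silenceMs > 2000 && finishedScore > 0.5) return "finished";

  // Otherwise assume the candidate is still thinking and keep waiting.
  return "thinking";
}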

Here's the lesson I took away: voice AI in high-stakes scenarios requires defensive design at every layer. Unlike a chatbot where someone can retype their message, you can't ask a candidate to "restart the interview" because your error handling failed.

We built in:

  • Automatic session recovery if connectivity drops
  • Manual override for recruiters to flag bad transcriptions
  • A "pause interview" button for candidates (surprisingly popular)
  • Playback of the actual audio alongside transcripts

The goal isn't perfection. It's resilience when things go wrong, because they will go wrong.

Why This Matters for Other AI Builders

If you're building AI for professional contexts—interviews, legal analysis, medical screening, financial advice—here's what I'd tell you:

Voice is worth the pain. The richness of verbal communication unlocks insights that text can't capture. But only if you're willing to solve the hard problems instead of shipping a minimum viable chatbot.

Domain-specific fine-tuning isn't optional. General-purpose models are amazing and terrible at the same time. They'll handle 90% of your use case brilliantly, then catastrophically fail on the 10% that matters most. Find that 10% early and train specifically for it.

Latency is a feature. We obsessed over response time at first, trying to get under 500ms. Then we realized that instant responses felt uncanny. The sweet spot for conversational AI is 600-1000ms—fast enough to feel responsive, slow enough to feel natural.

Build for the failure modes. Your AI will misunderstand accents, mishear technical terms, and ask nonsensical follow-ups. Design the system so humans can catch these failures gracefully instead of catastrophically.

The Uncomfortable Truth About AI Products

Six months into building Squrrel, I had a realization that almost made me quit: the AI isn't the product. The product is the workflow that the AI enables.

Candidates don't care that we use Wav2Vec 2.0 for transcription or LLaMA 3.3 for conversation. They care that they can interview at midnight without four back-and-forth scheduling emails. Recruiters don't care about our evaluation algorithms. They care that they can review 10 candidates in an hour instead of spending all week on phone screens.

The AI is infrastructure. The value is in removing friction from a broken process.

This realization changed everything. We stopped optimizing for model accuracy and started optimizing for user experience. We added features like letting candidates preview questions before starting, because that reduced anxiety and led to better responses—even though it "broke" the blind evaluation model we'd carefully designed.

Turns out, a slightly worse AI that people actually use beats a perfect AI that sits unused because the UX is terrible.

What's Next

We're expanding our pilots and learning every day. The technology works. The question now is whether we can scale the human side—onboarding, support, training recruiters to trust but verify AI outputs.

I'm also watching the regulatory space closely. The EU AI Act classifies hiring tools as "high-risk AI systems." New York City requires bias audits for automated employment decision tools. This is good—high-stakes AI should be regulated.

But it also means we need to build compliance into the product from day one, not bolt it on later. Audit trails, explainability, human oversight—these aren't nice-to-haves. They're survival requirements.

If you're building AI products in regulated industries, start designing for compliance now. It's way easier than retrofitting later.

What Is JavaScript and How It Works in the Browser? (A Simple Guide)

2025-12-28 15:59:53

Introduction

If you’re starting your journey as a web developer, JavaScript is one word you’ll hear everywhere.

But beginners often ask:

  • What exactly is JavaScript?
  • How is it different from HTML and CSS?
  • Where does JavaScript actually run?
  • What happens inside the browser when JavaScript code executes?

In this blog, we’ll answer all these questions in simple language, without jargon, and with real-world clarity.

What Is JavaScript?

JavaScript (JS) is a programming language used to make websites interactive and dynamic.

Without JavaScript:

  • Websites would be static
  • Buttons wouldn’t respond
  • Forms wouldn’t validate
  • Pages wouldn’t update without refresh

Examples of what JavaScript does:

  • Show/hide elements on click
  • Validate form inputs
  • Fetch data from APIs
  • Build modern apps (React, Angular, Vue)
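
For example, fetching data from an API takes only a few lines (the URL below is a placeholder):

// Fetch JSON from an API and log it (placeholder URL)
fetch("https://api.example.com/users")
  .then((response) => response.json())
  .then((data) => console.log(data));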

JS vs HTML vs CSS

HTML:

  • Defines what appears on the page
  • Headings, paragraphs, buttons, forms
<button>Click Me</button>

CSS:

  • Defines how it looks
  • Colors, layout, fonts, animations
button {
  background-color: blue;
}

JavaScript:

  • Defines how it behaves
  • What happens when you click the button
const button = document.querySelector("button");

button.addEventListener("click", () => {
  alert("Button clicked!");
});

/*
  Selects the button, then attaches a click event listener.
  The browser shows an alert whenever the button is clicked.
*/

All three work together, but JavaScript is what brings the page to life.

Where Does JavaScript Run?

JavaScript runs in two main environments:

  1. JavaScript in the Browser

    Every modern browser (Chrome, Edge, Firefox, Safari) has a JavaScript engine built into it.

    That means:

    • JavaScript runs inside the browser
    • No extra setup required
    • Perfect for UI, events, DOM manipulation

    Examples:

    • Button clicks
    • Page updates
    • Form validation

    This is called client-side JavaScript.

  2. JavaScript outside the Browser (Node.js)

    JavaScript can also run on the server using Node.js.

    With Node.js, JavaScript can:

    • Create servers
    • Access databases
    • Read/write files
    • Build APIs

    In short, it powers backend development.

    Same language, different environment.
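
For example, a minimal web server in Node.js takes just a few lines (a sketch to show the idea; save it as server.js and run node server.js):

// server.js: a minimal Node.js web server
const http = require("http");

const server = http.createServer((req, res) => {
  res.end("Hello from Node.js!");
});

// Visit http://localhost:3000 after starting the server
server.listen(3000);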

How Does JavaScript Work in the Browser?

Let’s understand what happens when the browser loads JavaScript.

  1. The browser loads the HTML file
  2. It sees a JavaScript file (.js)
  3. The browser sends that JS code to the JavaScript engine
  4. The engine reads and executes the code
  5. JavaScript interacts with the webpage (DOM)

You don’t see this process, but it happens every time a page loads.

What Is a JavaScript Engine?

A JavaScript engine is a program inside the browser that:

  • Reads JavaScript code
  • Converts it into machine-understandable instructions
  • Executes it line by line

Every browser has one:

  • Chrome / Edge - V8
  • Firefox - SpiderMonkey
  • Safari - JavaScriptCore

You don’t need to install anything — it’s built in.

Conclusion

Let’s quickly recap:

  • JavaScript makes websites interactive
  • HTML gives structure, CSS gives style, JS gives behavior
  • JavaScript runs:
    • In the browser (frontend)
    • In Node.js (backend)
  • A JavaScript engine (like V8) executes your code behind the scenes

If you’re starting JavaScript, this understanding will make everything else easier.

References

  • MDN Web Docs – DOM Introduction
  • V8 JavaScript Engine

Learning Algorithms by Watching Them Run (A Visual Walkthrough with Learn-Algo)

2025-12-28 15:58:37

Most of us learned algorithms the same way:

  • Read a definition
  • Look at pseudocode
  • Try to memorize the steps
  • Hope it “clicks” later

For simple cases, that works. But once you hit sorting edge cases, recursion, trees, or ML concepts, things get fuzzy fast.

I built Learn-Algo to fix exactly that problem.

👉 https://learn-algo.com/

Instead of reading about algorithms, Learn-Algo lets you watch them execute step by step, pause them, rewind them, and experiment with inputs — the same way you’d debug real code.

In this post, I’ll walk through:

  • How Learn-Algo visualizes algorithms internally
  • A concrete example (sorting / traversal / ML flow)
  • Why visual execution leads to better algorithm intuition

No theory overload. No math walls. Just how algorithms actually behave.

Why “Seeing the Algorithm” Changes Everything

Algorithms aren’t static — they’re processes.

When we only read code like this:

n = len(arr)
for i in range(n):
    for j in range(0, n - i - 1):
        if arr[j] > arr[j + 1]:
            arr[j], arr[j + 1] = arr[j + 1], arr[j]  # swap neighbors

we have to mentally simulate:

  • Comparisons
  • Swaps
  • Loop boundaries
  • State changes

That mental simulation is the hard part.

Learn-Algo offloads that cognitive load by rendering each step visually:

  • Which elements are compared
  • Which values move
  • How many operations actually occur

You stop guessing and start observing.
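
To make the idea concrete, here's a sketch of what step-based visualization looks like under the hood: the algorithm emits one event per operation, and the UI renders each event. This illustrates the general concept, not Learn-Algo's actual implementation:

// Bubble sort rewritten as a generator that yields one event per
// operation, so a UI can pause, step, or replay. Illustrative only.
function* bubbleSortSteps(arr) {
  const a = [...arr];
  for (let i = 0; i < a.length; i++) {
    for (let j = 0; j < a.length - i - 1; j++) {
      yield { type: "compare", indices: [j, j + 1] };
      if (a[j] > a[j + 1]) {
        [a[j], a[j + 1]] = [a[j + 1], a[j]];
        yield { type: "swap", indices: [j, j + 1], state: [...a] };
      }
    }
  }
}

// A visualizer (or you, in a console) can consume one step at a time:
for (const step of bubbleSortSteps([5, 3, 8, 1])) {
  console.log(step.type, step.indices);
}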

A Quick Walkthrough: Understanding Sorting Visually

Let’s take a simple example.

When you open a sorting algorithm in Learn-Algo, you don’t just click “Run”.

You can:

  • Choose or generate input data
  • Start execution step by step
  • Pause after each comparison or swap
  • Replay specific moments

As the algorithm runs, you see:

  • Active indices highlighted
  • Swaps animated
  • Progress across iterations

This instantly answers questions like:

  • Why is this algorithm slow for large inputs?
  • Where does all that extra work come from?
  • What changes when input is nearly sorted?

These are things most tutorials say, but rarely show.

From DSA to ML: Same Visual Philosophy

The same idea applies beyond classic DSA.

For machine learning concepts like:

  • Linear regression
  • Clustering
  • Optimization

Learn-Algo visualizes:

  • How data points move
  • How models adjust step by step
  • What “convergence” actually looks like

This is especially helpful if you’re coming from a programming background and find ML math intimidating at first.
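
The same event-per-step idea carries over. For linear regression, each gradient-descent update is one "frame" of the animation. A sketch of a single step (illustrative, not Learn-Algo's code):

// One gradient-descent step for the line y = m*x + b.
// Each call to gradientStep is one animation frame. Illustrative only.
function gradientStep(points, m, b, learningRate = 0.01) {
  let gradM = 0;
  let gradB = 0;
  for (const { x, y } of points) {
    const error = m * x + b - y; // prediction minus target
    gradM += (2 / points.length) * error * x;
    gradB += (2 / points.length) * error;
  }
  // Move the parameters a small step against the gradient
  return { m: m - learningRate * gradM, b: b - learningRate * gradB };
}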

Who This Walkthrough Is For

This walkthrough is for you if:

  • You understand syntax but struggle with intuition
  • You’ve memorized algorithms but can’t explain them
  • You’re preparing for interviews and want deeper clarity
  • You learn better by doing than by reading

You don’t need advanced math or deep CS theory to get value — just curiosity.

If algorithms ever felt abstract or “magical”, this is about making them predictable and understandable.

👉 Explore the playground: https://learn-algo.com/