CPU MODES EXPLAINED: PART 1 – REAL MODE
(THE WILD WEST OF COMPUTING)
Let’s give the CPU operating modes their own document.
⚙️ WHAT EVEN IS A CPU MODE?
Before we get deep, here’s the thing: A CPU mode is like the operating style of the processor.
It defines what the CPU is allowed to do:
How much memory it can touch.
Whether it has access control/security.
What kind of instructions and registers it can use.
It’s like your CPU switching between “beginner,” “intermediate,” and “pro” modes depending on what it’s trying to run.
🧱 REAL MODE: THE 16-BIT LEGACY ARENA
This is the original mode of the x86 CPU family.
It was born with the 8086 processor (1978), and every modern x86 CPU still starts in Real Mode when powered on, even your Core i9 or Ryzen 9. This means:
Even though your modern CPU has billions of transistors and can process trillions of operations a second, for the first few milliseconds after you hit the power button, it’s essentially pretending to be a 48-year-old chip with a whopping 1 MB of addressable memory.
In 1978, the 8086 used a 20-bit address bus, meaning it could only see up to 2^20 bytes (1 MB) of memory.
To maintain perfect backward compatibility, Intel designed every subsequent chip to mimic this limitation at startup.
When the 80286 came out, it could address more memory, but some old programs relied on a quirk of the 8086: addresses that ran past 1 MB wrapped around to the bottom of memory.
To keep those programs working, engineers added the A20 Gate.
This was a physical switch that literally disabled the 21st address line (A20) to keep the CPU dumb enough to run legacy software.
Even today, on a legacy BIOS boot, the bootloader or OS has to explicitly enable this line to escape the 1970s.
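To make the wraparound concrete, here is a minimal C sketch of what the A20 Gate does to an address. The function name and the a20_enabled flag are made up for illustration; real hardware does this with a wire, not a function call.

```c
#include <stdint.h>

/* Address that actually reaches the memory bus for a real-mode
   segment:offset pair. a20_enabled = 0 models the 8086 (and a PC with
   the A20 Gate closed), where only 20 address lines exist. */
uint32_t real_mode_bus_address(uint16_t segment, uint16_t offset, int a20_enabled)
{
    /* segment * 16 + offset can reach 0x10FFEF, which needs 21 bits */
    uint32_t linear = ((uint32_t)segment << 4) + offset;

    if (!a20_enabled) {
        linear &= 0xFFFFFu; /* bit 20 is forced to zero: wrap to low memory */
    }
    return linear;
}
```

With the gate closed, FFFF:0010 lands on address 0x00000 (the wraparound old programs relied on). With the gate open, the same pair reaches 0x100000, the first byte above 1 MB.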
🍰 THE BOOTSTRAPPING RELAY RACE
Your CPU doesn't just jump into Windows or Linux. It performs a high-stakes evolution in a fraction of a second:
Real Mode: The CPU wakes up. It can only see 1 MB of RAM and uses segmentation (combining two 16-bit numbers to find a memory address). It looks for the BIOS/UEFI at a hardcoded location called the Reset Vector.
Protected Mode (32-bit): The bootloader switches the CPU into Protected Mode. Suddenly, it can see 4 GB of RAM, use hardware-level memory protection, and handle multitasking. This was the peak of the 80386 era.
Long Mode (64-bit): Finally, the kernel switches the CPU into Long Mode. This unlocks the full 64-bit instruction set and the massive terabytes of RAM we use today.
You might wonder why Intel or AMD doesn't just break the past and start in 64-bit mode.
The x86 architecture’s greatest strength is that, theoretically, you could take a binary file compiled in 1979 and it would still execute on a 2026 processor.
Starting in a known, simple state (Real Mode) ensures that every motherboard manufacturer, BIOS developer, and OS coder has the exact same starting line, regardless of how simple or complex the underlying hardware becomes.
The X86-S Future: Interestingly, the industry has tried to move on. In 2023 Intel proposed a new specification called x86-S (Simplified). It would have stripped away the 16-bit modes and 32-bit operating-system support (32-bit applications would still run), forcing the CPU to boot directly into a 64-bit state. Intel has reportedly shelved the proposal since, but it would have been the biggest house cleaning in the history of x86.
🔑 KEY TRAITS OF REAL MODE
🧪 ADDRESSING STYLE
Real Mode uses Segment:Offset addressing. It breaks up memory access like this:
Physical address = Segment × 16 + Offset (multiplying by 16 is the same as shifting left by 4 bits).
That’s how Real Mode squeezes 20-bit memory access out of 16-bit registers.
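If the formula still feels abstract, this small C sketch does the same arithmetic. The helper names are hypothetical, and it ignores the 1 MB wraparound on purpose to keep the idea visible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Segment * 16 + Offset, written as a shift. */
uint32_t seg_off_to_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Many different segment:offset pairs point at the same byte. */
bool same_location(uint16_t seg_a, uint16_t off_a, uint16_t seg_b, uint16_t off_b)
{
    return seg_off_to_linear(seg_a, off_a) == seg_off_to_linear(seg_b, off_b);
}
```

For example, B800:0000 works out to 0xB8000, and 1234:0010 and 1235:0000 both work out to 0x12350. That aliasing is one reason Real Mode pointers are such a headache.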
I know that probably didn’t click yet, so let’s revisit this madness about 16-bit Real Mode.
Okay, I already made the image and HTML for you to go read; this is too hard to just write out here.
You won’t meet this stuff a lot. It only matters for old systems and for understanding how we got here.
🎮 WHERE YOU’LL STILL SEE REAL MODE IN ACTION
🚫 WHY MODERN OSES ABANDONED REAL MODE
⚠️ That’s why 64-bit editions of modern OSes like Windows 10/11 or modern Linux don’t allow 16-bit Real Mode programs to run natively anymore. You need emulators or virtual machines.
Analogy Time:
Summary: Real Mode
✅ 16-bit legacy mode — max 1MB memory
✅ No protection, no multitasking
✅ Still used in BIOS, bootloaders, and tiny embedded systems
❌ Not suitable for modern multitasking OSes
❌ Needs emulation on modern 64-bit systems
Let’s go to 32-bit. Remember, we’re using the 007 html file, just expanding that one for maximum impact and understanding.
Let's render the html 007 file from my Github here:
🛡️ 32-BIT PROTECTED MODE – THE SECURE APARTMENT BUILDING OF COMPUTING
What is Protected Mode?
Protected Mode first appeared on the Intel 80286, but it became a game-changer when the Intel 80386 processor extended it to 32 bits and added paging.
This mode introduced true multitasking, memory protection, and virtual memory — which are core features of every modern OS.
Imagine going from a wild jungle (Real Mode) to a secure, gated apartment complex where every resident (program) has their own key, walls, and alarm system.
Key Features of Protected Mode
Memory Protection: Each program runs in its own isolated memory space, so if it tries to access memory it doesn’t own, it crashes without affecting other programs or the operating system.
🌐 Virtual Memory: Every application is given the illusion of having a full 4 GB address space to itself, even if the physical RAM is smaller. The operating system makes this possible by using disk space as overflow, through a technique called paging.
🔄 Multitasking: The CPU can rapidly switch between multiple programs or tasks, allowing you to run things like Chrome, Spotify, and Visual Studio simultaneously without conflict.
🧩 Privilege Levels (Rings): The CPU enforces a hardware-based separation between user-mode (applications) and kernel-mode (the OS). This ensures that applications cannot directly interfere with or compromise the operating system.
📦 Flat Memory Model Support: Although segmentation still technically exists, modern systems often use a flat memory model where memory is accessed linearly, byte by byte, making addressing simpler and more intuitive.
Where Protected Mode Is Used Today (And why):
❌ Windows 32-bit Operating Systems like Windows XP, Vista, 7, 8, and 10 (32-bit editions) rely entirely on Protected Mode to function.
🎮 Older games and applications from the 2000s were mostly compiled as 32-bit programs, which means they still run perfectly well in Protected Mode environments.
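💡 WoW64 (Windows-on-Windows 64-bit) allows modern 64-bit versions of Windows to run older 32-bit applications. The CPU runs their code in a 32-bit compatibility sub-mode of Long Mode, and WoW64 translates their system calls for the 64-bit kernel.
🐧 32-bit x86 Linux distributions, such as older Ubuntu i386 releases and many embedded x86 Linux systems, still use Protected Mode under the hood. (Older Raspberry Pi OS releases are 32-bit too, but they run on ARM, where Protected Mode does not exist.)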
🔧 MASM and NASM tutorials often teach Protected Mode (32-bit assembly) first because it's cleaner, simpler, and requires less setup than diving straight into 64-bit assembly.
🖱️ Legacy low-level tools are still sometimes compiled as 32-bit code, even on modern systems, to ensure compatibility with older hardware or software layers. (Kernel drivers are the exception: a 64-bit Windows kernel only loads 64-bit drivers.)
💡 Why 32-bit Protected Mode Was Such a Leap:
Before Protected Mode, you had:
No app isolation
No memory management
No multitasking
No security
After Protected Mode, you could have a full OS with apps crashing independently, virtual RAM, security per program, and multitasking.
That’s why OSes like Windows NT, Windows 95, and modern Linux were only possible with this mode.
💾 Registers in Protected Mode
You gain access to extended 32-bit registers: EAX, EBX, ECX, EDX, ESI, EDI, ESP, and EBP.
Also, segmentation is still there (DS, CS, ES, etc.), but most tutorials flatten it for simplicity. Example:
mov eax, 0x12345678
add eax, 42
This assembly snippet moves the hexadecimal value 0x12345678 into the 32-bit EAX register, then adds 42 to it.
Both instructions operate directly on 32-bit data, which is standard in protected mode environments.
In 32-bit protected mode, registers like EAX, EBX, and ECX are designed to handle 32-bit values, and memory addressing is structured around these 32-bit operations — making this kind of code the norm for systems like 32-bit Windows and Linux.
REGISTER SIZES (X86/X86-64 ARCHITECTURE)
Before we continue, let’s address a small issue here:
8-bit registers: AL, AH, BL, BH, CL, CH, DL, DH
Can hold values from 0x00 to 0xFF (0 to 255 unsigned, or -128 to 127 signed). Example: mov al, 0xFF
16-bit registers: AX, BX, CX, DX, SI, DI, SP, BP
Can hold values from 0x0000 to 0xFFFF (0 to 65,535 unsigned, or -32,768 to 32,767 signed). Example: mov ax, 0xFFFF
32-bit registers: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
Can hold values from 0x00000000 to 0xFFFFFFFF (0 to 4,294,967,295 unsigned, or -2,147,483,648 to 2,147,483,647 signed). Example: mov eax, 0xFFFFFFFF
64-bit registers: RAX, RBX, RCX, RDX, RSI, RDI, RSP, RBP
Can hold values from 0x0000000000000000 to 0xFFFFFFFFFFFFFFFF (0 to 18,446,744,073,709,551,615 unsigned, or -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 signed). Example: mov rax, 0xFFFFFFFFFFFFFFFF
🧊 CHECKING IF A VALUE FITS
Hexadecimal digits per register size:
8-bit: 2 hex digits (e.g., 0x12).
16-bit: 4 hex digits (e.g., 0x1234).
32-bit: 8 hex digits (e.g., 0x12345678).
64-bit: 16 hex digits (e.g., 0x123456789ABCDEF0).
Example:
mov eax, 0x12345678 → Valid (8 hex digits, fits 32-bit).
mov eax, 0x123456789 → Invalid (9 hex digits, exceeds 32-bit). It won’t fit: the instruction only has room for a 4-byte immediate, so the assembler either rejects it or keeps only the lower 4 bytes (TRUNCATION).
mov rax, 0x123456789 → Valid (fits in 64-bit).
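The same "does it fit?" check can be written as a few lines of C. This is only a sketch with a made-up function name, and it treats every value as unsigned.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `value` can be stored in an unsigned register that is
   `bits` wide (8, 16, 32 or 64). */
bool fits_unsigned(uint64_t value, unsigned bits)
{
    if (bits >= 64) {
        return true;              /* anything in a uint64_t fits in 64 bits */
    }
    return (value >> bits) == 0;  /* nothing may be left above the top bit */
}
```

So fits_unsigned(0x12345678, 32) is true, fits_unsigned(0x123456789, 32) is false, and fits_unsigned(0x123456789, 64) is true, matching the three mov examples above.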
What Happens if You Exceed the Limit?
Depending on the assembler and its settings (NASM, MASM, FASM, GAS), you get either an error or a warning if you try to move a too-large value into a register.
Example: NASM complains that the dword data exceeds bounds.
Truncation
Some assemblers truncate the value (keep only the lowest bits) instead of stopping, but this is not reliable and should be avoided.
Say you do: mov eax, 0x12345678922
It won’t fit. Only the lower 4 bytes (the last 8 hex digits) survive:
0x12345678922 → keep only the last 8 digits → 0x45678922 (lower 32 bits).
The extra 0x123 at the beginning gets cut off.
It’s like trying to pour 1.5 liters of soda into a 1-liter bottle.
The rest just spills.
🦾 What Happens if You Use a Smaller Register?
Same logic, smaller limit:
AX is only 16 bits → only takes last 4 hex digits.
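C does the same kind of chopping when you cast to a narrower unsigned type, so it makes a handy way to check your mental math. The helper names below are hypothetical.

```c
#include <stdint.h>

/* Keep only the lowest 32 bits (the last 8 hex digits). */
uint32_t low32(uint64_t value)
{
    return (uint32_t)value;
}

/* Keep only the lowest 16 bits (the last 4 hex digits). */
uint16_t low16(uint64_t value)
{
    return (uint16_t)value;
}
```

Feeding in 0x12345678922 gives 0x45678922 from low32 and 0x8922 from low16, which is exactly what a 32-bit or 16-bit register would end up holding after truncation.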
🔥 Cheat Sheet: How Many Hex Digits per Register?
For the confused backbenchers, let’s fix you:
✅ One Byte = 8 Bits = 2 Hex Digits
A byte holds 8 bits. Each hex digit represents 4 bits (aka a nibble).
“If my value is 2 hex digits (like 0x7F, 0x22, 0xB4), it fits in a byte.”
⚠️ Don't Confuse with Decimal
Some decimal numbers look small but still take more than 1 byte:
200 (decimal) = 0xC8 → ✅ fits
300 (decimal) = 0x12C → ❌ 3 hex digits → needs 2 bytes
💥 Special Cases:
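Sign Extension: Instructions such as movsx and movsxd copy a smaller signed value into a larger register and fill the upper bits with copies of the sign bit. In 64-bit mode, mov rax, -1 also sign-extends its 32-bit immediate to 64 bits.
Zero Extension: movzx (e.g., movzx eax, al) fills the upper bits with zeros. Writing any 32-bit register in 64-bit mode does the same to the upper half of the 64-bit register, so mov eax, -1 leaves RAX = 0x00000000FFFFFFFF.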
We’ll see these in future topics.
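Until then, here is a rough C model of the two behaviours. The function names are made up, and the casts only imitate what movsxd and movzx (or a plain 32-bit write) do to the bits.

```c
#include <stdint.h>

/* Like movsxd: widen 32 bits to 64 and copy the sign bit upward.
   The uint32_t -> int32_t conversion is implementation-defined before
   C23, but it wraps as expected on mainstream compilers. */
int64_t sign_extend_32(uint32_t value)
{
    return (int64_t)(int32_t)value;
}

/* Like a 32-bit register write in 64-bit mode: upper 32 bits become zero. */
uint64_t zero_extend_32(uint32_t value)
{
    return (uint64_t)value;
}

/* Like movzx eax, al: upper 24 bits become zero. */
uint32_t zero_extend_8(uint8_t value)
{
    return (uint32_t)value;
}
```

The bit pattern 0xFFFFFFFF becomes -1 (0xFFFFFFFFFFFFFFFF) when sign-extended, but 0x00000000FFFFFFFF when zero-extended. Same input, very different 64-bit value.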
✅ Key Takeaway
Count the hex digits to ensure the value fits the register.
Assemblers will warn you if the value is too large.
Reverse Engineering Tip: When analyzing code, check the register size to understand how much data is being manipulated.
1 byte = 2 hex digits (a nibble each).
AL / AH can store anything up to 0xFF.
When writing hex, count the digits to know how many bytes you're dealing with.
Don’t confuse hex with regular base-10 numbers.
⚠️ 32-bit protected mode is Legacy today, but still essential for Reverse Engineering and Kernel work… etc
🌐 64-BIT LONG MODE – THE SKYSCRAPER CITY OF MODERN CPUS
📏 What is Long Mode?
Welcome to the current era of computing. Long Mode is how modern 64-bit CPUs run your operating systems and apps today.
Introduced with AMD64 (yup, AMD beat Intel here), this mode unlocked way more RAM, better performance, and modern security features, without throwing away what made Protected Mode great.
Think of Long Mode like a future-proof skyscraper city:
Massive vertical space (more memory), more elevators (registers), and smarter infrastructure (paging, security).
📦 Long Mode = Protected Mode ++
Technically, Long Mode is a supercharged version of Protected Mode.
It still supports:
Paging (virtual memory)
User/kernel isolation
Multitasking
...but adds 64-bit registers and 64-bit address spaces.
⚙️ Key Features of Long Mode:
🧠 64-bit Registers: In Long Mode, traditional 32-bit registers like EAX, EBX, and ECX are extended to 64-bit counterparts (RAX, RBX, RCX, etc.), and the 32-bit names remain usable as their lower halves. In addition, the architecture introduces eight brand-new general-purpose registers: R8 through R15, giving developers more flexibility and faster data handling.
💾 Huge Address Space: Long Mode unlocks a theoretical memory address space of up to 16 exabytes (that’s 18,446,744,073,709,551,616 bytes). In practice, most modern CPUs support up to 256 TB of addressable space, which is still astronomically higher than 32-bit limits.
💨 Faster Performance: With more registers and wider 64-bit data paths, CPUs in Long Mode can handle larger numbers and datasets more efficiently — which means faster calculations, better multitasking, and improved performance for heavy applications.
🧱 RIP-Relative Addressing: Long Mode introduces RIP-relative addressing, which allows code to access memory locations relative to the current instruction pointer (RIP). This makes position-independent code (PIC) easier to write and more secure — something modern operating systems rely on for features like shared libraries and code randomization.
🔐 Stronger Isolation and Security: Long Mode supports a hardened separation between kernel and user space, along with advanced security features like the NX (No-eXecute) bit, ASLR (Address Space Layout Randomization), and SMEP (Supervisor Mode Execution Prevention). These features work together to protect against modern memory-based attacks and vulnerabilities.
💻 Real-World Use Cases of 64-bit Long Mode (a.k.a. Where It’s Actually Used)
Pretty much every current OS on x86-64 hardware (Windows 10, Windows 11, Intel-based macOS, and modern Linux distros) runs in 64-bit Long Mode. If your PC is less than 15 years old, you're already living in it.
Heavy-Hitter Apps (Video, Databases, etc.): Apps like video editors, big databases, and 3D rendering engines need access to more than 4GB of RAM — which 32-bit systems just can't handle. Long Mode makes that possible.
Modern Games: Games today eat RAM like snacks. 8GB+ is standard, 16GB+ is common, and that means they have to be 64-bit. Most AAA titles won’t even launch in a 32-bit world.
Scientific Computing & Machine Learning: When you’re working with huge arrays, neural networks, or massive datasets, 32-bit systems just tap out. Long Mode opens the door for processing at scale: think AI, simulations, bioinformatics, physics engines, all that stuff.
Malware (and Anti-Malware): Modern malware is built to target 64-bit OSes, and defenders (a.k.a. reverse engineers like you) need 64-bit tools to analyze and unpack them. Long Mode isn’t just for legit programs, it’s the battlefield for digital warfare.
Reverse Engineering EXEs: Most executables on a 64-bit Windows system use the PE32+ format (the 64-bit variant of Portable Executable, often called PE64). If you're cracking, tracing, or dissecting apps, you gotta know how 64-bit registers, memory layout, and instructions work, or you'll be totally lost.
📏 Register Breakdown in Long Mode
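In Protected Mode (32-bit), you had: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
Now in Long Mode (64-bit), you’ve got: RAX, RBX, RCX, RDX, RSI, RDI, RSP, RBP, plus R8 through R15
Here are their full names (RAX, the accumulator, holds arithmetic results and function return values):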
🧱 RBX – The Extended Base Register
Used for holding base addresses in memory.
Think: a pointer to the start of your giant data structure — like the foundation of a skyscraper.
🔁 RCX – The Extended Count Register
Used in loops, counts, and string operations.
Think: a digital clicker counting how many reps your CPU has left to do.
📤 RDX – The Extended Data Register
Handles I/O and large-number math.
Think: your CPU’s multipurpose toolbelt — for division, data transfer, etc.
📦 RSI – The Extended Source Index
Points to where data is coming from (like for string/memory ops).
Think: a chef’s hand reaching into the pantry — grabbing the source.
📥 RDI – The Extended Destination Index
Points to where data is going.
Think: that same chef dumping the food into a bowl — the destination.
📚 RSP – The Extended Stack Pointer
Always points to the top of the stack.
Think: a stack of plates — this register tracks the one on top.
📌 RBP – The Extended Base Pointer
Used to anchor the current function’s stack frame.
Think: a fixed bookmark inside your temporary memory, pointing to where local variables live.
And yes, each of these can be broken down further: RAX (64 bits) → EAX (low 32 bits) → AX (low 16 bits) → AL (low 8 bits) and AH (bits 8 to 15).
So, you still get backward compatibility with older 32-bit and 16-bit code, but now with way more horsepower.
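One way to picture the breakdown is to treat RAX as a plain 64-bit number and carve pieces out of it. This C sketch uses made-up helper names and only models reading the sub-registers, not writing them.

```c
#include <stdint.h>

uint32_t view_eax(uint64_t rax) { return (uint32_t)rax; }        /* bits 31..0 */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)rax; }        /* bits 15..0 */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)rax; }         /* bits 7..0  */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); }  /* bits 15..8 */
```

If RAX holds 0x1122334455667788, then EAX reads 0x55667788, AX reads 0x7788, AL reads 0x88, and AH reads 0x77.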
💥 R8 to R15 – The New Recruits (64-bit Only)
When CPUs evolved from 32-bit to 64-bit, they didn’t just stretch existing registers (like EAX → RAX).
When we made the jump to 64-bit, AMD (the designer of x86-64) said:
“You know what? 8 general-purpose registers just ain’t enough anymore.”
So, they gave us 8 more: R8 to R15.
These are full 64-bit general-purpose registers, just like RAX, RBX, etc. — but exclusively available in 64-bit mode (Long Mode). You won’t see these in 32-bit assembly at all.
What They're Used For:
• Used heavily in function parameter passing (the Windows/Linux 64-bit calling conventions rely on them)
• Great for extra temporary storage when your code needs more than the classic 8 registers
• Super handy in loop unrolling, SIMD routines, or low-level optimization
• You’ll see malware, obfuscators, and compilers use them for sneaky tricks or performance
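So instead of: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
We now get: RAX, RBX, RCX, RDX, RSI, RDI, RSP, RBP, R8, R9, R10, R11, R12, R13, R14, R15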
That’s 16 total general-purpose registers in 64-bit mode. Huge boost.
🎮 Why do we care about R8–R15?
1. Function Parameter Passing in 64-bit Linux (System V ABI)
When you call a function in 64-bit Linux (or compile with GCC, Clang, etc.), the first six integer or pointer arguments are passed using registers (not on the stack like in 32-bit).
The order is: RDI, RSI, RDX, RCX, R8, R9. Any further arguments go on the stack.
This is the common calling convention for Linux (and other Unix-like systems) on x86-64 architecture, often referred to as the System V AMD64 ABI.
For the first six arguments, it prioritizes using specific general-purpose registers, including the new registers like R8 and R9, to pass data directly to a function, which is much faster than pushing them onto the stack.
Example:
That’s why R8 and R9 aren’t optional weird extras, they are baked into how functions work in 64-bit!
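Here is a small C function you can use to watch this happen. The name sum7 is made up, and the register mapping in the comments only holds when the code is built for x86-64 under the System V ABI; C itself never promises any particular register.

```c
/* Seven integer arguments: one more than fits in registers on
   64-bit Linux, so the last one has to travel on the stack. */
long sum7(long a, long b, long c, long d, long e, long f, long g)
{
    /* System V AMD64 (assumed target):
       a = RDI, b = RSI, c = RDX, d = RCX, e = R8, f = R9, g = stack */
    return a + b + c + d + e + f + g;
}
```

Compile a call such as sum7(1, 2, 3, 4, 5, 6, 7) with gcc -S -O0 on 64-bit Linux and read the generated assembly: you should see the fifth and sixth values loaded into R8 and R9 just before the call.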
📬 What about Windows?
In Windows 64-bit (Microsoft x64 calling convention), it’s a little different: the first four arguments go in RCX, RDX, R8, and R9, and the rest go on the stack.
So, in both Linux and Windows, R8 and R9 are used early in parameter passing.
🧩 Sub-registers of R8–R15
Just like how RAX has smaller siblings:
EAX (32-bit)
AX (16-bit)
AL (8-bit low)
AH (8-bit high)
The new registers R8–R15 also have sub-registers: R8D (low 32 bits), R8W (low 16 bits), and R8B (low 8 bits), and likewise for R9 through R15.
These registers arrived with AMD's 64-bit extension of x86.
Just like the older AX, BX, CX, DX registers, their sub-registers allow you to access smaller portions of their data (32-bit, 16-bit, and 8-bit parts).
✅ So yes, you can move 8-bit values into R11B, 16-bit values into R12W, 32-bit values into R9D, and so on.
✅ This works just like how you'd use AL (8-bit), AX (16-bit), or EAX (32-bit) with the legacy RAX register.
✅ This flexibility allows for efficient manipulation of data of different sizes within the larger 64-bit registers.
🧪 Real Usage
Example 1 – Simple data move:
Example 2 – Passing function args in Linux:
🎯 Why This Matters for You:
If you're writing shellcode, reversing malware, or working on system-level C or C++, you must understand how args are passed.
If you're building a compiler, parser, or learning ABI design – this is ground truth.
If you're debugging a crash and see R8 = 0x0BADF00D – you now know it might be parameter 5 on Linux (or parameter 3 on Windows).
💻 TLDR – R8 to R15 in 64-bit Assembly (Cleaned Up)
R8 to R15 are extra general-purpose registers introduced in 64-bit mode, they don’t exist at all in 32-bit Protected Mode. These registers give you more firepower for handling data, optimizing performance, and passing function arguments.
In the 64-bit calling convention (especially on Linux and Windows), they help carry function arguments:
🧾 Example: on Linux, R8 and R9 come in right after RDI, RSI, RDX, and RCX; on Windows, they come right after RCX and RDX.
💻 So, if you're writing shellcode, reversing binaries, or tracing sys-calls, you need to know their role.
Just like RAX breaks into EAX → AX → AL, these registers have sub-registers too:
R8D–R15D → 32-bit
R8W–R15W → 16-bit
R8B–R15B → 8-bit
Why it matters: Long Mode didn’t just make registers bigger — it added more.
More registers = more freedom, more complexity, more control.
If you’re in 64-bit land, these are not optional knowledge. Period.
📝 Memory Access in Long Mode
You now have:
64-bit flat address space
Paging with 4 levels (rooted at the PML4 table) to map virtual addresses
No Real Mode style segmentation: it is mostly disabled (yay, simplicity!), and only FS and GS keep a usable base address
That’s why most 64-bit assembly tutorials say: "Forget segments. Think in pages."
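To "think in pages", it helps to see how a virtual address is sliced up. The sketch below assumes classic 4-level paging with 4 KB pages (9 bits per table level, 12 bits of page offset); the struct and function names are made up for illustration.

```c
#include <stdint.h>

typedef struct {
    unsigned pml4;    /* bits 47..39: index into the PML4 table        */
    unsigned pdpt;    /* bits 38..30: index into the page-dir-ptr table */
    unsigned pd;      /* bits 29..21: index into the page directory    */
    unsigned pt;      /* bits 20..12: index into the page table        */
    unsigned offset;  /* bits 11..0 : byte offset inside the 4 KB page */
} PageWalk;

/* Split a 48-bit virtual address into the four table indexes
   and the page offset the CPU uses during a page walk. */
PageWalk split_virtual_address(uint64_t va)
{
    PageWalk walk;
    walk.pml4   = (unsigned)((va >> 39) & 0x1FF);
    walk.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    walk.pd     = (unsigned)((va >> 21) & 0x1FF);
    walk.pt     = (unsigned)((va >> 12) & 0x1FF);
    walk.offset = (unsigned)(va & 0xFFF);
    return walk;
}
```

Four 9-bit indexes plus a 12-bit offset add up to 48 bits, which is where the 256 TB figure for the practical address space comes from.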
Why 64-bit Mode Isn’t Always Taught First – Painful AF
🤯 It's more complex under the hood:
System calls don’t work the same. You can't just drop a casual int 0x80 anymore like it’s 2003. Instead, 64-bit Linux code uses the syscall instruction with a different ABI (Application Binary Interface): the system call numbers and the argument registers both changed. The rules changed, and you gotta learn the new playbook.
💻 Debugging is trickier:
You’re now juggling wider 64-bit registers like RAX, dealing with RIP-relative addressing (yeah, your instructions reference memory based on the current address), and following new calling conventions. It’s like graduating from checkers to 4D chess.
📉 Less hand-holding for beginners:
Most tutorials out there still cling to 32-bit because it's simpler and easier to teach. That means you’ll find fewer guides, fewer StackOverflow answers, and more “figure it out yourself” moments. But hey…
IS ASSEMBLY LANGUAGE PORTABLE?
Short answer: Nope. Not even a little.
But let’s unpack it properly:
🧳 What is Portability in Programming?
A portable language means:
You write code once 🧑💻
It compiles and runs on different platforms 🖥️💻📱
You don't have to rewrite everything for each system
Languages like C++, Golang and Java are known for their portability:
C++ can compile on many systems (Windows, Linux, macOS), as long as you avoid system-specific features.
Java goes a step further: its compiled .class files run on any machine with a Java Virtual Machine (JVM). Write once, run anywhere.
But Assembly? Nah.🛑
Assembly is tied directly to the CPU architecture.
Your .asm file written for x86 (Intel/AMD 32-bit) won’t run on ARM (used in most phones), MIPS, or even x64 without major rewrites.
Even different assemblers (MASM vs NASM vs GAS) have different syntax, so there's no one universal assembly language. Python 3 says print("Hello world") everywhere, even on Linux, but every assembler requires its own dialect of assembly.
❌ Why is Assembly So Inflexible?
It talks directly to the hardware.
It uses CPU-specific instructions.
It relies on things like register names, stack conventions, and memory layout that vary per system.
✅ But Here's the Tradeoff:
Assembly gives you max control over what your program does: no layers, no abstractions.
That’s why it’s still used in:
Embedded systems
Operating system kernels
Bootloaders
Malware and exploit development
Speed-critical functions inside modern apps
🔄 Why C/C++ Are "In-Between" Languages:
C and C++ give you low-level power (pointers, memory manipulation) without sacrificing portability.
You can write fast, near-hardware code in C...
...but still compile it for Windows, Linux, ARM, x86, etc. (as long as you don't use platform-specific libraries).
⚠️ Caveat:
That low-level power (e.g., using pointers to access hardware memory) isn’t portable, because it assumes knowledge of the machine’s architecture.
If these htmls are not able to render, you can find them in my github notes.
🧬 TLDR – Assembly vs C vs Java:
ACCESSING MEMORY INFLUENCES PORTABILITY
Let’s discuss this part. This is where a lot of people (even pros) misunderstand portability.
What Does It Really Mean to "Access Hardware with Pointers"?
In C or C++, you can write things like:
Here’s what that code is trying to do:
You’re saying: “Hey C, treat the memory at address 0xB8000 like it holds an integer.”
Then you write a value to that exact physical memory address.
This is direct hardware access — you’re not asking the OS for permission. You're going straight to the metal.
That specific address 0xB8000?
On old PCs, that pointed to video memory (text mode on VGA screens).
So, writing to that memory would literally change what’s shown on the screen.
⚠️ Why Is That Not Portable?
Because that memory address only means something on certain hardware, with a specific OS, under a specific configuration.
Let's say:
On your PC, 0xB8000 = video memory.
On a Raspberry Pi? 💥 That address may not even be mapped!
On a Mac? ❌ Nope.
On modern Windows in protected mode? ❌ Blocked entirely — you’ll get an access violation.
On Linux with memory protection? ❌ OS will stop you.
So, while the C code is valid everywhere, the meaning of what it does completely breaks if you're not on the same low-level architecture.
Portable vs Non-Portable Code in C
✅ Portable Example:
This will work on any machine with a C compiler, no hardware-specific stuff involved.
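One possible example of such portable code, as a sketch: it builds a greeting using nothing but the standard library. The function name make_greeting is invented for this illustration.

```c
#include <stdio.h>

/* Writes "Hello, <name>!" into buf using only standard C.
   Returns the number of characters the full greeting needs,
   exactly as snprintf reports it. */
int make_greeting(char *buf, size_t size, const char *name)
{
    return snprintf(buf, size, "Hello, %s!", name);
}
```

Nothing here names a memory address, an I/O port, or an OS header, so the same source should build on Windows, Linux, macOS, or a phone.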
❌ Non-Portable Example:
This assumes the serial port (COM1) sits at I/O port 0x3F8, which is true on legacy IBM PC architecture but absolutely not guaranteed anywhere else.
Why This Matters
High-level code is like: “OS, please print this text.”
Low-level code is like: “I’m writing directly to memory address 0xB8000. Don’t ask questions.”
If that address doesn’t do what you expect on another system, or the OS won’t let you touch it, your program crashes, or worse, does nothing.
If you try to compile Windows code on an Android phone using an app like Coding C from the Play Store, the build will fail with an error because it doesn't recognize Windows-specific files like windows.h.
Why some C code only works on one computer
Even though C is a famous language, it isn't always "one size fits all." Here is why:
System-Specific Files: When you use "WinAPI," you are using tools made only for Windows. If you try to run that on a phone (Android) or a Mac, the computer won't find those tools. That is why you see errors like windows.h not found.
Hardcoding Memory: If you tell your code to go to a specific memory address (like a house address), it might work on your PC. But on a different device, that "house" might not exist or might belong to someone else. The system will block you for safety.
The CPU Accent: When you write code this way, you aren't writing Universal C anymore. You are writing code that is hugging the hardware too tightly.
What is Cassembly? (Made-up word, but let's work with it)
Think of Cassembly as C code that has a heavy CPU accent.
It looks like C, but it behaves like Assembly.
It is very powerful because it talks directly to the brain of the computer, but it is non-portable.
This means if you move the code to a different type of device, it breaks immediately.
It's like trying to use a US power plug in a European outlet: the shape is slightly different, so it just won't plug in!
If you didn’t laugh at my Cassembly joke, just go start at chapter 1 please… You’re my lost sheep.
FAMOUS C HARDWARE TRICKS (A.K.A. CASSEMBLY MOVES)
These are the OG tricks C programmers used to talk directly to hardware, fast, dirty, powerful…
but wildly non-portable.
1. Direct Video Memory Writing
💽 What it did back then:
Writes the character 'A' to the top-left corner of the screen in text mode.
0xB8000 was the start of the video buffer on old x86 machines (VGA text mode).
💥 Why it breaks now:
Modern OSes (Windows, Linux) don’t let you directly write to video memory.
You’ll get a segmentation fault or access violation.
This only works under DOS or in a bare-metal environment with no memory protection.
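Here is a sketch of what that trick looks like, written so it can run safely on a modern machine. The function and constant names are made up; the key change is that the buffer address is a parameter, so you can pass an ordinary array instead of the real video memory.

```c
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* Each VGA text-mode cell is 16 bits:
   low byte = character code, high byte = colour attribute. */
void vga_put_char(volatile uint16_t *buffer, int row, int col,
                  char ch, uint8_t attr)
{
    buffer[row * VGA_COLS + col] =
        (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}
```

Under DOS or on bare metal you would pass (volatile uint16_t *)0xB8000 as the buffer, and vga_put_char(buffer, 0, 0, 'A', 0x07) would show a light-grey 'A' in the top-left corner. On a modern OS, pass a plain uint16_t array of VGA_COLS * VGA_ROWS cells and inspect it instead.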
2. Triggering the PC Speaker (Beep!)
💽 What it did:
Played a sound through the PC speaker by manipulating the Programmable Interval Timer (PIT) and speaker control register (I/O port 0x61).
💥 Why it breaks:
inb() and outb() are thin wrappers around the x86 IN and OUT port instructions.
Needs kernel-level privileges — user-mode apps can’t do this anymore.
On modern systems, access to I/O ports is blocked unless you’re in kernel or using a driver.
3. Accessing CMOS/BIOS Data
💽 What it did:
Pulled hardware data (like system time) directly from CMOS.
You’re basically talking to the real-time clock/CMOS chip directly, through I/O ports 0x70 and 0x71.
💥 Why it breaks:
Direct port I/O isn't allowed in protected/user mode.
You need root-level access, kernel modules, or special drivers.
OSes abstract this behind proper APIs now (e.g., time.h in C).
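For comparison, this is roughly what the "proper API" route looks like with time.h. The function name current_year is made up; time() and localtime() are standard C.

```c
#include <time.h>

/* Ask the OS for the current calendar year instead of poking the
   CMOS chip. Returns -1 if the time is not available. */
int current_year(void)
{
    time_t now = time(NULL);
    if (now == (time_t)-1) {
        return -1;
    }

    struct tm *parts = localtime(&now);
    if (parts == NULL) {
        return -1;
    }
    return parts->tm_year + 1900;  /* tm_year counts years since 1900 */
}
```

No ports, no privileges, and the same code builds on any system with a C standard library.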
4. Writing to Segment Registers (like FS/GS)
💽 What it did:
Accessed segment registers for thread-local storage or direct memory addressing.
💥 Why it breaks:
Segment registers work very differently in 64-bit mode.
Direct access is blocked or repurposed (e.g., 64-bit Windows uses GS to reach per-thread data, and 64-bit Linux uses FS for thread-local storage).
You can't just poke these anymore without triggering exceptions.
5. Writing Your Own Interrupt Handler
💽 What it did:
Replaced hardware interrupt vectors with your own handlers.
Used for keyboard hooks, mouse input, or custom drivers in DOS.
💥 Why it breaks:
Totally forbidden in modern OSes.
Protected mode + multitasking OS = kernel handles interrupts now.
You’d need to write a kernel driver to do this on Windows/Linux.
6. Bottom Line
These C tricks:
Worked great in DOS or embedded bare-metal
Fail hard on modern OSes
Were basically assembly disguised as C
That’s Cassembly in action: Fast, dangerous, thrilling... and completely non-portable.
WHY ASSEMBLY (AND LOW-LEVEL C) ISN’T PORTABLE
When you write C or Assembly that talks directly to hardware or memory addresses — like poking a specific I/O port or writing to a fixed memory location — it might work beautifully on your system...
But take that same code to another machine? 💥 Crash. Burn. Undefined behavior.
⚠️ Here’s Why:
Different systems have different memory layouts (RAM, ROM, mapped devices).
What’s safe on one CPU can be dangerous on another.
That address you wrote to? Might not even exist on a new motherboard or OS.
Now let’s say you were being a boss in C:
On your retro dev board? Might write to a screen buffer.
On a modern Linux laptop? Segmentation fault.
On a microcontroller? Maybe it resets your CPU. Who knows. 🧨
🔐 Security & Portability: Why We Don’t Do That Anymore (Usually)
In modern systems:
Direct hardware access = blocked (by the OS or CPU).
Random memory access = forbidden unless you’re writing a kernel driver or OS component.
To stay safe and portable, modern C/C++ code uses:
Standard Libraries – like stdio.h, stdlib.h, fopen() instead of talking to disk I/O directly.
System Calls/APIs – abstracted OS-level functions that handle hardware safely.
These give you a standard interface that works on Windows, Linux, macOS, etc.
🔌 BUT… C/C++ Still Lets You Plug into the Matrix
Many C/C++ compilers let you mix in inline assembly (__asm__) or write separate .asm files.
That’s the hybrid zone — high-level comfort, low-level power.
This is where C starts sounding like:
“You’re basically writing Cassembly — C with bare metal energy. ⚡”
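A tiny taste of the hybrid zone, as a sketch: the function below uses GCC/Clang extended inline assembly when it is built for x86, and quietly falls back to plain C everywhere else. The name add_u32 is made up, and the preprocessor guard is exactly the kind of fence that non-portable code needs.

```c
#include <stdint.h>

uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Cassembly path: a += b using the x86 ADD instruction.
       "+r"(a) = read/write register operand, "r"(b) = input register,
       "cc" tells the compiler the flags register is modified. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    /* Portable path for other compilers or CPU architectures. */
    return a + b;
#endif
}
```

Both paths return the same answer (including wrapping around past 0xFFFFFFFF), but only one of them means anything on an ARM phone.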
HOW ASSEMBLERS AND LINKERS WORK TOGETHER
Let’s simplify the whole flow of building an executable from .asm source code:
🏗️ Step-by-Step Pipeline
Assembler: Converts Assembly to Object File (.OBJ)
Reads your .asm file.
Translates mnemonics (MOV, ADD, etc.) into machine code (binary instructions).
Generates a .obj file (not runnable yet).
Adds relocation info – placeholders for stuff like:
“Jump to that function later”
“This variable will be defined elsewhere”
“We’ll plug in the correct address later”
Linker: Combines Object Files into Executable (.EXE)
Takes one or more .obj files and library code.
Resolves all external references: Fills in the correct memory addresses for jumps, calls, symbols.
Handles libraries e.g. If you use printf, the linker connects your code to the standard C library version of it.
May also:
Merge duplicate sections
Optimize memory layout
Create program headers and relocation tables
Final Result: Executable File
Can be run by the OS.
Has its addresses fixed up (anything left over is resolved by the OS loader at load time), everything packed and ready.
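The pipeline above can be sketched as a toy model in C. Everything here (toy_object, toy_symbol, toy_assemble, toy_link, demo_link, and the address 0x00401000) is invented for illustration and is far simpler than a real object file format. The one real detail is the encoding: on x86, `mov eax, imm32` is opcode 0xB8 followed by a 4-byte little-endian value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy "object file": machine code plus one relocation entry. */
struct toy_object {
    uint8_t     code[5];       /* bytes emitted by the assembler */
    size_t      reloc_offset;  /* where the missing address goes */
    const char *reloc_symbol;  /* which symbol it refers to */
};

struct toy_symbol { const char *name; uint32_t address; };

/* "Assembler": encode `mov eax, <address of symbol>`.
   The address is unknown at this stage, so emit zeros and record a relocation. */
struct toy_object toy_assemble(const char *symbol) {
    struct toy_object obj = { {0xB8, 0, 0, 0, 0}, 1, symbol };
    return obj;
}

/* "Linker": look the symbol up and patch its address in, little-endian.
   Returns 0 on success, -1 if the symbol is undefined. */
int toy_link(struct toy_object *obj, const struct toy_symbol *syms, size_t nsyms) {
    for (size_t i = 0; i < nsyms; i++) {
        if (strcmp(syms[i].name, obj->reloc_symbol) == 0) {
            for (int b = 0; b < 4; b++)
                obj->code[obj->reloc_offset + b] = (uint8_t)(syms[i].address >> (8 * b));
            return 0;
        }
    }
    return -1;
}

/* Demo: returns the 32-bit value the linker patched in (0 if linking failed). */
uint32_t demo_link(void) {
    const struct toy_symbol table[] = { { "message", 0x00401000u } };
    struct toy_object obj = toy_assemble("message");
    if (toy_link(&obj, table, 1) != 0) return 0;
    return (uint32_t)obj.code[1] | (uint32_t)obj.code[2] << 8 |
           (uint32_t)obj.code[3] << 16 | (uint32_t)obj.code[4] << 24;
}
```

If the symbol is missing from the table, toy_link returns -1. That is the toy version of the "undefined reference" error a real linker prints.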
Summary – The Whole Flow: .asm source → assembler → .obj file(s) → linker (plus libraries) → .exe
This process, from writing hardware-aware C/Assembly code, to compiling, assembling, and linking, is what gives you full control over the machine... but only if you respect the rules of the hardware you're targeting.
Summary of What We Already Handled:
✅ Memory Architecture Differences:
Different systems have different memory layouts.
Direct memory access (like poking address 0xB8000 for video) might crash or misbehave on systems with a different architecture.
🔐 Security & Safety Limits:
C/C++ lets you touch raw memory (via pointers), but OSes and runtime environments (especially modern ones) won’t always let you access those addresses directly.
Sandboxed or protected environments (like in macOS or modern Linux distros) block or restrict direct hardware poking.
📦 Why Standard Libraries Exist:
To make C/C++ portable across systems, the languages ship standard libraries (like stdio.h and stdlib.h) that abstract away low-level differences. OS-level headers such as unistd.h (POSIX) do the same job across Unix-like systems.
Instead of directly touching I/O ports or memory, you use those APIs and the OS handles the dirty work underneath.
🔧 Inline Assembly Option:
If you really need hardware-level control (like writing device drivers or fast math), compilers such as GCC and Clang let you embed inline assembly inside C/C++ code; MSVC supports it only for 32-bit x86 targets.
But that kills portability — so use it wisely and only when you need to go full savage mode. 💥
Assembly isn’t just nerdy ancient tech — it’s a secret key to really mastering operating systems.
ASSEMBLY LANGUAGE & OPERATING SYSTEMS
If you're serious about leveling up your OS knowledge — assembly isn't just useful... it's a requirement. Here's why:
1) Assembly Shows You the Exact Link Between Software and Hardware
High-level languages like C and Python abstract away the hardware.
Assembly lets you see what’s really happening when a program talks to the CPU, memory, or hardware.
Why does that matter for OS dev? Because operating systems are the bridge between hardware and user programs.
You start to realize that even high-level "OS features" like file systems and multitasking rely on raw instructions underneath.
🛠️ When you know assembly, you understand the guts of I/O, interrupts, device drivers — all the stuff OSes manage daily.
2) System Calls Aren’t Magic Anymore
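Functions like printf(), read(), and malloc() sit on top of system calls: read() maps almost directly to one, while printf() and malloc() only make one when they have to (flushing a buffer, asking the OS for more memory).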
A system call is like your program politely knocking on the OS’s door saying: “Hey kernel, I need help.”
Assembly shows you how that knock happens:
On Linux x86_64: it's mov rax, syscall_number → syscall
On Windows: programs call documented APIs, and stubs in ntdll.dll issue the actual syscall instruction (older 32-bit systems used int 0x2e or sysenter).
Instead of just using system calls blindly, you start to see how the OS traps into kernel mode, does work, then returns control.
🔍 Knowing how syscalls are built and triggered is gold for reverse engineering, kernel hacking, or even writing your own OS.
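Here is what the knock can look like, as a sketch assuming Linux on x86_64 with GCC or Clang. The function name raw_write is made up. The syscall number (1 = write) and the register convention (RAX, RDI, RSI, RDX) are specific to that platform. On other POSIX systems the sketch just calls the library's write() wrapper, and elsewhere it gives up.

```c
#include <stddef.h>

#if defined(__linux__) && defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/* Linux x86_64: syscall number in RAX, arguments in RDI, RSI, RDX. */
long raw_write(int fd, const void *buf, size_t len) {
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                     /* result comes back in RAX */
                      : "a"(1L),                      /* 1 = write on Linux x86_64 */
                        "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");      /* syscall clobbers RCX and R11 */
    return ret;
}
#elif defined(__unix__) || defined(__APPLE__)
#include <unistd.h>
/* Other POSIX systems: let the C library's wrapper make the call. */
long raw_write(int fd, const void *buf, size_t len) {
    return (long)write(fd, buf, len);
}
#else
/* No sketch for this platform. */
long raw_write(int fd, const void *buf, size_t len) {
    (void)fd; (void)buf; (void)len;
    return -1;
}
#endif
```

Calling raw_write(1, "ok\n", 3) writes three bytes to standard output and returns the byte count, with no printf in sight.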
3) You Learn How Memory Is Really Managed
Want to understand the stack? Heap? Segments? Paging? Virtual memory?
Assembly shows you all of it raw.
You’ll watch the stack pointer (RSP) move as functions are called.
You'll see how memory is addressed, aligned, allocated, or freed — not by magic, but by very specific CPU operations.
📦 Memory management is a foundational OS task. Assembly shows how malloc, stack overflows, and segmentation faults really happen.
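You can peek at the stack from C without a debugger. This sketch (callee and frame_gap are hypothetical names) records where a local variable lives in the caller and in the callee, then reports the distance between them. The exact number depends on your compiler, flags, and CPU, so treat it as something to observe, not something to rely on.

```c
#include <stddef.h>
#include <stdint.h>

/* Record the address of a local variable in the callee's frame. */
void callee(uintptr_t *where) {
    volatile int local = 0;
    *where = (uintptr_t)&local;
}

/* Signed distance in bytes from the caller's local to the callee's local. */
ptrdiff_t frame_gap(void) {
    volatile int local = 0;
    uintptr_t inner;
    callee(&inner);
    return (ptrdiff_t)(inner - (uintptr_t)&local);
}
```

On x86-64 the result is typically negative, because the stack grows toward lower addresses and each call moves RSP down. An optimizer that inlines callee can change the number, but the two locals always live at different addresses.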
4) Performance Optimization Hits Different
Ever wonder why some low-level functions are so fast or why loops slow down your app?
Assembly gives you direct access to the CPU’s power.
You can:
Eliminate unnecessary instructions
Use SIMD (like SSE or AVX) for vector math
Tune cache hits/misses by adjusting memory access patterns
⚡ The OS scheduler, the memory allocator, and I/O subsystems are all performance-critical. They are mostly written in C, with small hand-written assembly pieces (context switches, interrupt entry points) where every cycle counts.
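The cache point can be tried out in plain C. In this sketch (GRID_N, grid, and the function names are made up), both functions add up the same numbers, but one walks memory in the order it is laid out and the other jumps a whole row ahead on every step. How big the speed difference is depends on your CPU and grid size, so no timings are claimed here.

```c
#include <stddef.h>

#define GRID_N 256
static int grid[GRID_N][GRID_N];

/* Fill the grid with simple, predictable values. */
void fill_grid(void) {
    for (size_t r = 0; r < GRID_N; r++)
        for (size_t c = 0; c < GRID_N; c++)
            grid[r][c] = (int)(r + c);
}

/* Cache-friendly: consecutive accesses touch neighboring addresses. */
long sum_row_major(void) {
    long total = 0;
    for (size_t r = 0; r < GRID_N; r++)
        for (size_t c = 0; c < GRID_N; c++)
            total += grid[r][c];
    return total;
}

/* Cache-hostile: each access lands GRID_N ints away from the previous one. */
long sum_col_major(void) {
    long total = 0;
    for (size_t c = 0; c < GRID_N; c++)
        for (size_t r = 0; r < GRID_N; r++)
            total += grid[r][c];
    return total;
}

/* Demo: both traversals must agree on the answer. */
int demo_sums_match(void) {
    fill_grid();
    return sum_row_major() == sum_col_major();
}
```

Same result, different memory access pattern. Wrap each call in a timer on your own machine to see what the cache thinks of it.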
5) Security: Know the Exploits Before They Know You
Buffer overflows. ROP chains. Shellcode injection. Stack smashing.
These aren’t abstract bugs — they’re assembly-level manipulations of memory and control flow.
When you read disassembled malware or debug a crash, you’ll see the exploit happening in real time — only if you know assembly.
🛡️ Modern OS security starts with assembly: you can’t defend or patch what you don’t understand.
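To make the buffer overflow idea concrete, here is a sketch of the check that a classic overflow is missing (copy_checked and demo_copy_checked are invented names). An unchecked strcpy into a small stack buffer keeps writing past the end, into whatever sits next to it in the stack frame. This version measures first and refuses.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst only if it fits, including the terminating '\0'.
   Returns 0 on success, -1 if the copy would overflow dst. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    size_t need = strlen(src) + 1;
    if (need > dst_size) return -1;   /* an unchecked strcpy would write past dst here */
    memcpy(dst, src, need);
    return 0;
}

/* Demo: a short string fits, a long one is rejected. */
int demo_copy_checked(void) {
    char small[8];
    int fits     = copy_checked(small, sizeof small, "short");         /* 6 bytes */
    int too_long = copy_checked(small, sizeof small, "far too long");  /* 13 bytes */
    return fits == 0 && too_long == -1;
}
```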
6) TL;DR:
Learning assembly is like opening the back door into the OS. You see the CPU, memory, and system calls naked — no high-level sugarcoating. It's not optional if you want to:
Build an OS.
Write efficient kernel modules or drivers.
Reverse engineer and debug at the lowest levels.
Truly grasp how programs run and interact with hardware.
Assembly makes you dangerous (in a good way). 🔧💣
ONE-TO-MANY RELATIONSHIPS (High-Level vs. Low-Level)
What does one-to-many mean?
When you write a single line of high-level code (like in C, Java, or Python), that line may turn into multiple low-level instructions when compiled. That’s the one-to-many relationship in action.
Example: a C loop such as for (i = 0; i < 10; i++).
You see one loop? The CPU sees:
Set i = 0
Compare i < 10
If not true → jump out
Execute body
Increment i
Jump back up
= multiple machine instructions.
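The steps above can be written out in C itself, using goto to stand in for the CPU's jump instructions. This is a sketch of the idea, not the exact code any particular compiler emits; the function names and the loop body (adding i to a total) are made up for the example.

```c
/* The high-level version: one line of loop header. */
int sum_with_for(void) {
    int total = 0;
    for (int i = 0; i < 10; i++) total += i;
    return total;
}

/* The same loop, spelled out step by step the way the CPU sees it. */
int sum_lowered(void) {
    int total = 0;
    int i = 0;                  /* set i = 0 */
check:
    if (!(i < 10)) goto done;   /* compare i < 10; if not true, jump out */
    total += i;                 /* execute body */
    i++;                        /* increment i */
    goto check;                 /* jump back up */
done:
    return total;
}
```

Both functions return 45. The second one is just the first with the abstraction peeled off.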
🧩 Why?
High-level languages are like giving directions in full English:
“Drive 5 blocks, turn left, stop when you see the red house.”
Assembly and machine code are like telling a robot:
Move forward 5 units
Rotate 90°
Evaluate sensor for red pixel density
Halt if threshold reached
📌 So yeah: high-level is for humans. Machine code is for robots. Assembly is the go-between that speaks human-ish robot.
🌍 Portability
Portability = write once, run (almost) anywhere.
This means your program doesn't depend on the quirks of a specific CPU, OS, or hardware. The more portable your code is, the less painful it’ll be to move it between machines or systems.
Languages Known for Portability:
Java: Compile once, run anywhere (thanks to the JVM).
Python: Interpreted on any system with Python installed.
C++: Portable if you avoid system-specific stuff (e.g., Windows-only libraries).
C: Portable as source code across nearly any processor or operating system ever created, as compilers for C exist on almost every platform. Code must be recompiled for each target system, but the language itself offers the widest hardware reach, provided the code avoids platform-specific libraries.
C#: Offers binary portability via the .NET runtime, similar to how Java uses the JVM. With the modern cross-platform .NET Core (now just .NET), the same compiled intermediate bytecode can run on Windows, Linux, and macOS.
JavaScript: Achieves arguably the broadest practical reach due to its ubiquity as the language of web browsers, which are available on nearly all devices. Server-side platforms like Node.js also allow it to run on various operating systems, requiring the runtime environment to be installed on the host machine.
Go: Known for its simple, built-in cross-compilation capabilities. You can compile the same source code for different target operating systems (e.g., Windows, Linux, macOS) from your development machine with a simple command, producing a statically-linked native executable that includes its runtime and has minimal external dependencies.
Rust: Provides strong source-level portability across major platforms (Linux, Windows, macOS) and numerous architectures. It compiles to native machine code and benefits from a mature toolchain (LLVM) that supports a vast number of targets, allowing for performance comparable to C and C++ while maintaining safety guarantees.
PHP: As an interpreted scripting language (typically for server-side use), it offers high source-level portability. The same PHP code can run on any major operating system that has a PHP interpreter installed.
But...
Assembly is NOT portable ❌. It’s written specifically for one CPU’s instruction set architecture (ISA). That means:
x86 Assembly won't run on ARM
Motorola 68k Assembly won’t run on VAX
Even x86 Assembly might differ a bit between 16-bit, 32-bit, and 64-bit modes
It’s like writing music for a piano and trying to play it on a trumpet: the notes don’t match.
💥 Key takeaways — as notes (high-level vs assembly)
1. Instruction Mapping:
In high-level languages, one line of code can turn into many machine instructions.
Example: for (i = 0; i < 10; i++) becomes multiple steps like init, compare, increment, jump, etc.
In assembly, each instruction is usually a direct 1-to-1 match with the CPU's machine instructions.
Example: MOV AX, BX → 1 machine instruction.
2. Portability:
High-level languages like C++, Python, Java are portable.
You can write once and run on many systems (as long as you don’t use OS-specific hacks).
Assembly is not portable. It’s tightly linked to the CPU architecture it was written for.
Write for x86, and it won’t work on ARM, VAX, or Motorola 68k. Each has its own assembly language!
3. Syntax & Readability:
High-level languages are human-readable and abstract. You work with concepts like variables, loops, objects.
Assembly is low-level and technical. You deal with registers, memory addresses, and CPU-specific operations.
High-level: printf("Hello")
Assembly: push the address of "Hello" → call write_string → interrupt or syscall
4. Purpose:
High-level is for writing apps, websites, APIs, stuff normal devs do.
Assembly is for low-level optimization, OS development, reverse engineering, malware analysis, and hardware hacking.
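Takeaway 1 can be checked by hand. The sketch below (enum reg16 and encode_mov_r16 are hypothetical names) builds the bytes for a 16-bit register-to-register MOV the way an assembler would in 16-bit mode, using the `MOV r/m16, r16` form: opcode 0x89 followed by one ModRM byte that names the two registers.

```c
#include <stdint.h>

/* 16-bit general-purpose register numbers as used in x86 ModRM bytes. */
enum reg16 { REG_AX = 0, REG_CX = 1, REG_DX = 2, REG_BX = 3 };

/* Encode `MOV dst, src` for two 16-bit registers (16-bit mode, opcode 0x89 form).
   Returns the two bytes packed as (opcode << 8) | modrm. */
uint16_t encode_mov_r16(enum reg16 dst, enum reg16 src) {
    uint8_t opcode = 0x89;                                /* MOV r/m16, r16 */
    uint8_t modrm  = (uint8_t)(0xC0 | (src << 3) | dst);  /* mod=11: register to register */
    return (uint16_t)((opcode << 8) | modrm);
}
```

encode_mov_r16(REG_AX, REG_BX) gives 0x89D8: one mnemonic, one two-byte machine instruction. A real assembler may pick the equivalent 0x8B form instead, so the mapping is "one instruction in, one instruction out", not always one unique byte pattern.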