The Skyscraper Inside Your Program

srinath shrestha

For any program that is running, the operating system gives it a slice of memory to work with. What the program sees is not the raw physical RAM, but a virtual address space — a private, continuous range of addresses that makes memory look clean and isolated.

On a typical 64-bit system this address space is enormous: 64-bit pointers could in theory address 16 EiB, far more than the physical RAM in any machine. In practice, most current x86-64 CPUs implement 48-bit virtual addressing, which gives 256 TiB of virtual address space (usually split between the kernel and the process), still far larger than the RAM in any computer you can afford. The OS and the CPU's memory-management unit (MMU) work in tandem to map virtual addresses to real memory pages as needed. The catch is that only the portions actually used are backed by physical RAM; the rest may be mapped lazily, shared with other processes, or moved to disk through paging when memory pressure rises.

A virtual address itself is split into two logical parts. The upper bits form the virtual page number, which the hardware translates into a physical frame number. The lower bits are the page offset, which stays the same during translation and identifies the exact byte within that page. This paging mechanism lets programs see a smooth, continuous memory space while the OS quietly manages where data actually lives.

Inside that virtual address space, memory is divided into regions that serve different roles. You’ll commonly see:

  • a text (code) segment containing executable instructions
  • a read-only data section for constants and string literals
  • a data segment for global and static variables
  • a heap, used for dynamic allocation and able to grow during runtime
  • a stack, used for function calls and local variables

This is a simplified view; the actual layout you see in /proc/<pid>/maps is more granular.

If you are wondering why these regions exist: they let the OS and hardware enforce protection, manage growth, and optimize memory use.

Traditionally, the stack and heap grow toward each other in the address space: the stack typically starts near the high addresses and grows downward, while the heap starts lower and grows upward. The stack size is limited (by default often 8 MB for the main thread on Linux and 1 MB on Windows), whereas the heap can expand as needed within process limits.

If you're wondering why memory is organized into sections and how the OS manages this virtual space, that's a separate rabbit hole. You can explore it by reading about the ELF file format, comparing Clang and GCC outputs, and learning how linkers and loaders work. I am exploring those topics too, and soon I'll write a fun, easy-to-understand article without compromising on the details. And if you are wondering, "Hey, why don't you know these already?", then listen: topics and concepts in CS are like fractals. The more you dig in, the more there is to dig.

Apart from that, if you want the hardware side of address translation, paging, and the memory hierarchy, that falls under Computer Organization and Architecture (COA); a clear overview is available here.

When talking about stacks, we can't leave out function calls, right? When a function is called, a new stack frame is created in memory to store the function's parameters, local variables, and other bookkeeping info. When the function returns, its entire stack frame is deallocated.

Now here’s something subtle but important: when a function returns, its stack memory doesn’t disappear — it simply becomes available for reuse. If you keep a pointer to that memory, you’re pointing to something that no longer belongs to you.

c
#include <stdio.h>

typedef struct {
  int x;
  int y;
} coord_t;

coord_t *new_coord(int x, int y) {
  coord_t c;      // lives on the stack
  c.x = x;
  c.y = y;
  return &c;      // returning address of a local variable [X]
}

int main() {
  coord_t *c1 = new_coord(10, 20);
  coord_t *c2 = new_coord(30, 40);
  coord_t *c3 = new_coord(50, 60);

  printf("%d %d\n", c1->x, c1->y);
  printf("%d %d\n", c2->x, c2->y);
  printf("%d %d\n", c3->x, c3->y);
}

Here is what's going on: c lives inside the stack frame of new_coord(). When the function returns, that frame is popped and the memory becomes free for reuse. On a simple unoptimized build, each new call reuses the same stack slot, so all three pointers end up referring to the same location, which finally contains 50, 60. Strictly speaking, reading through these pointers is undefined behavior, so the output is not guaranteed; some compilers warn about the return and may even hand back a null pointer, in which case the program crashes instead. This is called a dangling pointer: a pointer that still holds an address, but the object it pointed to no longer exists. Basically, stack objects die when the function returns, and pointers to them become ghosts.

Back to our low-level discussion: we have about 8 MB of stack space in our virtual address space, and local buffers live on the stack too. Writing past their boundaries can corrupt adjacent stack data, which is essentially a buffer overflow.

Speaking of overflows, there is a very common bug, or I should say there was one, in a function called gets() from <stdio.h> in the C standard library (gets() was removed from the standard in C11, and fgets() is recommended instead). gets() has no idea how large the destination buffer is, so it lets the user write more bytes into a variable than its declared size, and through this an attacker can take full control of the program.

c
#include <stdio.h>

/* Deliberately vulnerable demo. gets() was removed in C11, so this may need
   an older standard mode (for example -std=c99) and will trigger warnings. */
int main(void) {
    volatile int is_admin = 0; // picture this at address 100
    char name[8];              // and this just below it, at address 92
    printf("Enter your name: ");
    gets(name);                // no bounds check: long input runs past name
    printf("\nDEBUG: name is at %p, is_admin is at %p\n",
           (void *)name, (void *)&is_admin);
    printf("DEBUG: is_admin value is currently: %d\n", is_admin);
    if (is_admin != 0) {
        printf("--------------------------------------\n");
        printf("ACCESS GRANTED. Welcome, Administrator.\n");
        printf("--------------------------------------\n");
    } else {
        printf("Access Denied. You are a mere mortal.\n");
    }
    return 0;
}

In this code, the input can overwrite the value of is_admin through a buffer overflow. Whether it actually does depends on how the compiler lays out the locals, which is why the program prints both addresses. It is the same problem as sprintf(), which writes blindly into memory; that is why snprintf(), which takes a size limit n to know its boundary, is recommended instead. Similarly, scanf("%s", name), strcpy(), and even read() come up in the context of buffer overflows.

If you are wondering what bad can happen when the name buffer does overwrite is_admin, look again at the if condition: it guards the admin branch with a simple check that is_admin is non-zero. By overflowing name, the user can flip is_admin and bypass the privilege check. That's not all: an attacker can also overwrite the return address saved on the stack and point it at new machine code. Crazy, right? As I always say, hacking begins when we stop thinking normally and push the limits.

[Figure: stack skyscraper analogy]

To understand this better, imagine the dynamic part of a process's virtual address space as a skyscraper: the high addresses are at the ceiling and the low addresses are at the floor. The stack starts near the ceiling and grows down toward the floor, while the heap grows up from the floor toward the ceiling. By the way, the heap is just a memory region whose size is not fixed up front (in theory it is limited only by process limits), unlike the stack, which defaults to around 8 MB on Linux and 1 MB on Windows. We will discuss heap memory later; for now, let's look at the stack through and through.

Try to imagine this call chain:

scss
main()
  └── greet()

By the time we reach greet(), the stack looks like this:

css
%%---higher_address---%%

[ main return address → runtime ] ← where program goes after main
[ main bookkeeping ]
[ is_admin ]
[ name[8] ]                       ← your buffer

[ greet return address → back to main ]
[ greet bookkeeping ]
[ greet locals ]
%% ---lower_address--- %%

You can see that the buffer lives at the bottom of main's block. I've added one more function to drive home the point that function calls are stacked in memory.

Now, when you type input, the bytes go into name. When you type more than 8 bytes, they don't stop; they just keep going into the next slots, toward higher addresses. Guess what's in the next slots:

css
name → is_admin → bookkeeping → return address

So this is how the overwrite happens. And if you really overthink this like I do, you will reach the conclusion that:

A buffer overflow is not "extra input"; it is writing into memory that belongs to something else.

When a function finishes, the program needs to know where to continue, so it reads the return address stored on the stack and jumps there. If that address is intact, the program continues normally. If it was overwritten, the program jumps to the wrong place. Imagine someone swapping the rug under your feet while you are mid-jump: you land somewhere you never planned to. Okay, that was a weird example, LOL :))

Now an attacker can measure the offset from the start of the buffer to the return address:

css
buffer size = 8
+ saved frame pointer = 8
------------------------
return address at byte 16


/*
In simple builds without protections, the return address often appears just above the saved frame pointer, though the exact offset varies depending on compiler, architecture, and security features.
*/

So the input they enter will look something like this:

css
[ padding bytes ]
[ new return address ]
[ attacker-controlled data ]

Now some caveats. The string the attacker enters has to be one that, when read as raw bytes, acts as machine instructions. And for the CPU to actually execute those bytes, two conditions must hold:

  1. execution is redirected to that memory address (e.g. the return address is overwritten to point there)
  2. the memory region is executable

If the program just takes the input and prints it, the injected bytes are never executed, so technically they can't do any harm.
Now another catch: on modern systems this classic attack usually fails even if you push functions like gets() to prod, because the stack is typically marked non-executable (NX). Even if control jumps there, the CPU refuses to execute it. Heights of technology, right! If an attacker is reading this, they might argue that NX only prevents shellcode injection and that return-to-libc or ROP chains still work. To that I will say: yes, that's true, but those topics are beyond the naive lil scope of this article. Being a lil meta here, aren't we?


Doubts or Queries

This section covers the doubts, misunderstandings, and confusions I had when I was learning these topics, which I figured I'd mention here to sharpen your mental model further.

About volatile:

When compiler optimizations are on, volatile tells the compiler not to look only at the local block and assume: "wait, I don't see anything updating this variable, so I'll assume it never changes and replace it with a constant at compile time". It tells the compiler: "dude, there might be a bigger play here that you don't understand, so calm down!"
In the buffer overflow snippet, nothing updates is_admin legally, so without volatile the compiler could conclude that it is always 0 and optimize the check away.

Some volatile patterns

This is often called the "Const/Volatile Right-Left Rule": read the declaration backward (from right to left) to understand what is happening.

| Code | Read backwards | What it means |
|------|----------------|---------------|
| volatile int *ptr; | "Pointer to an int that is volatile" | The data is volatile. The pointer stays put, but the value it points to (like a hardware timer) changes. |
| int * volatile ptr; | "Volatile pointer to an int" | The address is volatile. The value is stable, but the pointer itself might change (e.g., an interrupt swaps which buffer you are reading). |
| volatile int * volatile ptr; | "Volatile pointer to a volatile int" | Everything is chaos. Both the address and the data can change at any moment. Rare, but real. |

Pro Tip: This works exactly the same way as const. If you master this logic, you master const too.


What is bookkeeping:

"Bookkeeping" is just the extra data a function stores on the stack so it can run and return correctly.

Besides your variables, a stack frame usually needs to remember things like:

  • the previous stack frame location
  • alignment / space reservations
  • saved register values (if the function needs them)
  • temporary storage the compiler uses

The most common piece is the saved frame pointer (a reference that helps the program find its way back to the previous frame).