# Vyper's memory allocator - a deep dive

27/03/2025 · Deep-dive · Estimated reading time: 9 minutes · Author: cyberthirst

This article introduces how the Vyper compiler models and maintains memory (as in the EVM memory location). It explains the memory layout of Vyper functions, how allocation and deallocation of variables take place, and how the calling convention interleaves with memory allocation.

It can help developers understand how to structure their contracts to save gas and how to prevent certain DoS scenarios related to the allocation of `DynArray`s. Further, it's useful material for anyone interested in studying the Vyper compiler - the text contains many references to the Vyper codebase.

## General introduction

The compiler maintains a context, which is used for purposes like variable management, memory allocation, scope handling, and constancy tracking. It's initialized with a reference to a memory allocator. The allocator allocates and deallocates memory, checks bounds, and enforces alignment. The context, amongst other things, has an API for creating and deleting variables, and under the hood these actions interface with allocation and deallocation.

The current release (v0.4.1) doesn't perform sophisticated memory analysis by default (although `--experimental-codegen` offers increasingly more optimizations). As such, it doesn't deallocate memory immediately when a variable is no longer live (although scope-based deallocation is performed), nor does it perform, e.g., aliasing analysis.

## Allocation & deallocation at the level of the EVM

Before we analyze specific allocation strategies and scenarios, let's take a look at what allocation actually looks like in the EVM.

The Vyper memory allocator abstractly models the EVM's memory.
It assigns each variable a certain memory range (note that, compared to Solidity, Vyper defaults to memory instead of the stack). That means the variable is represented as a pointer to the address where it starts, and the allocator guarantees that the next variable will be allocated at `free_mem_ptr` (not the same concept as in Solidity). So, at the EVM level, variables are represented as numbers.

Consider this simple example:

```vyper
if True:
    a: uint256 = 1
```

The allocator assigns the variable `a` an address, say 200. For this address the compiler will generate an `mstore 200 1` to fulfill the assignment. To load the variable, we'll do `mload 200`. The EVM doesn't have a concept of allocation - it automatically expands memory based on which addresses we access. Note that in the EVM the concept of deallocation is non-existent too: EVM memory can't shrink.

## Context and Memory allocator

When the compiler needs to create a new variable, it does so through the `Context` class. The allocation function checks how many bytes the variable's type requires and reserves the memory range through the memory allocator class.

For example, to allocate `array` below, the compiler will reserve 352 bytes (32 for the length + 10*32 for the elements):

```vyper
def foo():
    array: DynArray[uint256, 10] = [1, 2]
```

Different kinds of variables have different scopes (see below). After a variable's scope finishes, the variable is deallocated. Deallocation means that the memory allocator stops reserving the variable's memory range and that the memory range can be reused.

## How are variables allocated?

Variables are allocated in memory - this is also why Vyper doesn't suffer from stack-too-deep errors. We take the type of the variable, retrieve the maximum number of memory bytes it requires, and allocate that many bytes. This takes place only in the abstract memory allocator in the compiler. To actually allocate the memory in the EVM, we have to physically touch the corresponding address.
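The "maximum memory bytes" rule can be sketched with a small helper that computes worst-case buffer sizes. This is an illustrative model, not the compiler's actual API (the real logic lives in Vyper's type classes); the tuple-based type encoding is made up for the example:

```python
WORD = 32  # EVM word size in bytes

def worst_case_size(typ) -> int:
    """Illustrative worst-case in-memory size of a Vyper type.

    Types are modeled as tuples, e.g. ("uint256",), ("sarray", elem, n)
    for static arrays, ("dynarray", elem, bound) for DynArrays."""
    kind = typ[0]
    if kind == "uint256":
        return WORD
    if kind == "sarray":  # static array: n elements, no length word
        _, elem, n = typ
        return n * worst_case_size(elem)
    if kind == "dynarray":  # DynArray: one length word + `bound` elements
        _, elem, bound = typ
        return WORD + bound * worst_case_size(elem)
    raise ValueError(f"unknown type: {kind}")

# DynArray[uint256, 10] reserves 32 + 10*32 = 352 bytes,
# regardless of how many elements it holds at runtime
print(worst_case_size(("dynarray", ("uint256",), 10)))  # → 352
```

Note that the bound in the type, not the runtime length, drives the reservation - which is exactly why generously-bounded `DynArray`s can be expensive.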
If we never touch the addresses at runtime, the variable thus remains allocated only abstractly.

A perhaps surprising side-effect of allocating the maximum number of bytes for each type is that for `DynArray[typ, size]` we will always allocate enough bytes to accommodate the "worst case", i.e., the case when the runtime size matches the size in the type definition.

## Scoping

The following paragraphs discuss the scoping rules of Vyper.

### Block scope

Block-scoped (`if`, `else`, `for`) locals are deallocated at the end of the block scope.

```vyper
if True:
    a: uint256 = 0
    b: uint256 = 1
    c: uint256 = a + b  # <--- dealloc a, b, c
else:
    pass
```

### Internal variable scope

Each statement has its own scope for internal variable management. Internal variables are those that the compiler generates under the hood; they are an implementation detail. For example, the `slice` builtin allocates an internal variable inside which the result of the slicing is stored.

```vyper
def foo(s: String[32]) -> String[5]:
    # 0. allocate buffer for `s`
    # 1. allocate internal buffer for the result of slice(s, 4, 5)
    # 2. copy the internal buffer into `t`
    # 3. after the assign statement is compiled, release the internal buffer (but leave t intact)
    t: String[5] = slice(s, 4, 5)
    return t
```

As can be seen, there is room for optimization - the internal variable isn't strictly necessary and the allocation is redundant (in the example we could store the result of `slice` directly into the target variable). Eliding such memory is an area of further optimizations to be added in the Venom pipeline.

## How are function-level locals deallocated?

They aren't; each function is statically allocated and its function frame is kept for the duration of the whole message call.

## How the allocator handles multiple functions

When compiling a function (whether external or internal), a new `Context` is created. The context is also initialized with a `MemoryAllocator`. This has a special twist - the memory allocator is initialized with the address at which it starts allocating (e.g. it might allocate from address 500 onwards).
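A toy model of such an allocator makes the twist concrete (the names here are hypothetical, not the real `MemoryAllocator` class): it bumps a pointer from a start offset, rewinds on scope-based deallocation, and records the peak usage:

```python
class MemoryAllocator:
    """Toy per-function allocator: bump pointer with a high-water mark.

    This is an illustrative sketch of the behavior described in the text,
    not Vyper's actual implementation."""

    def __init__(self, start_position: int = 0):
        self.start = start_position   # first address this function may use
        self.ptr = start_position     # next free address
        self.peak = start_position    # highest address ever handed out

    def allocate(self, size: int) -> int:
        addr = self.ptr
        self.ptr += size
        self.peak = max(self.peak, self.ptr)
        return addr

    def deallocate(self, addr: int) -> None:
        # scope-based deallocation: rewind the pointer so the range
        # can be reused; the peak is deliberately left unchanged
        self.ptr = addr

    def frame_size(self) -> int:
        # measured from address 0, not from self.start - mirroring the
        # note below that the frame size includes everything up to frame_start
        return self.peak
```

For example, an allocator started at 500 that allocates 32 bytes and then 352 bytes ends up with a frame size of 884, even if the 352-byte range is later deallocated.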
The following few paragraphs will answer the question: how do we get the address with which the memory allocator is initialized?

### Stack frames

A function stack frame is a data structure used by a contract (and created by the compiler) to keep track of information during function execution. In traditional languages, stack frames are often allocated dynamically with function calls (the compiler manages the stack pointer, SP). In Vyper they aren't; they are static. This is made possible by the fact that Vyper doesn't allow recursion. Recursion would require creating stack frames for function calls dynamically, because the actual call graph might depend on runtime values, i.e. it isn't computable at compile time.

In Vyper's context, a stack frame is just a static memory range (a static byte buffer) which the function uses for its memory operations.

### Allocation of stack frames

As allocations and deallocations are performed, the memory allocator notes the highest memory pointer allocated so far. So even though the pointer decreases after a deallocation, we still need to allocate the maximal value to accommodate even the earlier allocations. This maximum is the size of the function's stack frame.

A contract can have multiple functions - so how are individual stack frames mapped to memory? Function calls in Vyper can't form cycles. A call chain always starts with an external function, which can potentially call other internal functions.

To allocate the memory, we take a leaf of the call tree and allocate memory for it. Then we remove the leaf and recursively proceed with the allocation, and so on until we hit the initial external function.
Thus, the external function will use the highest memory addresses.

So, to answer the original question of how to initialize the memory allocator - we take the sizes of the callee frames and compute the maximum:

```python
# calculate starting frame
callees = func_t.called_functions

# we start our function frame from the largest callee frame
max_callee_frame_size = 0
for c_func_t in callees:
    frame_info = c_func_t._ir_info.frame_info
    max_callee_frame_size = max(max_callee_frame_size, frame_info.frame_size)
```

Where is the frame size set? After we finish compiling a function (whether internal or external), we call `tag_frame_info`. This function consults the memory allocator for the current memory size and sets it as the frame size. A bit confusingly, the frame size isn't `memory_size - frame_start`; it includes the addresses up to `frame_start` too.

## How are the function arguments allocated?

Firstly, let's discuss external function arguments. They initially lie in calldata. For each argument, we calculate the calldata pointer to where it starts (different parts of calldata correspond to different arguments). If the argument needs validation, we create a new internal variable and copy into it the validated version of the argument. Otherwise, the variable is represented as a calldata pointer.

Which types don't need validation? Those whose encoding matches Vyper's internal encoding: static arrays and tuples with elements that don't need validation, `(u)int256` (the value can't be too big or too small - any 32 bytes are a valid value for this type), and others (see the `needs_clamp` routine for a full list).

## How are return buffers allocated?

For internal functions, the callers allocate the return buffers. The caller is represented by the `Call` expression. The return buffer is represented by an internal variable (internal as in "implementation detail") which is allocated when parsing the `Call` expression.
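A minimal sketch of this caller-allocated return buffer convention (a toy Python model with made-up addresses, not the actual generated IR):

```python
MEMORY = bytearray(1024)  # toy stand-in for EVM memory

def internal_callee(ret_buf: int) -> None:
    # the callee's `return` statement writes the result
    # into the caller-provided buffer
    MEMORY[ret_buf:ret_buf + 32] = (42).to_bytes(32, "big")

def caller() -> int:
    # the return buffer is an internal variable, allocated by the
    # caller while the Call expression is being compiled
    ret_buf = 64
    internal_callee(ret_buf)  # pass the buffer's address to the callee
    return int.from_bytes(MEMORY[ret_buf:ret_buf + 32], "big")
```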
The address of the return buffer is then passed to the callee, who fills it when executing the `return` statement.

## How are arguments passed during function calls?

Vyper's calling convention is pass-by-value, meaning that all arguments are copied. When allocating memory for a function, all the parameters are allocated at the beginning of the function's frame. During a call, the arguments are therefore copied to this preallocated location. The IR for the copy is created by the `make_setter` compiler routine.

## A few notes on assignments

The frontend of the compiler issues a copy for each assignment. The copy in turn means that there must be a target buffer for the copy (which in turn requires an allocation), and the size of the target buffer is always parametrized by the maximum possible size of the given type. The optimizers, especially Venom, are capable of eliminating some of these copies.

For example, the following contract will force allocation of buffers (in the compiler's frontend) for both `a` and `b`:

```vyper
def foo() -> DynArray[uint256, 1000]:
    a: DynArray[uint256, 1000] = [1, 2, 3]
    b: DynArray[uint256, 1000] = a
    return b
```

It's the optimizer's job to optimize those away.

## How are dynamic arrays allocated?

For local variables, the full potential size of the array is allocated. For the following `array`, 32 + 1000*32 bytes will be reserved in memory:

```vyper
def bar():
    array: DynArray[uint256, 1000] = []
```

If the array comes from calldata (i.e. the variable is an argument of an external function), then it is immediately copied to memory and we no longer work with the calldata pointer but with the internal memory variable. This is convenient because it makes us independent of the ABI encoding, which is variable, and allows us to work with a static pointer.

What does "dynamic" stand for, then?
The length of the array is dynamic (but bounded by the size declared in the type); however, the dynamically-sized array lies in a statically-allocated memory buffer (which is always big enough to contain the maximum length).

## Closing thoughts

We showed how Vyper manages its function frames - how they are constructed and allocated. Local variable allocation was also discussed in detail - we showed how variables are represented and how their liveness is defined. We also discussed how `DynArray` allocations take place.

What wasn't covered is how the memory operations are further optimized in the later stages of the compiler. We focused on outlining how the frontend of the compiler generates the IR, but this IR is further optimized and many of the inefficiencies are removed later.

Interesting topics to cover further are concepts like memory aliasing (more generally, pointer analysis), promoting variables from memory to stack, or how memory operations can be fused together. Some of these are already implemented in the Venom pipeline; others are planned.