Garbage Collection with LLVM

We are researching type-safe garbage collection for memory-safe, capability-based programming languages, and it appears that we need accurate garbage collection.

Goals
=====

The primitive support built into the LLVM IR is sufficient to support a broad class of garbage collected languages including Scheme, ML, Java, C#, Perl, Python, Lua, Ruby, other scripting languages, and more.

> Garbage collection is a widely used technique that frees the programmer from having to know the lifetimes of heap objects, making software easier to produce and maintain. Many programming languages rely on garbage collection for automatic memory management. There are two primary forms of garbage collection: conservative and accurate.

> Conservative garbage collection often does not require any special support from either the language or the compiler: it can handle non-type-safe programming languages (such as C/C++) and does not require any special information from the compiler. The Boehm collector is an example of a state-of-the-art conservative collector.

See also: Using a conservative garbage collector to detect memory leaks.
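To make the conservative approach concrete, here is a minimal sketch of a conservative root scan. The ``HeapRange`` type and the ``conservativeRoots`` function are hypothetical names invented for this note; they are not part of the Boehm collector or of LLVM. The sketch assumes a single contiguous heap and treats the stack as a plain list of machine words.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy heap: one contiguous address range [begin, end).
struct HeapRange {
    std::uintptr_t begin;
    std::uintptr_t end;
};

// Conservative scan: the collector has no type information, so every word
// whose value falls inside the heap range is kept as a *possible* pointer.
std::vector<std::uintptr_t> conservativeRoots(
    const std::vector<std::uintptr_t>& stackWords, HeapRange heap) {
    assert(heap.begin <= heap.end);
    std::vector<std::uintptr_t> candidates;
    for (std::uintptr_t word : stackWords) {
        if (word >= heap.begin && word < heap.end) {
            candidates.push_back(word);
        }
    }
    return candidates;
}
```

Note that an ordinary integer whose value happens to fall inside the heap range is indistinguishable from a pointer here. That is the reason a conservative collector may retain objects that are actually dead.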

> Accurate garbage collection requires the ability to identify all pointers in the program at run-time (which requires that the source-language be type-safe in most cases). Identifying pointers at run-time requires compiler support to locate all places that hold live pointer variables at run-time, including the :ref:`processor stack and registers <gcroot>`.
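As a rough illustration of what "identifying all pointers" means for a single stack frame, the sketch below assumes the compiler has told the runtime which slots of a frame hold live pointers. The ``Frame`` type and ``accurateRoots`` function are hypothetical and only model the idea; LLVM's real mechanism works through its GC intrinsics and computed stack maps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical frame model: raw slot values plus the indices of the slots
// that the compiler reported as holding live pointers.
struct Frame {
    std::vector<std::uintptr_t> slots;
    std::vector<std::size_t> pointerSlots;
};

// Accurate scan: only slots named by the compiler are considered, so
// integers that merely look like addresses are never reported as roots.
std::vector<std::uintptr_t> accurateRoots(const Frame& frame) {
    std::vector<std::uintptr_t> roots;
    for (std::size_t index : frame.pointerSlots) {
        assert(index < frame.slots.size());
        if (frame.slots[index] != 0) {  // skip null pointers
            roots.push_back(frame.slots[index]);
        }
    }
    return roots;
}
```

With a frame whose slots are ``{0x1008, 42, 0x1500, 0}`` and whose pointer slots are ``{0, 3}``, only ``0x1008`` is reported: slot 2 holds an integer that happens to look like an address, and slot 3 is a null pointer.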

After discovering a leading in-memory database that uses LLVM-based query processing to make SQL safe, we read further and plan to add notes here as we get to grips with the topic.

Non-goals
=========

> LLVM does not itself provide a garbage collector --- this should be part of your language's runtime library. LLVM provides a framework for compile-time :ref:`code generation plugins <plugin>`. The role of these plugins is to generate code and data structures which conform to the *binary interface* specified by the *runtime library*.

> This is similar to the relationship between LLVM and DWARF debugging info, for example. The difference primarily lies in the lack of an established standard in the domain of garbage collection --- thus the plugins.

Implementing a collector plugin
===============================

User code specifies which GC code generation to use with the ``gc`` function attribute or, equivalently, with the ``setGC`` method of ``Function``.

> To implement a GC plugin, it is necessary to subclass ``llvm::GCStrategy``, which can be accomplished in a few lines of boilerplate code. LLVM's infrastructure provides access to several important algorithms. For an uncontroversial collector, all that remains may be to compile LLVM's computed stack map to assembly code (using the binary representation expected by the runtime library). This can be accomplished in about 100 lines of code.

> This is not the appropriate place to implement a garbage collected heap or a garbage collector itself. That code should exist in the language's runtime library. The compiler plugin is responsible for generating code which conforms to the binary interface defined by the library, most essentially the :ref:`stack map <stack-map>`.
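To get a feel for what "conforming to the binary interface" involves, the following sketch models both sides of a stack map. Everything in it is an assumption made for illustration: the ``SafePoint`` record, the flat word layout, and the function names are invented here and do not reflect LLVM's actual stack map format or the ``GCStrategy`` API. One function plays the role of the compiler plugin that emits the table, and the other plays the runtime library that reads it back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical safe-point record; NOT LLVM's real stack map layout.
struct SafePoint {
    std::uint32_t returnAddress;            // identifies the call site (toy 32-bit value)
    std::uint32_t frameSize;                // size of the frame in bytes
    std::vector<std::int32_t> rootOffsets;  // frame-relative offsets of live pointer slots
};

// "Plugin side": flatten the safe points into the word layout the runtime expects.
// Assumed layout: count, then per point: returnAddress, frameSize, numRoots, offsets...
std::vector<std::uint32_t> encodeStackMap(const std::vector<SafePoint>& points) {
    std::vector<std::uint32_t> table;
    table.push_back(static_cast<std::uint32_t>(points.size()));
    for (const SafePoint& point : points) {
        table.push_back(point.returnAddress);
        table.push_back(point.frameSize);
        table.push_back(static_cast<std::uint32_t>(point.rootOffsets.size()));
        for (std::int32_t offset : point.rootOffsets) {
            table.push_back(static_cast<std::uint32_t>(offset));
        }
    }
    return table;
}

// "Runtime side": look up the root offsets recorded for one return address.
// Returns an empty vector when the address is not a known safe point.
std::vector<std::int32_t> lookupRoots(const std::vector<std::uint32_t>& table,
                                      std::uint32_t returnAddress) {
    assert(!table.empty());
    std::size_t pos = 0;
    const std::uint32_t count = table[pos++];
    for (std::uint32_t i = 0; i < count; ++i) {
        const std::uint32_t address = table[pos++];
        ++pos;  // skip frameSize, unused by this lookup
        const std::uint32_t numRoots = table[pos++];
        if (address == returnAddress) {
            std::vector<std::int32_t> roots;
            for (std::uint32_t j = 0; j < numRoots; ++j) {
                roots.push_back(static_cast<std::int32_t>(table[pos + j]));
            }
            return roots;
        }
        pos += numRoots;  // move on to the next record
    }
    return {};
}
```

The point of the split is that both sides only have to agree on the table layout. The collector itself lives entirely in the runtime library, which matches the division of labour described above.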

Cool stuff
==========

Apparently we need to implement a plugin to get type-safe garbage collection. We are interested in Dart, ES/JS and Rust, and maybe other cool stuff such as safe queries or other actions.

Also see: LLVM, the power behind Swift, Rust, Clang, HANA and more.