Archive for the ‘Uncategorized’ Category

Hello World in LLVM

January 1, 2011

For some time now, I have been very interested in the compilation process of programming languages and how they are converted to assembly. Naturally, I became very interested in LLVM's portable assembly language (its intermediate representation, or IR) and how its syntax compares to x86 assembly. It turns out to be very readable and understandable, unlike the bare-bones assembly you get from disassembling a program. I found that x86 assembly is actually a lot easier to analyze when it has been generated with LLVM as an intermediary, since the LLVM block labels are emitted as comments in the assembly and the overall structure looks very similar. Let us look at an example of a simple hello world program.


@0 = internal constant [20 x i8] c"Hello LLVM-C world!\00"

declare i32 @puts(i8*)

define void @sayHelloWorld() {
aName:
  %0 = call i32 @puts(i8* getelementptr inbounds ([20 x i8]* @0, i32 0, i32 0))
  ret void
}

Looking through the preceding example, the global @0 is initialized with a string constant. An external function whose body is defined somewhere else is introduced with the declare keyword. A new function, however, is introduced with define, along with its return type, parameters, and body. Each function body consists of at least one basic block, which here carries the label aName. After the block label, the call instruction invokes puts, the externally defined function. The getelementptr instruction computes the address of an element inside an aggregate without reading any memory; here the two zero indices select the first character of the array, giving the i8* that puts expects. The inbounds keyword does not add a runtime check: it tells the optimizer that the computed address stays within the bounds of the underlying object, and the result cannot be relied on (in current LLVM it is a poison value) if it does not. The register %0 receives the result of the puts call, and the function returns void. Now I will present the equivalent x86 assembly.


        .section        __TEXT,__text,regular,pure_instructions
        .globl  _sayHelloWorld
        .align  4, 0x90
_sayHelloWorld:                         ## @sayHelloWorld
Leh_func_begin0:
## BB#0:                                ## %aName
        subq    $8, %rsp
Ltmp0:
        leaq    ___unnamed_1(%rip), %rdi
        callq   _puts
        addq    $8, %rsp
        ret
Leh_func_end0:

        .section        __TEXT,__cstring,cstring_literals
        .align  4                       ## @0
___unnamed_1:
        .asciz   "Hello LLVM-C world!"

Let us look at exactly what is going on. The stack pointer is lowered by 8 bytes; no locals are stored here, the adjustment just keeps the stack 16-byte aligned at the call, as the x86-64 calling convention requires. The address of the string we would like to output is loaded into the %rdi register, which carries the first argument, and _puts is called. The stack pointer is then restored to its original value and we return from the function.

I think my favorite part about the preceding code is that LLVM has left us breadcrumbs which give us insight into how the x86 assembly maps back to the IR. In particular, the comments identify the function @sayHelloWorld, the entry block %aName, and the global @0 for us! Even though this snippet is not complex, for code of greater complexity this information might be very important.

For reference, here is the C code necessary to generate and run the LLVM Hello World example.

What does “P == NP” mean?

December 10, 2010

Many algorithmic problems can be phrased as decision problems, where the answer is simply yes or no. P is the class of decision problems for which an algorithm can find the answer in polynomial time. NP is the class of decision problems for which a proposed solution (a certificate for a yes answer) can be verified in polynomial time. If a problem can be solved in polynomial time, a claimed answer can also be checked in polynomial time, simply by solving the problem again, so P is a subset of NP. However, does this mean that P == NP? There might be problems whose solutions can be verified in polynomial time but which cannot be solved in polynomial time.
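To make "verified in polynomial time" concrete, here is a minimal sketch of a verifier for the Subset Sum problem (given a list of numbers and a target, is there a subset that adds up to the target?). The problem choice and the name verify_subset_sum are my own illustration, not something from a particular library. The certificate is assumed to be a list of positions into the input list.

```python
def verify_subset_sum(numbers, target, indices):
    """Check a claimed Subset Sum solution.

    `indices` is the certificate: positions in `numbers` that are
    claimed to add up to `target`. The check is a few linear passes,
    so it runs in time polynomial in the size of the input.
    """
    # Each position may be used at most once.
    if len(set(indices)) != len(indices):
        return False
    # Every position must actually exist in the input.
    if any(i < 0 or i >= len(numbers) for i in indices):
        return False
    return sum(numbers[i] for i in indices) == target
```

Checking a certificate is cheap; what nobody knows how to do in polynomial time is find one among the 2^n possible subsets.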

Now, what does it mean for a problem to be NP-complete? It is a problem in NP to which every other problem in NP can be reduced in polynomial time, which makes it one of the hardest problems in NP. Its solutions can be verified in polynomial time, but there is no known algorithm that solves it in polynomial time. There may be such an algorithm; it has simply never been discovered. If one is ever found for even one NP-complete problem, then there is one for all of them, and in fact for every problem in NP, which would mean P == NP.
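Boolean satisfiability (SAT) is the classic NP-complete problem. The sketch below, with function names and a clause encoding that are purely my own choices, shows the obvious way to solve it: try every assignment. A clause is assumed to be a list of non-zero integers, where k means "variable k is true" and -k means "variable k is false".

```python
from itertools import product

def satisfies(clauses, assignment):
    """Polynomial-time check: does every clause contain a true literal?"""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, num_vars):
    """Try all 2**num_vars assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=num_vars):
        # Variables are numbered from 1, matching the literal encoding.
        assignment = {i + 1: value for i, value in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None
```

The check inside the loop is fast, but the loop itself may run 2^n times. No known algorithm avoids that kind of blow-up on every input, which is exactly the gap between verifying and solving.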

Therefore, if we know a problem that is already NP-complete, we can use it to prove that another problem is NP-complete as well. The idea is that if the known NP-complete problem can be translated into the new problem efficiently, then the new problem is at least as hard. So if there is a language L and we know language L’ is NP-complete, we construct a transformation function f which maps every input instance x of L’ to an input instance f(x) of L. If an algorithm for L gives a yes/no answer on input f(x), then that is also the correct answer for L’ on input x. The five steps we must perform are:

1. Prove L is an element of NP
2. Select a known NP-complete language L’
3. Describe an algorithm which computes a function f that maps every input instance x of L’ into an input instance f(x) of L
4. Prove that the input x is an element of L’ if and only if f(x) is an element of L
5. Prove that the algorithm computing f runs in polynomial time
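As a small illustration of steps 3 and 4, take L’ to be Independent Set (does the graph have k vertices with no edge between any two of them?) and L to be Vertex Cover (are there k vertices that touch every edge?). A set of vertices is independent exactly when the remaining vertices form a vertex cover, so f can map (G, k) to (G, n - k), where n is the number of vertices. The sketch below uses hypothetical function names of my own, and the two brute-force deciders exist only to check the "if and only if" on tiny graphs; they are not part of the reduction.

```python
from itertools import combinations

def reduce_independent_set_to_vertex_cover(num_vertices, edges, k):
    """f: instance (G, k) of Independent Set -> instance (G, n - k) of Vertex Cover."""
    return num_vertices, edges, num_vertices - k

def has_independent_set(num_vertices, edges, k):
    """Brute force: is there a set of k vertices with no edge inside it?"""
    if k > num_vertices:
        return False
    for subset in combinations(range(num_vertices), max(k, 0)):
        chosen = set(subset)
        if not any(u in chosen and v in chosen for u, v in edges):
            return True
    return False

def has_vertex_cover(num_vertices, edges, k):
    """Brute force: is there a set of at most k vertices touching every edge?"""
    if k < 0:
        return False
    # A superset of a cover is still a cover, so checking one size suffices.
    for subset in combinations(range(num_vertices), min(k, num_vertices)):
        chosen = set(subset)
        if all(u in chosen or v in chosen for u, v in edges):
            return True
    return False
```

Here f only does one subtraction, so step 5 is immediate. Step 1 for Vertex Cover is also easy: given a claimed cover, checking that it touches every edge takes one pass over the edge list.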