Thursday, September 24, 2015

Day 1 - Diving Deep


The key to understanding any language in depth is to find out what its compiler does. 


Some key questions from my understanding so far:

  1. What part of the compiled program is stored in dynamic memory such as the stack and heap? [answered below]
    • How is the rest of the program laid out in RAM? [answered below]
    • I mean, what is the significance of BSS, CS, ES etc.? [answered below]
  2. What is the language paradigm and how did it affect the compiler construction? 
    • What are the primitives, or are there any primitives at all? (I know there will always be primitives, but are they available to the developer, like in Scala?)
    • Are functions objects? 
      • Yes, they are. However, it's a little complicated to visualise this in the head. Let's try to demystify it (be warned that the below is just my assumption).
      • At runtime, function declarations become objects on the heap (it may not be the heap, but the heap is garbage collected and is probably the best candidate).
      • When a function is called in the code, the program counter is pointed at the starting address of the function body and execution resumes there. (Side note: keeping executable code in the ordinary data heap would be both strange and insecure, so I assume the function object on the heap only refers to its code, which the engine keeps in a separate memory area.)
    • What are the various collection types available in the language?
  3. Is there automatic garbage collection?
    • Garbage collection is not necessarily single-threaded: many collectors do part of their work on background threads, and the runtime has to keep that safe. If you are manually tagging the resources to be released instead, making the access and cleanup thread safe is up to you.
    • In JS I assume the engine runs the collector automatically whenever it decides the heap needs cleaning. (process.nextTick only schedules a callback in Node.js; it is not a garbage collection hook.)
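
To make the "functions are objects" point from question 2 a bit more concrete, here is a small sketch in JavaScript. The names greet and counted are made up for illustration; the only thing it tries to show is that a function can carry properties and be passed around like any other object. It says nothing about where the engine actually stores the function.

```javascript
// A plain function declaration.
function greet(name) {
  return "Hello, " + name;
}

// Functions accept properties, just like any other object.
greet.description = "says hello";

// Hypothetical helper: wraps a function and counts how often it is called.
// The counter is stored as a property on the wrapper function itself.
function counted(fn) {
  function wrapper(...args) {
    wrapper.callCount += 1;
    return fn(...args);
  }
  wrapper.callCount = 0;
  return wrapper;
}

const countedGreet = counted(greet);
countedGreet("Ada");
countedGreet("Bob");

console.log(typeof greet);            // "function"
console.log(greet instanceof Object); // true
console.log(greet.description);       // "says hello"
console.log(countedGreet.callCount);  // 2
```

Since the function is just a value, passing greet into counted is no different from passing a number or a string.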


My general doubts: 

  1. How is an error thrown and handled? With interrupts, is it?
    • In C++, for example, typical compilers create a static table that maps code locations to their handler locations. This is produced at compile time. 
    • When an exception is thrown at runtime, no hardware interrupt is involved: the language runtime stops normal execution, searches the table for the appropriate handler, unwinds the stack and runs the handler.
    • In some other implementations (Ada and Python are the examples I came across), the handler location is recorded on the stack when the corresponding protected block is entered. In case of an error, the runtime looks up the handlers on the stack and executes the right one, passing control back to the program.
    • The table approach costs almost nothing while no error occurs, although it eats up some disk space for storing the table. The counterpoint is that an actual throw is slow, since the tables must be searched and the stack unwound. The stack approach pays a small price every time a protected block is entered, even in an error-free run, but finds its handler quickly.
  2. What does type safety do in general?
    • It gives more speed and security if types are checked at compile time.
    • Speed is greater because the type of every value is known at compile time, so illegal operations are rejected before the program runs and the generated code needs few or no type checks at runtime. 
    • With dynamic typing, however, the type is determined only at runtime, so time is spent checking types and throwing errors while the program runs, which impacts performance.
  3. How does an uninitialised variable differ from an initialised variable in its placement in memory?
    • When a program is compiled from source and loaded, its memory is laid out in the following 5 or more segments (the stack and heap only come into being at runtime).
    • Text = where the program counter (or instruction pointer) fetches the instructions to be executed. These instructions refer to the other segments so the processor can find data.
    • Data = where initialised global and static variables reside (constants usually go into a read-only part).
    • BSS = where uninitialised global and static variables reside. These are set to 0 when the program is loaded; C and C++ guarantee this, but initialising explicitly makes the intent clearer.
    • Stack = where the local variables reside. This segment is used as the program runs to store local variables, temporary values and return addresses (the location in the text segment to resume at after a call). Local variables are not initialised until the code does so explicitly.
    • Heap = where ‘malloc’-ed or dynamically created objects reside.
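
Going back to doubts 1 and 2 above, here is a rough sketch of how they look from the JavaScript side. The helper name callIt is made up for illustration. It only shows that a variable's type is decided at runtime and that an illegal operation surfaces as an error that can be caught while the program is running; it does not show how the engine implements the handler lookup internally.

```javascript
// With dynamic typing the same variable can hold values of different types.
let value = 1;
const typeBefore = typeof value; // "number"
value = "one";
const typeAfter = typeof value;  // "string"

// Hypothetical helper: tries to call whatever it is given.
function callIt(maybeFn) {
  try {
    return { ok: true, result: maybeFn() };
  } catch (err) {
    // The illegal call is only detected when the line above actually runs.
    return { ok: false, errorName: err.name };
  }
}

const good = callIt(() => 42);
const bad = callIt("not a function");

console.log(typeBefore, typeAfter); // number string
console.log(good.ok, good.result);  // true 42
console.log(bad.ok, bad.errorName); // false TypeError
```

A statically typed language would reject the second call at compile time; here nothing complains until the call is executed.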


Yet another important point is that JavaScript is not simply interpreted, like we would all have thought. It's compiled just in time: the engine compiles it right before (and while) it runs. This implies the following:

  1. Since JS is compiled to machine code at runtime, no standalone executable is produced ahead of time: every source needs a JS engine alongside it to compile and run it. 
    • This makes sense, since now it's clearer how different browser tabs can have their own engine instance that doesn't interfere with the others.
  2. The console and the interpreter sit on top of the JS engine: whatever you type into the console is evaluated by the same engine. 
  3. Remember, JS is single-threaded. 
    • This implies that the JS source is executed on the same thread that the engine uses for that page, and so is whatever you type into the interpreter or console (which, by the way, is evaluated by the JS engine anyway).
    • So if you write blocking code in the interpreter or in the page, or if the JS engine is flawed, it will block the page (try writing a big for loop on the page or in the interpreter).
    • So within a single turn of the thread's loop, many things occur, such as running source code, event handling, display updates, garbage collection and console evaluation, not necessarily in that order, among many other events. 
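
The blocking behaviour from point 3 can be seen with a small experiment. This is only a sketch: busyWait and the order array are names made up for illustration, and the 50 ms figure is arbitrary. A timer is scheduled to fire as soon as possible, but its callback cannot run until the blocking loop has finished, because both share the one thread.

```javascript
// Records the order in which things actually happen.
const order = [];

// Ask for a callback "as soon as possible".
setTimeout(() => order.push("timer"), 0);

// Hypothetical helper: blocks the thread for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  let spins = 0;
  while (Date.now() < end) {
    spins += 1;
  }
  return spins;
}

order.push("before loop");
busyWait(50); // nothing else can run while this loop spins
order.push("after loop");

// The timer was due long ago, but it has not had a chance to run yet.
console.log(order.join(" -> ")); // before loop -> after loop

// Once the thread is free again, the timer callback finally runs.
setTimeout(() => console.log(order.join(" -> ")), 0);
// before loop -> after loop -> timer
```

In a browser, the page would be unable to repaint or react to clicks during the busyWait call for the same reason.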


While this is not everything about the language or the compiler, this bit of research has certainly shed some light on what's been happening under the covers. Day 2 is going to be more exciting!
