Saturday, April 27, 2013

Reading Notes: an operating system for multicore and clouds

Some of my reading notes for the paper "An operating system for multicore and clouds: mechanisms and implementation"

Things I liked and that were interesting

It was interesting to learn that in future manycore systems the number of cores will exceed the number of processes. In connection with this the authors say that it is necessary to do space multiplexing instead of time multiplexing. I think this is a very big shift in thinking about computing power, as one has to think about how to map cores to processes, instead of time slicing one CPU to achieve multi-tasking.
It was also interesting that the authors presented VMs as a limiting factor. They argue that VMs add a layer of indirection that makes it more difficult to get a global picture of all the resources.

Limitations and Problems I had

I think it was strange that the authors tried to come up with a solution for both multicore and cloud computing. I prefer the approach of Barrelfish, where the researchers only focus on multicore and were thus able to come up with a more thorough solution in my opinion.
I also felt that the topic of security and isolation was left out of the paper. The authors talk about the disadvantages of VMs and state that their system provides a uniform view of all global system resources. However, one of the main benefits of VMs is that they provide protection, and it would have been helpful if the authors had discussed how protection is ensured when there is only a single system image OS. For example, if fos is compromised, are there mechanisms for damage control, or is the attacker inevitably able to access all resources of the whole cloud?

Thursday, April 25, 2013

Reading Notes: multikernel

Some of my reading notes for the paper "The Multikernel: a new OS architecture for scalable multicore systems".

Things I liked and that were interesting

I found it interesting that the authors decided to incorporate ideas from distributed systems and networking into their OS. By regarding each CPU core as an independent unit and using only message passing between cores, they said that they could exploit insights and algorithms from distributed systems.

I also found it interesting that they chose to make the OS structure hardware neutral. At first this statement didn't make much sense to me, because one would think the OS is the most hardware-dependent layer. Then they explained that what they meant was the design decision to separate the OS structure as much as possible from the hardware-specific parts (the messaging transport system and the device drivers). I think if one really succeeded in making the OS as hardware neutral as possible, this would significantly facilitate the OS development process.

Wednesday, April 24, 2013

A Hello-World program in Unix V6 on the SIMH simulator

In my previous post I talked about how to get Unix Version 6 to run in the SIMH PDP-11 simulator. Now that you have Unix V6 up and running, there is some nice hacking you can do. Here's the classic "Hello World" program in retro '70s style.

System setup

At this point you should have successfully started Unix V6 on the SIMH PDP-11 simulator and logged in as root. For instructions on how to do that, read here. Once everything is set up correctly, you should be greeted by the root prompt "#".

These were the times of ed

In Unix V6 we're actually in a pre-vi and pre-emacs world. The standard editor available at that time was ed (a so-called line editor), which you will find even less intuitive to use than vim or emacs.

So here's how to write a Hello-World C program with ed and compile with cc:

Reading Notes: x86 Virtualization

Virtualization has gained a lot of attention in recent years. Here are some of my reading notes on the paper "A comparison of software and hardware techniques for x86 virtualization".

Things I liked and that were interesting
It was interesting to see that, according to this paper's results, software beats hardware. Usually the perception is that hardware is faster, but in the case of virtualization the software approach makes it possible to devise workarounds for the few difficult cases while staying efficient in most normal cases. In the hardware approach, however, the hardware has to be designed to handle all worst-case scenarios (for example, trapping on all privileged state accesses) and is thus difficult to implement.
It was also interesting to see how virtualization was made possible on the x86, which was not originally designed for virtualization. The technical details on binary translation show that it must have been a very difficult engineering process for VMware to make virtualization possible on the popular x86 platform.

Limitations and Problems I had
While the paper shows that software virtualization is still superior right now, binary translation might be in danger of becoming superfluous once hardware manufacturers improve their virtualization support. When hardware natively implements the complicated mechanisms for mapping virtual resources (memory, devices, etc.) to physical ones and handles them with sufficient efficiency and flexibility, there is no need to devise complicated software workarounds. In fact, Intel and AMD added support for MMU virtualization to their new generation of CPUs with virtualization support, and from a recent evaluation paper by VMware one can see that the second-generation CPUs already perform significantly better than the first generation.

Tuesday, April 23, 2013

Running Unix V6 in the SIMH PDP-11 simulator

!!! Attention: this article (and in fact my whole blog) have moved to site: !!!

Just to live out my inner geek, I recently experimented with getting antique Unix versions to run in a simulator (admittedly this was also for a class project). I used the simulators from the SIMH project and got the famous Unix Version 6 running on a PDP-11 simulator.

Related Articles:
Unix in a Nutshell
Prepping for Coding Interviews - Part I
Reading Notes: Ethernet paper by Metcalfe and Boggs

Monday, April 22, 2013

Reading Notes: Data Center TCP (DCTCP)

Another interesting paper I read proposes a new version of TCP specifically targeted at the needs of data centers, called Data Center TCP (DCTCP). Below are some of my reading notes.

Things I liked and that were interesting
What I liked about the implementation is that it requires only very few code changes and only one parameter to adjust. This makes it more convenient for data centers to try out DCTCP. I also found the authors very thorough in analyzing the performance of DCTCP, as they designed and ran many experiments to show its superiority. Finally, I liked that the authors were very clear in stating under which conditions DCTCP offers an advantage and where it doesn't, for example by saying that they “make no claims that DCTCP is fair to TCP”.

Limitations and Problems I had
I found it very hard to discover limitations in this paper, since I feel the authors put a lot of thought into it and it was published at SIGCOMM. One issue the authors mention is the problem of synchronization between flows that is caused by the “on-off” style marking of the packets. Another potential point could be the isolation between TCP and DCTCP. The authors say that in data centers TCP and DCTCP flows can easily be separated, since load balancers and application proxies separate internal and external traffic. It could be possible that problems arise in an actual data center environment when TCP and DCTCP packets cannot be properly distinguished (should one set a specific flag?) or need to be converted to one another.
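
To make the "on-off" marking more concrete, here is a minimal sketch of the mechanism as I understand it from the paper: the switch marks a packet whenever the queue is longer than a threshold K, and the sender keeps a running estimate alpha of the fraction of marked packets and cuts its window in proportion to it. The struct and function names are made up for illustration, and the window is tracked as a plain number of packets. This is not the authors' code.

```c
/* Sender-side state (hypothetical names, for illustration only). */
struct dctcp_state {
    double alpha;  /* running estimate of the fraction of marked packets */
    double cwnd;   /* congestion window, in packets */
};

/* Switch side: "on-off" marking. A packet is marked if the queue
 * occupancy on arrival exceeds the threshold K, otherwise it is not. */
static int dctcp_should_mark(unsigned queue_len_pkts, unsigned k)
{
    return queue_len_pkts > k;
}

/* Sender side, called once per window of data:
 *   F     = marked / acked
 *   alpha = (1 - g) * alpha + g * F
 *   cwnd  = cwnd * (1 - alpha / 2)   (only if marks were seen)
 * g is the weight given to the newest sample. */
static void dctcp_on_window_end(struct dctcp_state *s, unsigned acked,
                                unsigned marked, double g)
{
    double f = acked ? (double)marked / (double)acked : 0.0;

    s->alpha = (1.0 - g) * s->alpha + g * f;
    if (marked > 0)
        s->cwnd *= 1.0 - s->alpha / 2.0;
}
```

With alpha close to 1 (every packet marked) the window is halved like in regular TCP, while a small alpha only leads to a small reduction.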

UDP header explained

A UDP header is used when sending data via the User Datagram Protocol (UDP) in the Transport Layer, as opposed to sending via TCP (Transmission Control Protocol). UDP is connectionless, doesn't guarantee reliable data transfer and has a smaller overhead than TCP. The UDP header is only 8 bytes long (as compared to 20 bytes for a TCP header). Below is the header structure:

UDP header

  • Source Port: the sender's port, to which replies should be sent. Set to zero if not used.
  • Destination Port: port number of the receiver. While the source port is optional, this one is required.
  • Length: length of the entire UDP datagram in bytes, namely 8 bytes for the header + size of the payload. Outer headers, like the IP header, are not counted towards it.
  • Checksum: calculated over a pseudo-header built from the IP header (source address, destination address, protocol and UDP length), the UDP header and the data, with the checksum field set to zero during the calculation. If no checksum is generated, the checksum field should be all zero. However, it's recommended to use a checksum, and IPv6 in fact requires it.
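
The four fields above can be sketched in C as follows. This is only an illustration under a few assumptions: all names (udp_header, udp_header_pack, udp_checksum_ipv4, and so on) are made up, the code handles IPv4 only, and the checksum follows my reading of RFC 768, which also includes a 12-byte pseudo-header taken from the IP layer in the sum.

```c
#include <stddef.h>
#include <stdint.h>

enum { UDP_HEADER_LEN = 8, UDP_PROTO_NUM = 17 };

/* The four 16-bit header fields, held in host byte order. */
struct udp_header {
    uint16_t src_port;  /* 0 if unused */
    uint16_t dst_port;
    uint16_t length;    /* 8 header bytes + payload bytes */
    uint16_t checksum;  /* 0 means "no checksum" (IPv4 only) */
};

/* Serialize the header into buf in network (big-endian) byte order. */
static void udp_header_pack(const struct udp_header *h, uint8_t *buf)
{
    const uint16_t f[4] = { h->src_port, h->dst_port, h->length, h->checksum };
    for (int i = 0; i < 4; i++) {
        buf[2 * i]     = (uint8_t)(f[i] >> 8);
        buf[2 * i + 1] = (uint8_t)(f[i] & 0xff);
    }
}

/* Add n bytes to a running sum of big-endian 16-bit words. */
static uint32_t sum_words(const uint8_t *p, size_t n, uint32_t acc)
{
    for (; n > 1; p += 2, n -= 2)
        acc += ((uint32_t)p[0] << 8) | p[1];
    if (n == 1)
        acc += (uint32_t)p[0] << 8;  /* odd trailing byte is padded with zero */
    return acc;
}

/* Folded ones' complement sum over the IPv4 pseudo-header + UDP datagram. */
static uint16_t udp_sum_ipv4(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                             const uint8_t *datagram, uint16_t len)
{
    const uint8_t pseudo[12] = {
        src_ip[0], src_ip[1], src_ip[2], src_ip[3],
        dst_ip[0], dst_ip[1], dst_ip[2], dst_ip[3],
        0, UDP_PROTO_NUM, (uint8_t)(len >> 8), (uint8_t)(len & 0xff)
    };
    uint32_t acc = sum_words(pseudo, sizeof pseudo, 0);

    acc = sum_words(datagram, len, acc);
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);  /* end-around carry */
    return (uint16_t)acc;
}

/* Value for the checksum field; bytes 6-7 of datagram must be zero. */
static uint16_t udp_checksum_ipv4(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                                  const uint8_t *datagram, uint16_t len)
{
    uint16_t c = (uint16_t)~udp_sum_ipv4(src_ip, dst_ip, datagram, len);
    return c == 0 ? 0xffff : c;  /* a computed zero is sent as all ones */
}

/* Receiver-side check: the sum including the checksum field is all ones. */
static int udp_checksum_ok(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                           const uint8_t *datagram, uint16_t len)
{
    return udp_sum_ipv4(src_ip, dst_ip, datagram, len) == 0xffff;
}
```

A sender would pack the header with the checksum set to 0, append the payload, call udp_checksum_ipv4 over the whole datagram and write the result into bytes 6 and 7 of the header.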