About Me

Tuesday, June 25, 2013

IPC device using Linux Kernel "completions"

The Linux kernel provides a service named "completions" for notifying another process that a particular task has been completed. Here is the source code for a device driver for a simple Inter Process Communication (IPC) device that can be used to send a string of 4 characters between two processes.


Here are some details about "completions":

  • Completions are a technique that one process can use to convey to another process that a particular task has been completed.
  • <linux/completion.h> must be included to use this.
  • A completion can be declared by DECLARE_COMPLETION(my_completion)
  • If it has to be created and initialized dynamically, then we may use the following method:
struct completion my_completion;
init_completion(&my_completion);
  • Waiting can be done by wait_for_completion(struct completion *c)
  • It performs an uninterruptible wait, and if nobody ever issues a complete(), the result is an unkillable process.
  • Completion can be notified by
    • void complete(struct completion *c)
    • void complete_all(struct completion *c)
    • The two functions above behave differently if more than one process is waiting.
    • In that case complete() will notify only one process, but complete_all() will notify all the waiting processes.
    • The structure can be reused without any problem unless complete_all() is used. If complete_all() is used, then it has to be re-initialized in the following way: INIT_COMPLETION(struct completion c)
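The points above can be put together in a small kernel-side sketch (the names my_completion, dyn_completion, producer and consumer are illustrative, not from the original driver):

```c
#include <linux/completion.h>

/* Static declaration and initialization in one step. */
static DECLARE_COMPLETION(my_completion);

/* Dynamic creation: declare the struct, then initialize it
 * explicitly, e.g. in the driver's init path. */
static struct completion dyn_completion;

static void setup(void)
{
	init_completion(&dyn_completion);
}

/* Waiting side: blocks uninterruptibly until someone completes. */
static void consumer(void)
{
	wait_for_completion(&my_completion);
}

/* Completing side. */
static void producer(void)
{
	complete(&my_completion);	/* wakes exactly one waiter */
	/* complete_all(&my_completion);  would wake every waiter */
}
```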

The Device

The structure of the device is simple. Here is the write operation.
The write operation simply copies the data from user space to an array in kernel space. Then it issues a complete(). This is, in effect, a declaration that the particular task associated with sig has been completed, and a process waiting on sig can proceed.
The read operation is as follows.
The read operation, on the other hand, waits on sig using wait_for_completion(). So when another process issues a complete() on sig, the reader can proceed.
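The two paths just described could be sketched roughly as below (msg is an illustrative buffer name; sig and the 4-character size come from the description above, and most error handling is trimmed):

```c
#include <linux/fs.h>
#include <linux/completion.h>
#include <linux/uaccess.h>

#define MSG_LEN 4

static char msg[MSG_LEN];
static DECLARE_COMPLETION(sig);

/* Non-blocking: store the 4 bytes, then announce completion. */
static ssize_t ipc_write(struct file *filp, const char __user *buf,
			 size_t count, loff_t *pos)
{
	if (count != MSG_LEN)
		return -EINVAL;
	if (copy_from_user(msg, buf, MSG_LEN))
		return -EFAULT;
	complete(&sig);		/* a waiting reader may now proceed */
	return MSG_LEN;
}

/* Blocking: sleep until a writer issues complete() on sig. */
static ssize_t ipc_read(struct file *filp, char __user *buf,
			size_t count, loff_t *pos)
{
	if (count != MSG_LEN)
		return -EINVAL;
	wait_for_completion(&sig);
	if (copy_to_user(buf, msg, MSG_LEN))
		return -EFAULT;
	return MSG_LEN;
}
```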
The device has the following behavior
  • Read and write can only be done with a data size of 4 characters.
  • The data written to the device persists until a read operation is done on the device. The write operation is non-blocking.
  • A read operation on the device has to wait until something is written to the device. Hence the read operation is blocking.
Reference: LDD 3e, Chapter 5

Saturday, May 11, 2013

HIGHMEM and Memory Zones from a newbie perspective


Earlier Linux kernels (about 10 years ago) were unable to support more than 1GB of physical memory on 32-bit x86 systems. This was attributed to the way in which the VM was arranged. In earlier 32-bit Linux systems with a 3:1 split, the first 3GB of the VM was the user address space and the remaining 1GB was kernel address space. The 1GB allocated to the kernel can be mapped to any part of physical memory. If the kernel has to access any page in physical memory, then that page has to be mapped into its address space. Earlier kernels used static mapping, and hence with a 1GB address space the kernel could only map 1GB of physical memory. Any physical memory beyond 1GB could thus never be accessed by the kernel. So the earlier Linux kernels were restricted to using 1GB of physical memory on 32-bit x86 systems.

The solution that was devised for this problem was high memory, or simply HIGHMEM. In this strategy we divide the physical memory into high memory and low memory. Low memory still uses the permanent mapping: certain pages in physical memory are always mapped into the kernel address space, and those pages are called low memory. The remaining physical memory is called high memory, and the kernel reaches it through temporary mappings made in the rest of the kernel address space. That is, the pages are mapped as and when required. This enables the kernel to access any page within the 4GB range. Kernel data structures like linked lists must live in low memory itself. This helps in maintaining pointer consistency.


In the Linux kernel the page frames are grouped logically into three zones: ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM.
There are some reasons in the Linux kernel for dividing physical memory this way, as explained next. In some architectures there is a constraint on which range of memory can be used for certain operations. An example can be given from the x86 architecture: ISA devices are only capable of addressing the first 16MB of RAM. This implies that for DMA operations, ISA devices can only use the first 16MB of RAM. Some architectures do not have this constraint (example: PPC). Besides this, we have to accommodate the HIGHMEM solution. With the division of memory, managing memory also becomes easier, with each zone having a struct to store its details.
Above I mentioned that ISA devices on x86 have a constraint in memory addressing. So on the x86 architecture, ZONE_DMA is the group of pages that belong to the first 16MB of RAM. Since PPC does not have this constraint, on PPC ZONE_DMA is empty.

On the x86 architecture, the first 896MB of page frames are permanently mapped into the kernel address space. Stated otherwise, the first 896MB of kernel address space is mapped to the first 896MB of physical memory. That leaves 128MB of unmapped addresses in the 1GB kernel address space. These unmapped addresses can be used to make the temporary mappings from high memory. The union of ZONE_DMA pages and ZONE_NORMAL pages gives us the low memory pages. If ZONE_NORMAL pages are not available for allocation, as a last resort the kernel can use ZONE_DMA, but not vice versa.

ZONE_HIGHMEM is the collection of high memory pages. On 32-bit x86 that will be any memory above 896MB. These pages are not permanently mapped into the kernel address space, as explained in the HIGHMEM section.
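To make the temporary mappings concrete, a kernel function can map a high-memory page on demand with kmap() and release the mapping with kunmap(); the mapping lives in the unmapped 128MB window mentioned above. This is only an illustrative sketch (touch_page is my own name, not from any source):

```c
#include <linux/highmem.h>
#include <linux/mm.h>

/* Temporarily map a (possibly high-memory) page, touch it, unmap it. */
static void touch_page(struct page *page)
{
	char *vaddr;

	vaddr = kmap(page);	/* create a temporary kernel mapping */
	vaddr[0] = 0;		/* the kernel can now access the page */
	kunmap(page);		/* release the mapping slot */
}
```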

Sources:
[1] Linux Device Drivers, 3rd ed., Corbet et al.
[2] Linux Kernel Development, Robert Love
[3] http://lwn.net/Articles/75174/

Wednesday, July 18, 2012

Need for Shell Built ins

Usually a command is executed by the shell by spawning a new process. Though this strategy is effective in dealing with usual commands, it turns out to be ineffective in dealing with certain others, especially ones that change the behavior of the shell itself. An example of such a command is the cd command. As you know, cd is the command used to change the current working directory of a shell.

Consider that we decided to adopt the conventional method for implementing the cd command, i.e. we spawn a new process whenever the cd command is encountered. If you were to explore the Linux system call API, you would find that the only system call available for changing the directory is chdir(). The problem with chdir() is that it can only change the current working directory of the current process. Thus if we implement cd as a separate process, it will have no effect on our main process, the shell. That's the stage at which shell built-ins kick in. Built-ins are nothing but commands whose implementation is the task of the shell itself. So whenever a built-in command is encountered, it is executed by the shell itself rather than as a separate process.

The program below is the source code for a minimal shell that implements cd as a built-in.
Note: The minimal shell accepts a maximum of only one argument for any given command, including switches. Also, the only commands supported are the ones in the /bin directory, along with exit and cd.

Monday, June 04, 2012

Kruskal's Algorithm, Simplified

Note: This article is intended for readers who can at least carry out Kruskal's algorithm manually.

Kruskal's algorithm is an algorithm for building a Minimum Spanning Tree (MST) from an undirected graph. The algorithm can be summarized as "go on building the MST by adding the edges in ascending order of their cost, such that they do not create cycles". This can be explained in steps as:-
1. Identify the next minimum edge.
2. See whether that edge creates a cycle if added to the MST.
3. If it doesn't, add it to the MST; else discard it and repeat step 1.
4. Go to step 1 if the MST is not complete.

Consider that we are using the adjacency matrix for representing the graph. Then Step 1 is the easiest to carry out and Step 2 is the hardest. It is Step 2 that I want to discuss in detail. Step 1 can be carried out simply by constructing nested loops. At the end of Step 1, we have the next minimum cost and the names of the nodes connected by the minimum cost edge.

Now comes the trickiest part. Given an edge, how will we find out whether that edge creates a cycle when added to the MST? This requires a clearer understanding of what a cycle is.

Identifying Cycles

Cycles occur when there are multiple paths from one node to another node. So before adding an edge (x,y) to the MST, we have to see whether a path already exists between nodes x and y. A path between nodes x and y can take a number of forms. Examples are:-
1. (x,z) (z,y)
    x is connected to z, and z is in turn connected to y, thus creating a path between x and y.
2. (x,z) (z,a) (a,y)
    This can be explained similarly to example 1.

Explanation with an Example

Consider the below graph
The adjacency matrix representation of the above graph is given as follows:-
Note: The blank spaces can be regarded as infinity, which denotes that the corresponding nodes are not connected. Also, the above matrix is a symmetric matrix.

Now we start applying the Kruskal's algorithm manually.
The first minimum cost edge is (1,3) or (3,1). We will add it to the MST since it won't create any cycles, for obvious reasons. Similarly, by manual evaluation, (4,6), (2,5) and (3,6) are added to the MST. Now the adjacency matrix of the graph would look as below:-
Note: For each edge added to the MST, the corresponding column is made blank (or infinity).

Now our MST graph would look as below:-
The adjacency matrix for the same is given by:-
The next minimum cost edge is (1,4), with a cost of 5. But by manual evaluation we know that this edge will result in a cyclic graph when added to the MST. Now we will see how to infer, from the adjacency matrix of the MST, whether an edge will create a cycle. We need to see whether there exists a path between 1 and 4. We follow the steps below:-
A. Take the columns of the first row. Name them (1,k) where k = 1 to n (n is the number of nodes).
B. Select the non-blank columns among (1,k) and name them (1,nb). If any of the nb = 4, then we can straight away say that there is a path between 1 and 4. If none among (1,nb) has nb = 4, then we see whether there exists a path between (nb,4) by applying the same procedure as Step A. Thus this goes on recursively.
Based on this we develop a noPath() procedure, which takes the nodes and the MST matrix as arguments and returns 1 if there is no path between them; else it returns 0.

procedure noPath(int node1, int node2, int nodePrev, int n, int MST[][])
        int k, ret = 1
        for k in range 1 to n do
                if MST[node1][k] is not blank then
                        if k equals node2 then
                                return 0        -- direct edge: a path exists
                        else if k not equal node1 and k not equal nodePrev then
                                ret = ret AND noPath(k, node2, node1, n, MST)
        return ret
end {noPath}

Note: The variable nodePrev has been used to avoid searching in the reverse direction. For example,
after having searched through edge (1,6), we do not want to turn back and search (6,1).

With the above algorithm, I hope I have covered the most important aspect of Kruskal's algorithm. You may see a real C program implementing the Kruskal algo here.