Hakuna Matata: post1

Showing posts with label post1. Show all posts

Thursday, May 22, 2014

Software Defined Networking

One things many articles on SDN have in common is, the fact that they are littered with technical terms It is often simpler to glance through these terms, to ease the understanding of the concepts elucidated to in further text.

control plane: It is a logical plane where the execution decisions are taken at a higher level, like taking routing decisions. i.e. it is the brain of the network

data plane: It is the logical plane where actual execution is done at the lower lever, like forwarding packets based on forwarding table. i.e. where the actual function is carried out, is implemented.

mininet: It is a simulator for SDN networks. It is an upgrade over NetSim where we could simulate simple networks via GUI help. In mininet, we have openFlow and SDN support. Simulation environment is set up by writing code and not via GUI.

POX: It is a python framework for configuring a controller. In it every network entity, i.e. 'web server, router, switch' is an component. We create components to introduce entities in our network to be simulated.

OpenFlow: it is a very popular protocol used to communicate between a controlling server and a OpenFlow enabled switch. Using API of OpenFlow, we communicate between server which runs control plane and the actual switches which implement data plane. It is also used to communicate between switches.

dpctl: it is a utility by which we can control the flow aspects for a single switch. It is used for debugging and experimenting with the flow table of the switch. It comes along with openFlow.

note- Incidentally, there is a quick start guide wherein in a virtual box, you import a ovsk format virtual disk which has pre-installed mininet with pox. For more information, look here

Historically, the Data Plane and Control Plane are usually implemented within a single physical device, whether it be a router/ switch or any other layer 3 device. This meant that all network devices literally came in a box, with all their functionality built on a single physical machine. If someone had to change/tweak some functionality, he had to manually configure the specific machine.
However, in the modern era, it would certainly be advantageous if we could implement a little something called separation of concerns. This is achieved by physically separating the Data and Control planes.
This leads to a situation where the control plane controls several devices via a controller, and each device just implements the data plane.

This thinking is exactly what gave rise to the concept of Software Defined Networking, a move to design networks which are dynamic, flexible, more secure and can be controlled centrally. Notionally although this is a fairly new term, in essence its principal have already been applied via various previous protocols.
Even now many people are not sure exactly what SDN entails , including yours truly, but it understanding its functionality separately and combining the pieces bottom up would surely give us a decent idea.

a introductory link, which illucidate the manner in which SDN are used, as well as its underlying benifits is this.

However using a single controller is unfeasible with large number of hosts. Hence we come up with multiple controllers, each connected to separate set of hosts, but behaving as one big logical controller.

Saturday, February 1, 2014

Algorithm tricks 1

TWO POINTER ALGORITHM

used to find elements of a array which sum up to some desired value.

Suppose you have 2 arrays A and B and you need to find if there exists some i and j in A and B respectively such that A[i] + B[j] = val, where val is some value given to you.

We make use of 2 pointers p1 and p2. Firstly, sort the 2 arrays using a suitable sort algorithm. This takes O(n log n) time each.

Initially p1 points at index 0 of array A and pointer p2 points at index n-1 of array B. Now we run the following inside a while loop

1.calculate A[p1]+B[p2]

2.If this value is greater than val then we do p2--.

3.If this value is smaller than val then we do p1++.

4.Else if this value is equal to val then p1 and p2 point to the elements of the two arrays whose sum is equal to val and we can break the loop.

5.Exit condition of the loop is specified as when either p1 reaches end of array A or p2 reaches start of array B. If either of the above happens then we have exhausted our search space.

note that we have put the following restriction on p1 and p2. p1 can only move down an array , thus we can only increment it. p2 can only move up an array, i.e we can only decrement its value.

The above technique can be applied to the spoj problem - love story

Thursday, January 30, 2014

Firefly Algorithm FA

Firefly Algorithm belongs to a broader set of nature inspired algorithms, namely evolutionary algorithms. In this case, the behavior of fireflies is our source of inspiration. This particular algorithm is credited to Xin she yang, who has also worked with various other evolutionary algorithms.

FA is a meta-heuristic algorithm, and although it doesn't give us the optimal solution, it does give a solution which is close enough approximation to the optimal solution.

It is suitable for those situations where the information is unkown or is only partially known. Even though it is still a relatively new algorithm, it is already showing to be a favorable choice to handle NP hard problems

The basic principle on which this algorithm runs is that all fireflies glow, and that the glow of a firefly is a measure of its attractiveness. Thus a firefly which perceives a more brightly glowing firefly nearby will be attracted towards it.

To get started on the algorithm, we first need to go through some terminology

1 A Population set X, where each member is characterized by its position.
2 An Objective Function f(x) to measure the glow at position x_{i .}
The concepts involved are :

1. Attractiveness
Let us take a firefly at location i and examine how it perceives a firefly at location j. Now the attractiveness(brightness of light) emitted by firefly i at its own location is obtained by the value of objective function at i, that is I_i=f(i).
Also the firefly j also emits light, whose value at location j is given by I_j=f(j). Now its perceived brightness at location i is somewhat dampened by the absorbing nature of the medium in which the light travels (gamma). The exact formula is ,
Io* e^(-gamma*d²)[ note - a faster variant would be to use the formuls  Io/(1 +gamma*d²) ]
where,
I_o= I_j
d = Euclidean distance between i and j
Now if this perceived brightness is more than I_i that means that for i, j appears more attractive and hence it should move towards it.

2. Movement
When we decide that firefly j is brighter than firefly i, then we move firefly i. The formula for calculating the new position of firefly i is given by
insert formula
where
the first term is the initial position
the second term is the direct move towards the brighter firefly j
the third move is the random step in any direction.

The pseudo code of basic FA is :

pop=random_populate(n); // initialize a random population X of size n and store in pop
f(x);   // define a objective function f (x)

while(t<max_iterations) //loop counter to keep an upper limit on number of iterations
for i = 1....n
for j = 1...n   // double loop for each pair of fireflies
if ( effect of light source at location j i.e f (pop.j) at location i > f (pop.i) ) // if firefly j is brighter than firefly i
  move(pop.i); // move toward firefly j

end for j;
end for i;
t++;
//rank the fireflies according to the objective function and find the current best
end while;

keeping value of gamma as 0 means that the medium does not absorb any light, and that intensity due to firefly x at location i is same as that observed at j. for any i and j.Similarly keeping value of gamma as a very large number means that the medium is very dense. In that case the FA reduces to a random search in the search space.

keep value of alpha as a small fraction of the scale of dimension values we are dealing with(for a particular dimension), ideally 0.1 of the scale of the dimension.This helps in controlling the randomness a little while also not totally neglecting its contribution to the movement of the fireflies.

In practice it has been found out that firefly algorithm is generally superior to Genetic Algorithms(GA) or Particle Swarm Optimization algorithm(PSO).In fact, FA can find the global optima as well as all the local optima simultaneously in a very effective manner.Hence it is a multi-modal optimization algorithm.

It is possible to improve the solution quality by reducing the randomness gradually(in the movement step). We could vary the randomization parameter(alpha) so that it decreases gradually as the optima are approaching(as the number of iterations increases).

It is still a young algorithm, and more and more areas of its application are still coming up.For example, it can be used to tune parameters of a Neural Network, SVM and so forth. It is developing to be an exciting field of research.

Wednesday, November 6, 2013

Face Recognition -post 1

Introduction

Face Recognition, or the ability to recognize faces, is an active field of research in computer vision.
Although 3D Face Recognition is the hot topic in this field, we will first illustrate 2D template matching based Face Recognition. Concurrently major research is going on in two aspects of 2D face recognition
1. The feature extraction method which extracts information from faces.
2. The machine learning mode which recognizes faces.

However, the problem is that to build any viable application, there are other aspects which must be dealt with. We will be addressing various challenges faced by application builders and provide feasible solutions for the same.

Firstly, Pre-processing forms the crux of any real -world face-Rec application. Input maybe taken from varying sources, the lighting may be inadequate,or the image may be too bright. Moreover, the face may be rotated. For a thorough overview as well as techniques of preprocessing, see here.

After preprocessing the face image, we need to extract features from it. Broadly, this done via Geomteric feature extraction algorithms and template matching algorithms. For a wider coverage on this topic, refer here.

These set of posts are aimed at building an amateur face recognition software. The implementation involves taking the help of Neural Networks and Gabor Wavelets. Gabor Wavelets are a means of extracting the features from the facial image.That is, from the input face photo it takes out the relevant featuers which we pass on to the neural networks .Neural Network is the classifier used, that is, the real brain of the software.

So the project can be divided into 2 sub parts. Firstly, fitting a mathematical model like Kernel Filter (where filter matrix is generated by Gabor wavelets) to the input image. Secondly, using a machine learning algorithm like neural networks to train the system and later use it for testing purposes.

Intelligence Achieved

Perception : The software processes an image containing a face, and recognizes it by matching it to an image in its database. It identifies faces in a closed universe,that is, it will always output a valid result (it will never display “match not found” ), the closest match it can find. It will be stable to partial occlusions due to shadows, hats, scarves, beards, etc. Our software is also robust against homogenous illumination.

Learning : The software will be trained using a existing data-set. It can at any time be configured to run in training mode where it further enhances its learning ability by improving its prediction parameters(weights of neural network). It will used this gained knowledge base to recognize a face.

Knowledge Representation

In the Neural Networks, knowledge is represented through the weights at the junctions between neurons in the neural network. The weights are altered by training the network or adding knowledge to the knowledge-base using gradient descent and other advanced algorithms such as backpropagation of errors.

Tuesday, October 8, 2013

Google Caja Post-1

Very recently I found out that javascript has a really bad design from the point of view of security.This is shocking considering how widespread its use is. Although it kind of makes sense that while designing the language , ease of use was given more weightage over making it airtight security wise.

There have been many attempts to make javascript secure, but the ones I am most familiar with are namely 2 approaches.

The first approach involves Google Caja , wherein the code is processed on a server prior to being sent to the client side.This processing includes breaking down the code into parse trees , removing unsafe parts, and then recombining the end code back to get a safe subset of javascript.

The second approach uses Aspect Oriented Programming, which is a programming paradigm which deals with including functions which run at specific point cuts within the program to ensure only safe methods from executing(can be implemented using libraries like 'jquery-aop' library, ..etc)

disclaimer:As this is a new subject even for me, the articles on this topic do not follow any structure and are mostly random. Making sense out of them might not be that easy. I will try my best to make them more cohesive by editing from time to time.

Overview of Google Caja
Starting to get the gist of what Google Caja is. As far as I can comprehend it's used to prevent javascript attacks. Google used it in Orkut. Facebook and microsoft have their own versions "fbjs" and "Microsoft sandbox".fbjs and microsoft sandbox have their basis in blacklisting approach,as far as I know.Throughout this post and the next, we will be saying host code for the code in the page already, that is the main site page which has to actually embed the 3rd party code in itself. The guest code is the javascript code of the 3rd party which has to be embedded safely in the host page so as to prevent javasript attacks.

uriPoliciy
Think of uriPolicy as an object which contains all the host specified policies which needs to be adheared by the guest code.The host creates these policies inside this object and then passes it as a parameter while loading the guest code.As far as I know, uriPolicy specifically is used to restrict access of the guest code to the network.

Google Caja has certain API's through which we specify the boundaries in which the guest code can execute. For example, we can create a callback function and store it inside the uriPolicy object which is called every time the guest code tries to access the network, like when the guest code tries to specify the src of an image in the < img src = " image source comes here " > tag. The host defined callback funtion can check whether the uri of image adhears to said defined policies.This is a sort of whitelisting approach.

A little background here, URI stands for Uniform Resource Identifier. It is the part you type in your browser's search box, like " http:/random-site/page5/ " . Note the use of URI instead of URL.In essence , URL is just one type of URI,the other being URN.I might be wrong about this, just look it up with another online source.

Taming
Moving on,let's throw some light on how the host's functions and variables are provided to the guest code.This feat is achieved through invoking caja.tame() on any object or function in the host code.

ex.
var t : caja.tame(f);
now, after executing this line of code, the guest code can access the variable f present in the original host code throught the name t.

Taming can be applied to records, arrays , variables, and also functions . Yes, the guest code can actually use the host's functions, provided they have been tamed.Note here that whatever kind of entity be tamed, the guest code calls it by the name specified while naming. like in the above example,for the guest code, variable f is represented by name t actually.

On an ending note,lets see where the api's that we are using come from and how can we access them.

To be able to use the API's of google caja, we first have to introduce this piece of code in our host page

<script type="text/javascript" src="www.caja.appspot.com/caja.js">
</script>

what this does is that it loads the caja.js file which contains the implementations of all the API's and hooks it to our page.So we can freely use all the functions defined inside the caja.js file.

---------------------------------------------END---------------------------------------------------------