Wednesday, October 16, 2013

Google Caja post-2

One thing I found while researching Google Caja is that it has very bad documentation, at least for an average guy.

The way it works is that the code is first passed to the Caja server, where it is broken down into parse trees. The unsafe code is then removed and the remaining code is put back together, so that it now belongs to a safe subset of JavaScript (Caja's documentation names subsets such as Cajita and Valija; I am not sure which one applies here, so check the name).
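To make the parse, remove, and reassemble steps more concrete, here is a toy sketch. It is not Caja's real rewriter: the tree format, the list of unsafe callees, and the function names removeUnsafe and generate are all made up for illustration, and a real tool works on a full JavaScript parse tree.

```javascript
// Toy illustration only: this is NOT how Caja is actually implemented.
// Step 1 (parsing) is skipped: the "parse tree" is written out by hand.
const program = {
  type: 'Program',
  body: [
    { type: 'Call', callee: 'greet', args: ['"hello"'] },
    { type: 'Call', callee: 'eval', args: ['"alert(1)"'] },
    { type: 'Call', callee: 'render', args: ['42'] },
  ],
};

// Hypothetical list of callees the host considers unsafe.
const UNSAFE_CALLEES = new Set(['eval', 'Function']);

// Step 2: drop every node that calls an unsafe function.
function removeUnsafe(tree) {
  return {
    type: tree.type,
    body: tree.body.filter((node) => !UNSAFE_CALLEES.has(node.callee)),
  };
}

// Step 3: put the remaining nodes back together as source text.
function generate(tree) {
  return tree.body
    .map((node) => `${node.callee}(${node.args.join(', ')});`)
    .join('\n');
}

const safeSource = generate(removeUnsafe(program));
console.log(safeSource);
```

The call to eval is gone from the regenerated source, while the two harmless calls survive unchanged.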


I have been working on what a host page template should look like. So far I have come up with something like this:

<!-- host page -->
 <!-- where the caja server is -->
<html>
<head>
 <script type="text/javascript" src="https://caja.appspot.com/caja.js">
</script>
</head>

<body>
<p> CAJA HOST </p>
<div id="guest1"> </div>

<script type="text/javascript">
/* Placeholder for a host function you want to provide to the guest code. */
function function_name() {}

/* The format every URI requested by the guest should adhere to. */
var uriPolicy = { rewrite: function (uri) { if (true) { return uri; } return undefined; } };

caja.initialize({ cajaServer: 'https://caja.appspot.com/', debug: true });

/* The guest code finally runs under the div 'guest1'. */
caja.load(document.getElementById('guest1'), uriPolicy, function (frame) {
  /* Mark every function you want to provide to the guest code, then tame it
     (I don't know the exact difference between marking and taming yet). */
  caja.markFunction(function_name);
  var tamedfunction_name = caja.tame(function_name);
  frame.code('guest1.html', 'text/html').api({ function_name: tamedfunction_name }).run();
});
</script>

</body>
</html> 

Tuesday, October 8, 2013

Google Caja Post-1




Very recently I found out that JavaScript is really badly designed from a security point of view. This is shocking considering how widely it is used, although it kind of makes sense that, when the language was designed, ease of use was given more weight than making it airtight security-wise.

There have been many attempts to make JavaScript secure, but I am most familiar with two approaches.

The first approach is Google Caja, where the code is processed on a server before being sent to the client. This processing involves breaking the code down into parse trees, removing the unsafe parts, and then recombining what is left into code that belongs to a safe subset of JavaScript.

The second approach uses Aspect Oriented Programming, a programming paradigm in which extra functions (advice) are attached at specific points in the program (pointcuts), here to ensure that only safe methods get to execute. It can be implemented using libraries such as 'jquery-aop'.
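As a rough sketch of the idea, the snippet below hand-rolls a "before" advice in plain JavaScript. It does not use jquery-aop; the helper name before, the network object, and the trusted URL prefix are all assumptions made up for this example.

```javascript
// Wrap obj[methodName] so that `advice` runs before every call.
// If the advice returns false, the original method is not executed.
function before(obj, methodName, advice) {
  const original = obj[methodName];
  obj[methodName] = function (...args) {
    if (advice.apply(this, args) === false) {
      return undefined; // call blocked by the advice
    }
    return original.apply(this, args);
  };
}

// Hypothetical host object with a method we want to guard.
const network = {
  fetchImage(url) {
    return 'fetched ' + url;
  },
};

// Advice: only allow URLs that start with a made-up trusted prefix.
before(network, 'fetchImage', (url) => url.startsWith('https://trusted.example/'));

const allowed = network.fetchImage('https://trusted.example/cat.png');
const blocked = network.fetchImage('http://evil.example/x.png');
console.log(allowed, blocked);
```

The caller keeps using network.fetchImage as before; the safety check is woven in from the outside, which is the point of the aspect oriented style.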

Disclaimer: As this subject is new even to me, the articles on this topic do not follow any structure and are mostly random. Making sense of them might not be that easy. I will try my best to make them more cohesive by editing them from time to time.
                  
                                                                                                                                                                
Overview of Google Caja
I am starting to get the gist of what Google Caja is. As far as I can comprehend, it is used to prevent JavaScript attacks. Google used it in Orkut. Facebook and Microsoft have their own versions, "FBJS" and "Microsoft Web Sandbox", which are based on a blacklisting approach, as far as I know.

Throughout this post and the next, host code means the code of the main site page, which has to embed the third-party code in itself. Guest code means the third party's JavaScript code, which has to be embedded safely in the host page so as to prevent JavaScript attacks.

                                                                                                                                                               
uriPolicy
Think of uriPolicy as an object that contains all the host-specified policies the guest code has to adhere to. The host creates these policies inside this object and then passes it as a parameter when loading the guest code. As far as I know, uriPolicy is used specifically to restrict the guest code's access to the network.

Google Caja has certain APIs through which we specify the boundaries within which the guest code can execute. For example, we can create a callback function and store it inside the uriPolicy object; it is then called every time the guest code tries to access the network, such as when the guest code sets the src of an image in an <img src="image source comes here"> tag. The host-defined callback function can check whether the URI of the image adheres to the defined policies. This is a sort of whitelisting approach.
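Such a whitelisting callback could look roughly like the sketch below. It follows the shape used in the host page template from post-2 (an object with a rewrite function that returns the URI to allow it and undefined to block it). The host names in ALLOWED_HOSTS are made up, and I am assuming the URI can be turned into a string with String(uri); check the Caja documentation for what is actually passed in.

```javascript
// Hypothetical whitelist of hosts the guest may load resources from.
const ALLOWED_HOSTS = ['images.example.com', 'cdn.example.com'];

const uriPolicy = {
  // Called with the URI the guest wants to use.
  // Returning the URI allows the request; returning undefined blocks it.
  rewrite: function (uri) {
    let parsed;
    try {
      parsed = new URL(String(uri)); // assumption: uri can be stringified
    } catch (e) {
      return undefined; // not an absolute URI: reject
    }
    const isHttp = parsed.protocol === 'https:' || parsed.protocol === 'http:';
    if (isHttp && ALLOWED_HOSTS.includes(parsed.hostname)) {
      return uri;
    }
    return undefined;
  },
};

console.log(uriPolicy.rewrite('https://images.example.com/cat.png'));
console.log(uriPolicy.rewrite('https://evil.example.org/cat.png'));
```

Anything that is not explicitly on the list is rejected, including URIs that cannot be parsed at all and non-HTTP schemes.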

 A little background here: URI stands for Uniform Resource Identifier. It is what you type in your browser's address bar, like "http://random-site/page5/". Note the use of URI instead of URL. In essence, a URL is just one type of URI, another being a URN. I might be wrong about this, so look it up in another source.

                                                                                                                                                              
Taming
Moving on, let's throw some light on how the host's functions and variables are provided to the guest code. This is achieved by invoking caja.tame() on any object or function in the host code.

For example:
                         var t = caja.tame(f);
 After executing this line of code, the guest code can access f from the original host code through the name t.

Taming can be applied to records, arrays, variables, and also functions. Yes, the guest code can actually use the host's functions, provided they have been tamed. Note that whatever kind of entity is tamed, the guest code calls it by the name specified while taming. In the above example, the guest code knows the variable f by the name t.
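As a rough mental model only, the sketch below mimics handing a host function to the guest under a chosen name. It does not use Caja at all: fakeTame and runGuest are made-up stand-ins, and real taming does much more than wrap and freeze a function.

```javascript
// Conceptual stand-in for taming: NOT Caja's implementation.
function fakeTame(fn) {
  // Return a frozen wrapper so properties cannot be added to or changed on it.
  return Object.freeze(function (...args) {
    return fn(...args);
  });
}

// Host function that we want to expose.
function f(name) {
  return 'hello ' + name;
}

const t = fakeTame(f);

// The guest is modelled as a function that only receives the API object,
// so it can reach the host function through the name t and nothing else.
function runGuest(api) {
  return api.t('guest');
}

const result = runGuest(Object.freeze({ t: t }));
console.log(result);
```

The guest never sees f itself, only the wrapper that the host chose to hand over under the name t.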

On an ending note, let's see where the APIs we are using come from and how we can access them.

To be able to use the APIs of Google Caja, we first have to include this piece of code in our host page:
   
                     <script type="text/javascript" src="https://caja.appspot.com/caja.js">
                      </script>

This loads the caja.js file, which contains the implementations of all the APIs, and hooks it into our page, so we can freely use all the functions defined inside caja.js.

---------------------------------------------END---------------------------------------------------------