Geo-replication
Geo-replication software is a network performance technology designed to improve access to portal or intranet content for users at the most remote sites of large organisations. It works by storing complete replicas of portal content on local servers and then keeping those replicas up to date using heavily compressed data updates.
Geo-replication technologies replicate the content of intranets and of portals such as Microsoft SharePoint between servers across wide area networks (WANs), so that users at remote sites can access central content at LAN speeds.
Geo-replication solutions typically combine content differencing with data compression to reduce the volume of data that has to be transmitted to keep portal content consistent across all servers. Some geo-replication technologies are able to reduce the data transmitted to keep portal content current across a global deployment by over 90%. This reduces the load that portal traffic places on networks and improves the end-user experience by making the portal more responsive.
To achieve this reduction in the size of the required data updates, geo-replication systems often use differencing engines, such as Infonic's Epsilon compression system. Such an engine compares the content of each portal server down to the byte level. Because it knows what content is already on each server, it can reproduce a change made on one server on each of the other servers in the deployment, reusing content those servers already hold. This type of differencing is intended to ensure that no content, at the byte level, is ever sent to a server twice.
What does this mean in practice? Here is a very simplified example:
1) Let's say a document called "Cat.doc" exists on all the servers in a global portal deployment (servers A, B, C and D). This document's content is the sentence "The big fat cat sat on the nice brown mat".
2) Now let's say a local user of server C creates a new document. He calls it "Dog.doc" and its content is the sentence "The big fat dog sat on the nice brown rug".
3) The differencing engine identifies that almost all of the content of the new document Dog.doc, which is only on server C, exists on servers A, B and D inside another document called Cat.doc.
4) It then identifies the elements of the content of Dog.doc that are different to the content of Cat.doc. These are the words "dog" and "rug".
5) The software then sends instructions to servers A, B and D on how to build a replica of Dog.doc from the content of Cat.doc by replacing the word "cat" with "dog" and "mat" with "rug".
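The steps above can be sketched in a few lines of Python. This is only an illustration of word-level differencing using the standard library's difflib module, not the algorithm used by Epsilon or any other product. The function names make_delta and apply_delta, and the "copy"/"insert" instruction format, are hypothetical and chosen for this example.

```python
from difflib import SequenceMatcher


def make_delta(old_words, new_words):
    """Build instructions that rebuild new_words from old_words.

    Hypothetical instruction format for this sketch:
      ("copy", start, end)  -> reuse old_words[start:end] on the receiver
      ("insert", [words])   -> new words that must actually be transmitted
    """
    ops = []
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("insert", new_words[j1:j2]))
        # "delete" needs no instruction: those old words are just not copied
    return ops


def apply_delta(old_words, ops):
    """Run on the receiving server: rebuild the new document locally."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(old_words[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


# Cat.doc is already on servers A, B and D; Dog.doc is new on server C.
cat = "The big fat cat sat on the nice brown mat".split()
dog = "The big fat dog sat on the nice brown rug".split()

delta = make_delta(cat, dog)
rebuilt = " ".join(apply_delta(cat, delta))
sent_words = [w for op in delta if op[0] == "insert" for w in op[1]]

print(rebuilt)
print(sent_words)
```

In this sketch only the words collected in sent_words travel across the network as content; everything else is expressed as references to text the receiving server already holds.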
Compression systems like Epsilon perform the task above, but at the byte level, meaning that no byte pattern is ever sent to a server on the network twice.
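A rough feel for byte-level updates built from content the receiver already holds can be had with Python's zlib module, which accepts a preset dictionary (the zdict argument). The sketch below uses the bytes of Cat.doc as that dictionary when compressing Dog.doc. It is a stand-in technique for illustration only, not a description of how Epsilon works, and the helper names encode_update and decode_update are assumptions made for this example. Note also that deflate can only refer back over a 32 KiB window, so this approach does not scale to whole portals as it stands.

```python
import zlib


def encode_update(new_bytes, known_bytes):
    """Sender side: compress new_bytes against content the receiver has.

    known_bytes acts as a preset dictionary, so byte runs that already
    exist on the receiver are encoded as short back-references.
    """
    comp = zlib.compressobj(level=9, zdict=known_bytes)
    return comp.compress(new_bytes) + comp.flush()


def decode_update(update, known_bytes):
    """Receiver side: rebuild the new content using the same dictionary."""
    decomp = zlib.decompressobj(zdict=known_bytes)
    return decomp.decompress(update) + decomp.flush()


cat_doc = b"The big fat cat sat on the nice brown mat"
dog_doc = b"The big fat dog sat on the nice brown rug"

update = encode_update(dog_doc, cat_doc)   # differencing plus compression
plain = zlib.compress(dog_doc, 9)          # compression alone, for comparison
restored = decode_update(update, cat_doc)

print(len(dog_doc), len(plain), len(update))
print(restored == dog_doc)
```

The three printed sizes compare the raw document, ordinary compression, and compression against content the receiver already has. The gap between the last two shows why geo-replication systems pair differencing with compression rather than relying on compression alone.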
Geo-replication systems can often also employ a further technology called Content Virtualisation, which enables them to create replicas of server-based portal content on devices, such as laptops, that do not have the storage capacity to hold a genuine cache of the server content. Content Virtualisation gives mobile users access to a full replica of their business portal on a standard laptop.