Introduction to X-Road (part 1)
Anto Veldre, Analyst
This document aims to explain the essential nature of the X-Road in easy-to-understand language. It is a guide intended for those who want to acquaint themselves more thoroughly with the X-Road (e.g. those who work for an organization that will soon join the X-Road). It introduces the X-Road’s complex terminology slowly and comprehensibly, so that the reader can painlessly gain the preliminary familiarity with the X-Road needed to engage with more explicitly technical and complex manuals.
Topics to be covered:
Part 1 (below):
- Modern governance – why do we need IT to govern a State?
- What are the necessary considerations in keeping databases secure?
- Why was X-Road created?
- Numbering people as a foundational principle of the modern e-State.
- What is a “security server” and what is its purpose?
- X-Road participants and their roles: Who does what and how is the bottom line determined?
- Explaining complex terminology and the final quiz.
Before we begin...
It is important to keep in mind that because all software is created to solve some particular problem or complete some specific task, every piece of software has a purpose. When discussing a particular piece of software one cannot forget about its context and the initial task that particular software was developed to solve.
Take the example of text editors: they clearly assume some level of literacy from their users. Further, to use a text editor to prepare a cover letter for a job application, one must be familiar with certain writing conventions: what to put in the heading, what to write in the subject line, etc. Another example: even if the average person were given access to working AutoCAD software, they would most likely be unable to design a house with it, because they would not understand how to take advantage of the tools AutoCAD offers. A final example is the grandmother who cannot play her grandchildren’s computer game because she lacks critical knowledge about whom to hit and which key to use to “fire!”.
In the same way, the essence of the X-Road and the rationale behind such a complex system could remain incomprehensible to users who lack the necessary background information.
Below we will give the reader the essential knowledge necessary to make sense of the X-Road.
The State and Governance
A frequent confusion about the purpose of X-Road stems from the fact that this software is designed for State institutions and organizations to use in the course of exercising their duties, not for individuals’ home use.
For a State to function, several conditions must be met:
- Persons have to be identified (so that d'Artagnan cannot present himself as the King of Sweden). Each person should have a unique numeric identity.
- A numeric identity must be assigned to land properties, vehicles, businesses and addresses.
- Since persons, addresses and land properties are in constantly changing relationships with one another, a lot of data has to be gathered on relationships, deals and procedural interactions (e.g. who marries whom, whether rents and duties were paid, whether a sales permit was issued or military conscription served).
This sort of data is highly complex; a variety of variables must be recorded, for example persons' names and personal data, family data, addresses, business relationships, titles, property maps etc.
Without systematized data of this kind we would not have a State but a pack of hunter-gatherers or an unorganized group of countrymen.
States have historically used clay tablets, parchment, and paper in governance. The modern State is maintained via computers and databases because these technologies allow for greater speed and complexity. We are able to form ever more complex associations, and paper forms are constantly being replaced by databases and web forms. Just 30 years ago, obtaining certificates and attestation papers was a time-consuming task. Today, we print completed forms straight from the Web. It seems a little unbelievable, but in Estonia one can leave one's driver's license at home: identities and driving privileges are checked electronically. Today, physical cards for social security are a thing of the past.
To attest somebody’s last will could take years of archival research in medieval times, but today, assuming the law contains no logical doublespeak and the genealogical relationships are properly recorded in a database, a similar request can be executed within seconds.
In summary, a state without data on its citizens is impossible in principle, and today the quality of both governance and life is determined by the speed of services.
Bill of sale of a male slave and a building in Shuruppak. Sumerian tablet. Musée du Louvre, Marie-Lan Nguyen. (https://commons.wikimedia.org/wiki/File:Bill_of_sale_Louvre_AO3765.jpg)
When speaking about security, we must first turn to the issue of confidentiality, i.e. who can access the information and under what conditions. To a certain extent, threats to data integrity will also be considered, i.e. whether some actor can change the information without leaving a trail, and whether the data can be protected and kept intact (only 11 gnawed pages have survived of the oldest printed Estonian text, the Wanradt-Koell Catechism of 1535).
One of the biggest threats to the State is an uncontrolled centralization of data into a single database. Historians remember how the Great Library of Alexandria burned down. The risks inherent in centralization are also clearly illustrated by the use of imported punch card sorters to select and target certain social groups in 1930s Germany [IBM and the Holocaust, Edwin Black].
Another example of the threat posed by centralized databases manifested itself on September 26, 1996, when Estonian national television reported on the use of an illegal superdatabase created by hacker Imre Perli. Viewers were shown a search for a home address by the owner's name, the financial obligations of the Prime Minister and other private data. Three months' worth of call-record metadata from the largest mobile provider was also compromised in the hack, but it was excluded from the broadcast. [See seaduslik ebaseaduslik info…, Armin Berlin, Luup (1998) no. 21, pp. 44–47].
Why is this kind of super database dangerous? In theory, the State should possess a monopoly on its citizens' data. The existence of centralized databases makes interference with this monopoly more likely. I recall driving around in Tallinn in 1996: when somebody broke the traffic rules or parked illegally, it was possible to look up his mobile number and call him right then – certainly a very frightening possibility. The State has reserved certain rights to itself; fellow citizens must not engage in intelligence activities.
Estonia has never officially acknowledged the huge data leak of 1996, and the leak’s perpetrators have never been sued. Despite this, the hack was a learning experience, and ever since Estonia has tried to avoid centralized databases. The risk of centralized databases has long been discussed in other countries. Cambridge information security professor Ross John Anderson even formulated “Anderson’s Rule”: if a large system is designed for ease of access it becomes insecure; if made watertight it becomes impossible to use.
Another potential trap in database system design is related to data integrity. Any government that attempts to record ALL of its citizens' personal data will produce some “mismatches”, where personal data is updated in one database but not in another. As a result, duplicate records about a person will appear with different data, and it will not be possible to say which of them are accurate and which are not. Some states have tried to avoid the issue by synchronizing multiple databases, but these efforts have been frustrated by time constraints – it is not so easy to transport gigabytes across large distances. Imagine learning that your yearly credit limit is exceeded in one region but not in another. Can you spot the problem with an approach that would let this happen?
Conversely, once we require that a particular piece of data reside in only one database (let's call it “the original”), these data quality issues evaporate.
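The credit-limit puzzle can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of any real system: the region names, the amounts, the limit and the function names are invented, and plain Python dictionaries stand in for databases.

```python
# Illustrative sketch only: dictionaries stand in for databases.
LIMIT = 1000  # hypothetical yearly credit limit

# Approach A: every region keeps its own copy of the same record.
regional_copies = {"north": {"spent": 900}, "south": {"spent": 900}}


def spend_in_region(region, amount):
    """Record spending in one regional copy; the other copy is not updated."""
    regional_copies[region]["spent"] += amount


def limit_exceeded_in_region(region):
    """Each region answers from its own, possibly stale, copy."""
    return regional_copies[region]["spent"] > LIMIT


# Approach B: the figure lives in exactly one place, "the original".
original = {"spent": 900}


def spend(amount):
    """Record spending in the single authoritative record."""
    original["spent"] += amount


def limit_exceeded():
    """Everyone who asks receives the same answer."""
    return original["spent"] > LIMIT
```

After a purchase of 200 recorded in the north, approach A reports the limit as exceeded in the north but not in the south until the copies are synchronized. Approach B has only one record to consult, so no such disagreement can arise.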
An idealistic design
We should probably start by stating that knowledge is power. The more data, the greater the power. The main task of a democracy, however, is to distribute power. That leads us to the logical notion that in a democratic society, databases probably shouldn't grow too large. Each office should have its own database but, leaving entirely aside the issue of pirated copies of illegally collected data, it should not hold copies of the state’s other databases.
That notion leads us to the next issue: how do we execute complex database queries? For example, a stray pet with a microchip implant is found on the street; how do we determine its owner? To do this, we would need to send two requests: the first to the Pet Database and the second to the Population Register. This does not necessarily mean, however, that we need a trusted database specialist who has the right to enter both government databases. Having such a specialist raises the same issues as the Super Database described above. Other problems with this approach include the need for several sequential requests (slow!) and the use of the public Internet (dangerous!).
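The stray-pet lookup can be sketched as two narrow requests against two separately owned registers. This is a toy illustration and not how the X-Road itself is implemented: the registers are in-memory dictionaries, and the chip identifier, the record fields and the function names are all hypothetical.

```python
# Toy illustration: two separately owned registers, modelled as dictionaries.
# The chip ID, field names and personal data are invented for the example.
PET_REGISTER = {
    "CHIP-0001": {"species": "cat", "owner_id": "37412029381"},
}
POPULATION_REGISTER = {
    "37412029381": {"name": "Example Person", "address": "1 White Street"},
}


def pet_owner_id(chip_id):
    """Request 1: the Pet Register reveals only the owner's personal number."""
    record = PET_REGISTER.get(chip_id)
    return record["owner_id"] if record else None


def person_contact(personal_id):
    """Request 2: the Population Register returns data for a single person."""
    return POPULATION_REGISTER.get(personal_id)


def find_pet_owner(chip_id):
    """Chain the two requests; neither register holds a copy of the other."""
    owner_id = pet_owner_id(chip_id)
    if owner_id is None:
        return None
    return person_contact(owner_id)
```

Calling `find_pet_owner("CHIP-0001")` returns the contact record of the owner, while an unknown chip yields `None`. The caller never needs general access to either register, only the two narrowly defined questions.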
The technology that we today call the X-Road grew out of Estonia’s efforts to solve these thorny issues and deal with these trade-offs in the late 1990s and early 2000s. So far, there haven't been any events that have severely hindered the X-Road. This resiliency suggests that the distributed architecture works remarkably well even in critical situations (like cyberwar).
Uuno Vallner recalls that the State Information Systems Department (at the Ministry of Economy and Communications) initiated the X-Road project around 1998. The pilot was ready in 2000 and was shown at a public conference. Three databases were interconnected and their data was exchanged according to the XML-RPC standard. The main development was done by Tanel Tammet, Hanno Krosing and Vello Kadarpik, all of whom are famous in Estonian IT. One of the biggest challenges facing these developers was whether to use the proprietary technology of a corporation or to try to manage with free software.
In those years, Uuno's main interest was the creation and deployment of SGML/XML-based applications. Vello had used XML services while working in finance, and those experiences led him to the idea of introducing XML-RPC applications into the public sector. In 2000, the State Information Systems Department commissioned a thorough concept analysis from Professor Tammet (XML applications and an XML-RPC based prototype). Later, the tender document expressed a preference for SOAP, but the winner, Assert (currently Fujitsu & Actors), preferred RPC for the simple reason that SOAP technology hadn't yet reached maturity. The project was managed by Niilo Saard and Aleksander Reitsakas. One of the subcontractors for Assert was AS Cybernetica, which had a huge influence on the X-Road through its programming of the log-chaining module. Today Cybernetica has taken the lead role in the development of the X-Road. As an aside, the initial name of this technology wasn't “X-Road” but “Crossroad(s)” (Ristmik).
The X-Road was initially launched in 2000 in the cellar of the then-existing Informatics Foundation, on Toompea Hill. Early X-Road personnel included: Ahto Kalja, Riho Oks, Juhan Vene, Martin Undusk, Andres Kollist, Uuno Vallner, Katrin, Peeter. Most of these individuals were employed by the State Information Systems Department but some were on the payroll of the Informatics Foundation (Imre).
One thing that is important to understand about the X-Road is that it is not a fundamentally new invention. Estonia simply harnessed then-existing technologies and applied them in a novel way in the context of state governance. The outcome of this novel application of existing technologies was named the X-Road.
Deeper scientific research on the e-State was proposed by Arne Ansper (currently CEO of AS Cybernetica) in 2001. Excerpted below is a two-paragraph quote from the research abstract:
This far, the Estonian public administration databases have been kept isolated from each other. The data exchange between them has been slow and inefficient. Fast and reliable data communication networks between state agencies removed the major obstacle on the way of tighter integration of public administration information systems. They have created a possibility to make communication between state agencies faster, safer, and more efficient. To exploit the advantages of new technology, public administration databases should be made accessible not only to one single agency, but rather, to all authorized persons who need that information for doing their jobs more efficiently (and thereby, for improving public services in general). Such a renewed Internet-based public administration is called e-State.
In this work, we analyze the security problems that arise when the public administration databases are opened for a widespread electronic access. The analysis is grounded on the current legal situation as defined by Estonian laws. We present separate analysis for agency-to-agency and for citizen-to-public-administration data exchanges. During the analysis we draw an important conclusion that, due to substantially different scopes of risks and the countermeasures available, security solutions developed for business organizations cannot be directly adopted for using them in public administration environment. As a result of the analysis, a model for the e-State architecture is presented that, together with appropriate legal framework, allows us to achieve the main security objectives.
These are the main postulates of X-Road:
- The solution shall work over the generic commodity Internet (it was cheaper than direct lines and Estonia was more cost-sensitive at the time of the X-Road’s founding).
- Each connected institution shall be strongly identifiable via a cryptographic certificate. This requirement helps guard against the possibility that someone could enter (and then purge) the database via some airport WiFi.
- Cheap does not mean insecure – all the X-Road communications are encrypted. In contrast to a commodity VPN, which is constantly online, the X-Road puts up TLS/SSL tunnels only when data is actually being exchanged.
- Each data owner can enforce its own additional conditions. For example, one data owner might require that every person making a request under an institution's certificate be fully identified. This requirement is easy to fulfill in Estonia because of personal numbers and Estonian eID cards.
- Actions are constantly logged, and the logs are chained and have non-repudiation value. Requests are counted both per initiating person and per institution. This counting helps to predict performance bottlenecks as well as to monitor the security situation.
- No free-form requests like “find all black cats on the White Street” are allowed. All request templates are pre-fabricated. This structure makes it much easier to log requests. A logged request might look like this: On the 15th Day of September Anno Domini 2015, precisely at 13:47:12 and 85 microseconds, the Person numbered 37412029381, on behalf of the Business Register Entity no. 12345678, made a “Find-The-Owner-Of-The_pet.wsdl” (WSDL) request against the “Pet Register”.
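Two of the postulates above, pre-fabricated request templates and chained logs, can be illustrated with a minimal sketch. It is not the X-Road's actual log format or code: the template name, the record fields and the class are assumptions made for this example. Each log entry stores a SHA-256 hash over the previous entry's hash and its own content, so changing an old entry breaks every later link. Real non-repudiation would additionally require digital signatures and trusted timestamps, which are omitted here.

```python
import hashlib
import json

# Hypothetical whitelist: template name -> exact set of permitted parameters.
ALLOWED_REQUESTS = {"find_pet_owner": {"chip_id"}}

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with this record's content."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class ChainedLog:
    """Append-only log in which every entry is bound to its predecessor."""

    def __init__(self):
        self.entries = []

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        self.entries.append({
            "record": record,
            "prev_hash": prev_hash,
            "hash": _entry_hash(prev_hash, record),
        })

    def verify(self):
        """Recompute the whole chain; any altered entry makes this False."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


def submit_request(log, person_id, institution_id, template, params, timestamp):
    """Accept only whitelisted templates, then log who asked what and when."""
    if template not in ALLOWED_REQUESTS or set(params) != ALLOWED_REQUESTS[template]:
        raise ValueError("free-form or unknown request rejected")
    log.append({
        "timestamp": timestamp,
        "person": person_id,
        "institution": institution_id,
        "template": template,
        "params": params,
    })
```

A request such as `submit_request(log, "37412029381", "12345678", "find_pet_owner", {"chip_id": "CHIP-0001"}, "2015-09-15T13:47:12.000085")` is accepted and logged, whereas an unlisted template such as "find all black cats" raises an error before anything is written. Because every request follows a fixed template, each log record has a predictable structure, which is what makes counting requests per person and per institution straightforward.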
Keep reading (pointers to the next paragraphs):