In Part 1, I describe what Data Virtualisation is and how one product, namely JBoss Data Virtualisation, works and its architecture. In this instalment, I am going to describe the implementation of a virtual database using 3 data sources. I am also going to show you how to interact with the virtual database using the SquirrelSQL client. The intention of this article is to give you some ideas as to how easy it is to aggregate several data sources to construct a virtual database. It explains the implementation on a high level only and does not give a step-by-step instruction of how to do it from scratch.
In many organisations, the information regarding a customer and his net worth is not stored in a central place. Instead, one has to go get information in several places to get a complete picture. In this demo, the data can be obtained from 3 data sources:
- Account Information including shares ownership (stored in a MySQL database)
- Market Data (stock price retrieved using a REST Web Service)
- Other Personal Holdings (stored in an XML file which we refer to as a Holdings XML earlier)
To grasp the concept on data virtualisation, please refer to my blog:
The MySQL database has been set up and populated with data and the web service has been deployed in EAP running JDV.
Applying the JDV best practice, for each data source, we create a source model as well as a view model. Additional virtual layers are created using the view models and not directly built on the source models. The source model and view model for each data source are created using the import wizard and Teiid Meta Models wizard respectively.
The MyPriceTable is composed of data from 2 views: stockPriceTable from priceTable view (from a web service) and the PRODUCT from MySQLView.
The virtual Holdings table in OtherHodingsView is created from the Holding table from the HoldingsView. There is no transformation. However, note that the Materialized property in the Properties tab has been set to true. Also notice that in the Transformation Editor, a cache hint has been added: /*+ cache(ttl:10000) */. It specifies that the cache (materialized view) has a time-to-live (ttl) duration of 10000 milliseconds or 10 seconds. This means that after 10 seconds if a query on this table takes place, it will trigger the reloading of the materialized view. Remember the OtherHoldings XML described earlier? This file is updated in the target directory by Fuse when a new document is sent by its Partner. Since the Holdings table will be refreshed every 10 seconds, this means the data in this table will be stale not more than t10 seconds (ttl value set).
Looking at the FederatedDB-1.vdb ie, the virtual database created for the demo, you will see that 2 roles have been defined: dba and guest. They are mapped to the login credentials.
For the guest data role, Column Masking has been set for the column “value” in the otherHoldingdView.Holdings table. This means that when queried by users with quest role, this column will return 8 stars as the column’s value. For dba data role, there is no such restriction.
Start Squirrel SQL client to connect to JDV FederatedVDB-1
Use the Squirrel client to show the virtual tables and the column masking that was set up for the otherHoldingdView.Holdings.value table (see diagram below)
I have given you a high level view on how to implement a business use case in which I aggregated the information from 3 data sources (RDBMS, web service and an XML file) into a virtual database to give a consolidated view of a customer. I also showed how to interact with the virtual database by using the SquirrelSQL client and one of JDV’s security feature, namely, column masking. There are a lot of other JDV functionality that I have not shown you. If you are in need of a data integration tool to pull information from various data sources to present a consolidated view of the information, JDV may be the tool for you. Here are a couple of useful links: