Overview: The Project Repository & Metadata

Key Info:

  • Metadata is stored in the Project Repository, a Database in SQL Server.
  • Logical objects in Discovery Hub are UI representations of databases, tables, and fields in SQL Server.
  • When viewing tables outside of Discovery Hub, the "Valid" table (the table without the post fix) is the table populated with valid "cleansed" data.

A Metadata driven model

Discovery Hub does not actually house any databases, tables, fields, or rows of data, it simply manages all of the metadata about these objects. All of the databases, tables, and fields that you see in Discovery Hub are simply UI representations of the objects in SQL Server.

  • When you Save - metadata is written to the project repository database.
  • When you Deploy & Execute - tables and data are created and stored in the SQL Database (We will review Deployment & Execution more in the next section).

This design allows Discovery Hub to handle things like project import/export, version control, data lineage, documentation, & intelligent executions.

Project Repository

The Project repository settings are located under Tools > Options. This is where you define the location of your project repository database.

One project repository can contain multiple projects. Bringing you the flexibility to build out multiple data warehouse solutions with the convenience of having all of the metadata for these solutions stored in one location.

Database Object

The database object is the parent object within Discovery Hub that holds all the tables and fields in a specific Database.

This object should be named after the function of the object, Operational Data Exchange, Data Staging Area, Modern Data Warehouse Etc. However, it is much easier to read and manage this section if you abbreviate or use acronyms such as ODX, DSA, or MDW etc. You can rename a Database object at any time by right clicking the object and clicking Edit.

SQL Database

As opposed the database object, the SQL database is the underlying structure that will be housing the physical tables, fields, and data.

When naming the SQL database it is important to consider that this is the name that will appear in Management studio or other applications when users are selecting the database. It can be helpful to abbreviate or use acronyms, use _ (underscore) instead of space, include the name of the project, and include the name of the Database Object. For example, for a Modern Data Warehouse for a project Named Adventure Works, the name of the SQL Database would be "AW_MDW"

Renaming a SQL Database

SQL Databases cannot be renamed from within Discovery Hub. You can, however, rename a SQL database using SQL Server Management Studio. Once connected to your server in Management studio open the databases node, right click on the database you wish to rename and click rename.

Table Object vs SQL Tables, Views, and Procedures

A table object inside of Discovery Hub actually represents multiple objects in SQL and are viewed differently in SQL Server Management Studio.

Here is a table as it would be seen in Discovery Hub:
Here is the same table represented in SQL Server Management Studio:

We will review the purpose and function of these additional tables in the Advanced portion of this training. For now, just know that the table you will want to use is the "Valid" Table, the table without the post fix.