Overview — Why is it required to track changes?

Tracking changes helps us know who and what changes were made thereby ensuring better accuracy in data that we preserve.

Change tracking mechanisms can be developed by developers but that would involve sufficient development changes and modifications to schema resulting in significant development efforts. It would also have significant performance overhead. Hence, SQL Server provides different in-built tracking mechanisms like -

These built-in tracking mechanisms also have an overhead associated with them, but this is a lot less compared to custom-built tracking mechanisms.

Let’s try to compare and analyze the differences between…

Database design can be a tricky and laborious project and is often associated with stringent timelines. Below are some questions that you need to have answered before starting a database design initiative. In this article, we go through each of the below questions in a detailed manner.

•What are some things you should plan while doing your database design?

•What are the steps involved in a database design project?

•Should indexing be done before loading data into the database or after loading data into the database?

•When to use Row-store Indexing/When to use Column-store Indexing?

•What are some nice features…

Storage for a database includes the hardware, disk, and file layout. Designing storage for a database project can be exhaustive job as a lot of factors need to be taken into account.

Schematic 1.0 Database Storage

The most important aspects to be considered before desigining storage for a database are -

•Volume of data — Initial volume of data, and the rate of growth of data

• Data Retention Policy — Amount of historic data that needs to be maintained

•Number of Concurrent Users — How many users will try to read/write data at the same time.

•Workload — Refers to the number of…

Some pre-requisites before starting this hands-on are -

Let’s get started.

Step 1:

Log in to SSMS and enable the ‘Include Actual Execution Plan’ option.

Step 2:

Connect to the ‘WideWorldImporters’ database.

Step 3:

Right-click each of the tables — Application.Cities and Application.StateProvinces and see if they have indexes implemented on them. If they have indexes already, you will need to make a copy of the tables.

Note that when copies of a table are done…

Cognos 10.2 is very flexible when it comes to editing the underlying SQL logic. However, a lot of us would have experience a situation wherein the Generated SQL cannot be over-written. Below are some steps to perform to over-writing Generated SQL.

Click on ‘Report Data’

SSIS has some fuzzy transformation algorithms that help bring data from multiple sources and unify them under one roof within the data warehouse. The two types of Fuzzy Transformation Algorithms are –

1. Fuzzy Grouping

2. Fuzzy Lookup

While both tasks help with data standardization, let’s see what each one specifically does.

Fuzzy Grouping

The fuzzy grouping tasks helps with data de-duplication and in identifying a canonical row. Let us see what ‘canonical’ means, with an example.

Let’s assume that you have data in an excel file. This flat file has information about cities and provinces.

Note on the color highlights: The…

As years pass by, data starts piling up. If all data is stored in the same table, the table size will multiply massively over time leading to several issues. Hence, it is important for stakeholders of an organization to have a well-developed data retention policy. Data retention policy is that wherein stakeholders determine how much data to store. E.g. HealthCare organizations are mandated to store at least 6 years of data (HIPAA regulations). Similarly, insurance organizations need to store at least 6 years of data (Federal Compliance).

Schematic 1.0 Stakeholders’ hold a brainstorming and consensus session to discuss data retention

Sliding Windows — basic premise

The process of creating a strategy to effectively manage partitions such that the…

As data loads increase, you may be required to think of ways to help the server handle the load. There are 2 ways of doing this.

1. Scaling up

2. Scaling out

Background vector created by dooder — www.freepik.com

Scale-out Solution:

‘Scaling out’ ‘means adding multiple servers and spreading data across these servers. In other words, we distribute the load to multiple servers. This is analogous to Team-work.

Below are some questions that come to mind while partitioning your data tables-

By default, when an index is created on a partitioned table, the index is partitioned using the same partition schema of the table. You could either —

i. Choose a separate Parition scheme and Partition function for your index


ii. Choose the same Parition function and Partition scheme that you chose for the table, for the index as well.


iii. Choose a similar Partition function and Partition scheme for the index.

When you…

In an earlier article on Partitioning, we attempted creating partitions in an already existing table. This article provides a hands-on exercise in implementing partitions while creating a table.

To implement a table partition on a newly created table, the following is done –

•A partition scheme is created

•A partition function is created

•The partition scheme is applied as part of the ‘Create Table’ statement.

Let’s try this -

Step 1: Create a Partition Function

Step 2: Create a Partition Scheme

Step 3: Create table by applying the partition scheme on ‘Stock up Date Key’ column

Subha Ganapathi

An avid learner and a passionate worker.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store