NULL and Its Tantrum in SQL
NULLs are always tricky to handle. Yes, you are right! But don’t worry, I’ve got something to explain it all. In SQL, “NULL” represents the absence of a value in a database table column. It is not the same as an empty string, zero, or any other specific value. Instead, it is used to signify that […]
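To make the distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The `people` table, its columns, and its rows are hypothetical, chosen only for illustration, and other database engines may differ in the details.

```python
import sqlite3

# In-memory database; the `people` table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, nickname TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Asha", "", 0),      # empty string and zero are real values
        ("Ben", None, None),  # Python None is stored as SQL NULL
    ],
)

def names(where):
    """Return the names of rows matching the given WHERE clause."""
    rows = conn.execute(f"SELECT name FROM people WHERE {where} ORDER BY name")
    return [r[0] for r in rows]

empty_nickname = names("nickname = ''")  # matches only the empty string
equals_null = names("nickname = NULL")   # a comparison with NULL is never true
is_null = names("nickname IS NULL")      # IS NULL is the correct test
age_zero = names("age = 0")              # zero is a value, not NULL

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
total, with_age = conn.execute(
    "SELECT COUNT(*), COUNT(age) FROM people"
).fetchone()

print(empty_nickname, equals_null, is_null, age_zero, total, with_age)
```

The row with NULLs is found only by `IS NULL`; the `= NULL` comparison matches nothing, which is the behaviour that most often surprises people.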
Column Level Encryption using PySpark
If you ever get a requirement to encrypt the data in some columns (sensitive or personally identifiable information) before storing it anywhere, then you are in the right place. The step-by-step code blocks below can help you achieve this. In this demonstration, the Fernet library will be used to generate a key, which will then be used to encrypt […]
Microsoft OneLake
OneLake is a single, unified, logical data lake for the whole organization. Like OneDrive, OneLake comes automatically with every Microsoft Fabric tenant and is designed to be the single place for all your analytics data. OneLake brings customers:
- One data lake for the entire organization
- One copy of data for use with multiple analytical engines […]
Microsoft Fabric Introduction
Microsoft Fabric is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place. Why Fabric: To bridge the gap between data and intelligence, Microsoft presents Microsoft […]
Connect on-premises networks to Azure
VPN Gateway: A virtual private network (VPN) is a type of private interconnected network. VPNs use an encrypted tunnel within another network. They’re typically deployed to connect two or more trusted private networks to one another over an untrusted network (typically the public Internet). Traffic is encrypted while traveling over the untrusted network to prevent […]
Azure Databricks Basics with Spark
Overview: This blog post provides an overview of Databricks, Azure Databricks, and Apache Spark fundamentals. In this post you will learn about Databricks concepts (Workspace, Notebook, Cluster, Jobs, Scheduling, etc.), and the Spark fundamentals will cover architecture and key features. Agenda: Apache Spark Fundamentals; Azure Databricks. Pre-Requisites: Understanding of basic Azure terminology; understanding of Big data […]
Azure Data Lake Catalog (U-SQL)
The Azure Data Lake Catalog (U-SQL) is one of the ADLA components, through which U-SQL organizes data and code for sharing and reuse. The Catalog stores databases, tables, views, stored procedures, table-valued functions (TVFs), schemas, assemblies, external data sources, and all other code-related items. Basically, catalogs are useful when there is a requirement of code […]
Deploy SQL Server Integration Services (SSIS) packages to Azure using Azure Data Factory (V2)
This tutorial provides all the steps required to deploy SSIS packages, along with the relevant concepts. Azure Data Factory is a data integration service. It enables you to create data-driven workflows in the cloud. A workflow is implemented as one or more pipelines. The pipelines orchestrate and automate data movement and data transformation. Pipelines can perform the following […]