- Liquid Clustering 101: What Every Databricks Developer Should Know
In the ever-evolving world of data management, Databricks has unveiled a game-changer: Liquid Clustering for Delta Lake. Imagine a dynamic data layout approach that not only simplifies your data decisions but also supercharges your query performance. Dive into this article to unlock the secrets of Liquid Clustering, a feature that promises to redefine how we think about data layout in Delta Lake. Whether you’re a data enthusiast or a seasoned professional, get ready to embark on a journey of discovery and innovation. Let’s dive deep into the world of Databricks Liquid Clustering and explore its transformative potential!
In the ever-evolving world of data, organizations are constantly faced with the challenge of selecting the optimal format for their data lakehouses. With a plethora of options available, such as the Linux Foundation Delta Lake, Apache Iceberg, and Apache Hudi, the decision-making process can be overwhelming. Enter Delta UniForm, a game-changer in the realm of data interoperability. In this blog, we’ll delve deep into the world of Delta UniForm and its transformative impact on the data ecosystem.
Imagine having a single place where all types of data, from numbers to social media posts, can be stored and understood. Traditional methods like data warehouses are good with numbers but struggle with other types of data. Data lakes can hold everything but can get messy. Databricks saw this gap and introduced Lakehouse, combining the best of both.
Databricks Lakehouse Apps go even further. They’re like smart tools that help different teams – from tech experts to business folks – work with data easily. Data engineers can organize data from different places, data scientists can find patterns and make predictions, and business teams can create graphs and charts to see what’s happening.
In the vast realm of cloud computing, Azure Synapse stands tall as a game-changer, offering unmatched analytics service capabilities. But with great power comes great responsibility, especially when it comes to networking. To some, navigating Azure Synapse Networking might feel like deciphering a complex maze. But here’s the good news! This blog is your compass, designed to demystify the enigmas, streamline the complexities, and illuminate the core essence of Azure Synapse Networking. Whether you’re an Azure aficionado or a newcomer exploring cloud analytics, this guide is your ticket to a transformative experience.
In this comprehensive guide, we will walk you through the entire process of creating a Python Wheel file (Python Packages) using PyCharm. But we won’t stop there; we’ll also show you how to deploy this Wheel file to a Databricks Cluster Library. Finally, you’ll learn how to call a function from this package within a Databricks Notebook.
This article picks up where the previous one left off, titled “Exploring Apache Spark 3.4 Features for Databricks Runtime.” In the earlier article, I discussed 8 features. Now, in this article, we’ll delve into additional prominent features that offer significant value to developers aiming for optimized outcomes.
Navigating complex data workflows can be tough, with uncertainties at every turn. Ensuring data accuracy, finding performance issues, and keeping pipelines reliable are demanding tasks. Without strong monitoring and alerting tools, these problems can turn into time-consuming hurdles. Databricks understands these difficulties and provides developers with tools to spot issues early, enhance performance, and keep data journeys on track.
In the dynamic landscape of big data and analytics, staying at the forefront of technology is essential for organizations aiming to harness the full potential of their data-driven initiatives. Apache Spark, the powerful open-source data processing and analytics framework, continues to evolve with each new release, bringing enhancements and innovations that drive the capabilities of data professionals further.
Step into the future of data management with the revolutionary Lakehouse Federation. Envision a world where data lakes and data warehouses merge, creating a formidable powerhouse for data handling. In today’s digital age, where data pours in from every corner, relying on traditional methods can leave you in the lurch. Enter Lakehouse Federation, a game-changer that harnesses the best of both worlds, ensuring swift insights, seamless data integration, and accelerated decision-making.
Dive into this article to unravel the magic behind Lakehouse Federation. Discover its unmatched advantages, journey through real-world applications, and master the art of leveraging it. By the time you reach the end, you’ll be equipped with the knowledge to transform your data strategies and set the stage for unparalleled success.
Exciting news! The Databricks CLI has undergone a remarkable transformation. It now covers all Databricks REST API operations and supports every Databricks authentication type. The best part? Windows users can join in too and install the new CLI alongside macOS and Linux users, who can install it with Homebrew.
Are you tired of dealing with complex code and confusing commands when working with Apache Spark? Well, get ready to say goodbye to all that hassle! The English SDK for Spark is here to save the day.
With the English SDK, you don’t need to be a coding expert anymore. Say farewell to the technical jargon and endless configurations. Instead, use simple English instructions to communicate with Apache Spark.
Imagine a world where your data is always ready for analysis. Running complex queries from scratch every time consumes a significant amount of time. Materialized views offer a solution: the results of complex queries are stored in an optimized format, so there’s no need to wait, and you get fast, efficient data handling. Would you like to uncover the power of materialized views in the world of data analysis?
With Databricks Unity Catalog’s volumes feature, managing data has become a breeze. Regardless of the format or location, organizations can now effortlessly access and organize their data. This newfound simplicity streamlines data management, empowering them to make better-informed decisions and uncover valuable insights from their data resources.
As a beginner software developer, you may find the process of writing long and monotonous code both boring and time-consuming. You might wonder if there are any AI tools available that can alleviate these coding challenges and make your work easier. Well, the answer is yes, and that’s where GitHub Copilot comes into the picture.
Are you familiar with the power of Azure Kubernetes Service (AKS) clusters? They provide a rock-solid foundation for your applications, ensuring seamless accessibility and smooth operations. But what happens when disaster strikes? While AKS offers high availability within a Virtual Machine Scale Set, it can’t protect you from a destructive region failure. Imagine the consequences if an entire Azure region goes down, leaving your nodes and resources in the dark.
Are you struggling to find the information you need when it is buried deep within unstructured documents? It’s time to unleash the power of Azure OpenAI Service embeddings! In this blog post, we’ll show you how to harness the latest advancements in AI and cloud-based services to transform your unstructured document search.
Databricks Unity Catalog provides a powerful solution that enables teams to efficiently manage and collaborate on their data assets. By implementing best practices for utilizing Databricks Unity Catalog, organizations can unlock the full potential of their data and enhance collaboration across teams. In this article, we will explore the best practices for streamlining data management using Databricks Unity Catalog and how it can revolutionize your organization’s data-driven workflows.
Businesses are grappling with a massive influx of data and diverse technologies, making it challenging to streamline operations and extract valuable insights. Recognizing these hurdles, Microsoft has developed a groundbreaking solution: Microsoft Fabric.
Organizations are constantly seeking powerful solutions to unlock the highest potential of their data assets. One such solution is Delta Lake. With its unique combination of reliability, scalability, and performance, Delta Lake has revolutionized the way data lakes are managed and utilized. In this article, we will go into the depths of Delta Lake’s best practices, exploring the strategies and techniques that can boost your data management to new heights.
The Internet of Things (IoT) has revolutionized the way we interact with the physical world, creating vast opportunities for businesses across various industries. As IoT devices continue to proliferate, the need for robust connectivity, efficient data management, and seamless integration becomes paramount.
In today’s world of endless information, we are on a mission to set data free. LangChain, in collaboration with Azure OpenAI, has the ability to comprehend and generate text that closely resembles human language. This has the potential to transform the way we analyze data. By combining these technologies, organizations gain the ability to harness data for making thoughtful decisions. Are you tired of poring over endless spreadsheets and databases in search of the information you need? Imagine being able to simply ask a chatbot a question and get instant results from your database. It sounds like science fiction, but with Azure OpenAI and Azure SQL, it’s a reality! In this session, we’ll show you how to unlock the power of conversational AI to make data more accessible and user-friendly.
In a world where artificial intelligence is reshaping industries and transforming the way we live, Azure OpenAI stands at the forefront, embodying the boundless possibilities of this futuristic technology. But what does Azure OpenAI truly represent? It is an exciting combination of Microsoft’s cloud infrastructure with OpenAI’s state-of-the-art AI models, creating a fusion of innovation and intelligence. With a single question, Azure OpenAI can delve into vast amounts of data, make predictions, understand human language, and even generate human-like text—all at an excellent scale. Join our journey in Mastering Azure OpenAI Service to uncover the secret of prompt engineering and learn the potential of Azure OpenAI services. Furthermore, as we conclude this article, we will provide you with a quick guide on how to create your own AI Service using OpenAI. This straightforward guide will assist you in setting up your personal AI Service, empowering you to delve into the world of AI with ease and confidence.
The process of developing and deploying applications is complex, time-consuming, and often error-prone. The use of release pipelines helps to streamline this process and automate the deployment of code and data. Databricks is a popular cloud-based platform used for data engineering, data science, and machine learning tasks. Azure DevOps is a powerful tool for managing the entire software development lifecycle, including build and release management. In the blog “Streamline Databricks Workflows with Azure DevOps Release Pipelines”, we will explore how to build release pipelines for Databricks using Azure DevOps. We will look at the steps required to set up a pipeline for Databricks. By the end of this post, you will have a good understanding of how to build efficient and reliable release pipelines for Databricks using Azure DevOps.
In today’s increasingly connected world, businesses of all sizes rely on cloud computing to store, process, and analyze their data. As a result, ensuring seamless connectivity between different regions and subscriptions within the cloud infrastructure is critical. One of the most effective ways to achieve this is by configuring Virtual Network Gateway (VNG) connections. However, setting up VNG connections across different regions and subscriptions can be a complex and daunting task for even the most experienced IT professionals. In the blog post “Configuring Virtual Network Gateway Connections Across Regions and Subscriptions”, we’ll learn to configure VNG connections in Azure to enhance network performance and strengthen security.
As the world continues to generate massive amounts of data, artificial intelligence (AI) is becoming increasingly important in helping businesses and organizations make sense of it all. One of the biggest challenges in AI development is the creation of large language models that can process and analyze vast amounts of text data. That’s where Databricks Dolly comes in. This new project from Databricks is set to revolutionize the way language models are developed and deployed, paving the way for more sophisticated NLP models and advancing the future of AI technology. In the article “Unlocking the Potential of AI: How Databricks Dolly is Democratizing LLMs”, we’ll dive deeper into what makes Databricks Dolly so special and explore the potential impact it could have on the future of AI.
Data is the backbone of modern businesses, and processing it efficiently is critical for success. However, as data projects grow in complexity, managing code changes and deployments becomes increasingly difficult. That’s where Continuous Integration and Continuous Delivery (CI/CD) come in. By automating the code deployment process, you can streamline your data pipelines, reduce errors, and improve efficiency. If you’re using Azure DevOps to implement CI/CD on Azure Databricks, you’re in the right place. In this blog, we’ll show you how to set up CI/CD on Azure Databricks using Azure DevOps to improve efficiency, boost collaboration and productivity, and help your team produce better results. Let’s get started!
Managing resources in the cloud can be a challenging task, especially when it comes to organizing and grouping your resources effectively. Azure tags are an essential part of Azure resource management, allowing for easy identification and grouping of resources. However, applying tags to multiple resources across different subscriptions can be daunting, especially if you’re doing it manually. If you’re looking for a scalable and configurable solution to manage resource tags across multiple Azure subscriptions, then this blog post is for you! In this article, “Scaling Resource Tagging in Azure: A Configurable Solution for Multiple Subscriptions and Tags”, we’ll introduce you to an approach that enables technical architects and developers to reliably tag resources in Microsoft’s cloud computing platform using configuration-as-code settings. This approach keeps resource tagging consistent within an organization’s Azure tenant, allowing IT administrators to define and update their tags through custom configurations that are easy to set up, audit, scale out when needed, and maintain over time without additional help.
Microsoft Azure offers a range of services and solutions for various scenarios and needs. One of the essential aspects of any cloud service is the ability to back up and restore data and applications in case of any failure, disaster, or human error. Azure provides several backup options for different types of resources, such as virtual machines, databases, files, blobs, and web apps. The article “Microsoft Azure Backup Options: Which One Fits Your Needs Best?” will give you exposure to some of the backup options available in Azure and how they can help you protect your data and applications.
Are you tired of spending hours on routine tasks in Azure? As a system administrator or developer, you know that time is precious. But what if we told you that there’s a superhero that can save you time and effort? That’s right – we’re talking about PowerShell workflow automation in Azure! With just a few lines of code, you can streamline your tasks, deploy and manage resources, and monitor performance. And the best part? We’re here to share some insider tips, tricks, and scripts that will help you unleash the power of PowerShell automation in Azure. So get ready to supercharge your workflow and say goodbye to tedious tasks!
As more and more companies turn to the cloud for their data processing needs, choosing the right platform can be a crucial decision. Two of the most popular cloud-based data platforms are Snowflake and Databricks, and understanding the differences between them can be challenging. However, by closely examining the features and advantages of each platform, you can make an informed decision about which one suits your business best. In this article, we’ll explore the key differences between Databricks and Snowflake, and help you decide which platform is right for your data processing needs.
Migrating on-premises databases to Azure SQL Database is becoming a popular option thanks to its scalability, cost savings, and flexibility. However, the migration process can be complex and time-consuming, and it can pose significant risks if not executed properly. In this article, we will provide you with everything you need to know about migrating on-premises databases to Azure SQL Database. We will discuss the benefits of migrating to Azure SQL and provide tips and tricks to help you plan and execute a successful migration and ensure a smooth transition to the cloud.
As more businesses shift their operations to the cloud, keeping a close eye on the performance and reliability of their applications becomes increasingly important. This is where monitoring and alerting come into play, and in this article, we’ll take a closer look at how they can be used in Azure to ensure that your applications and services are operating smoothly. Whether you’re an experienced Azure user or just starting out, you’ll find plenty of valuable information here on the best tools and techniques for monitoring and alerting in Azure. With these tools at your disposal, you can keep your applications running smoothly 24/7, ensuring that your business stays ahead of the game.
As the world becomes increasingly digital, more and more businesses are turning to cloud computing to store and manage their data. While this technology has a number of advantages, it can also be expensive if not managed properly. With the right cost management strategies, however, businesses can optimize their cloud environments and achieve significant cost savings.
In this article, we’ll explore some of the key cost management strategies that businesses can use to keep their cloud spending under control. From setting budgets to leveraging automation tools, we’ll provide tips and advice to help you get the most out of your cloud investment without breaking the bank. So whether you’re a small startup or a large enterprise, read on to discover how you can optimize your cloud cost management strategies and save money in the long run.
In Part 1, “Boost your Snowflake Queries: Top Strategies for Faster Results”, we discussed the concepts of Query Optimization, the Snowflake Query Processing Layer, and Query Optimization Techniques, including the Snowflake Search Optimization Service (SOS), minimizing data movement, using appropriate data types, using materialized views, using clustering keys, and using query profiling.
In this article “Optimizing Snowflake Queries: Boosting Performance”, we will continue our exploration of Snowflake Query Optimization Techniques. These techniques can further improve the performance and efficiency of Snowflake queries, making them faster and more cost-effective. Let’s dive in!
Snowflake is a cloud-based data warehousing solution that offers unlimited scale, concurrency, and performance. However, even with all of its advanced capabilities, Snowflake query performance can still be impacted by large volumes of data and complex queries. That’s where query optimization comes in. By fine-tuning queries to minimize the amount of data scanned and processed, Snowflake users can significantly improve query performance and reduce costs. In this article, we will guide you to Boost Your Snowflake Queries: Top Strategies for Faster Results.
In today’s digital world, storing data has become an essential requirement for businesses and individuals alike. With an array of options available, choosing the right storage solution can be overwhelming. Three popular storage options are Azure Blob Storage, File Storage, and Disk Storage, each with its unique features and benefits. But which one is right for you? This question can be a challenging one to answer but fear not. In this article, we’ll explore the differences between Azure Blob Storage, File Storage, and Disk Storage, helping you make an informed decision based on your storage needs. So let’s dive in and find the perfect storage solution for you.
Disasters are unavoidable. Whether it’s a natural calamity, a cyberattack, or a human error, any event that disrupts your business operations can have serious consequences. You may lose data, revenue, reputation, and customer trust. That’s why you need a disaster recovery plan that can help you restore your services and data as quickly and smoothly as possible. In this article, we will provide an Ultimate Guide to Disaster Recovery in Azure: Safeguard Your Data with Expert Tips.
Kubernetes has revolutionized the way we manage containerized applications, making it easier than ever to deploy, scale, and manage complex microservices architectures. But while Kubernetes provides a powerful platform for running applications, it can be challenging to expose those applications to the outside world. That’s where Kubernetes Ingress comes in – a powerful and flexible way to manage external access to services running in a Kubernetes cluster. With Ingress, you can define routing rules for incoming traffic, making it easy to expose your services to the outside world and enabling a wide range of use cases for cloud-native applications. Maximizing Your App Performance with Azure Kubernetes Ingress Controller is an essential guide for anyone seeking to unlock the full potential of their Azure Kubernetes deployment and achieve maximum performance, scalability, and reliability for their applications. In this article, we’ll explore Services in Kubernetes, Kubernetes Ingress and its features, Kubernetes Ingress Controller, and the various ingress controllers available.
As more and more businesses shift towards the cloud, virtual machines have become a crucial aspect of modern computing. Whether you’re running a small-scale operation or a large enterprise, the ability to fine-tune the performance of your virtual machines can be the difference between success and failure. In this blog, we’ll be sharing some valuable tips and tricks to help you maximize the performance of your Azure Virtual Machines and ensure that your workloads are running as smoothly and efficiently as possible. Whether you’re a seasoned Azure user or just getting started, these tips are sure to help you get the most out of your virtual machines and take your cloud computing to the next level. So, let’s dive in!
As many companies have moved their database and sensitive information to the cloud, it is important to have a solid understanding of how data flows in and out of your cloud environment. In Microsoft Azure, managing inbound and outbound traffic is an important aspect of ensuring optimal performance, security and cost-effectiveness.
Are you tired of sifting through a cluttered Databricks Workspace to find the notebook or cluster you need? Do you want to optimize your team’s productivity and streamline your workflow? Look no further! In this guide, we’ll share valuable Tips and Best Practices for Organizing your Databricks Workspace like a pro. Whether you’re a seasoned Databricks user or just getting started, these tips will help you keep your Workspace tidy, efficient, and easy to navigate. So let’s get started and revolutionize the way you work with Databricks!
Azure Virtual Network is a powerful tool that allows you to create and manage your own virtual network in the cloud. One of the key features of Azure Virtual Network is Subnet Delegation, which enables you to delegate control over specific subnets to Azure services. Subnet Delegation allows an Azure service to have its own permissions and access controls, making it easier to manage and secure network infrastructure. In this article, we will explain Subnet Delegation, how it works in Azure Virtual Networks, and its benefits for managing complex network infrastructures in the cloud.
Deploying applications on Azure Kubernetes Services (AKS) can be a complex process, but with the right strategies in place, it can also be highly efficient and effective. In this comprehensive guide, we will explore the ultimate deployment strategies for AKS that will help you take your applications from concept to reality. Whether you’re a beginner or an experienced user of AKS, this guide will provide you with all the information you need to optimize your deployments and achieve maximum performance in the cloud. So let’s dive into the ultimate guide to Azure Kubernetes Services deployment strategies!
The cloud computing landscape is constantly evolving, with new technologies and tools emerging all the time. One such innovation that has caught the attention of developers and IT professionals alike is Azure Kubernetes Virtual Nodes. This powerful technology promises to revolutionize the way we think about scaling and managing our cloud applications. In this article, we’ll explore what Azure Kubernetes Virtual Nodes are, how they work, and why they have the potential to transform the cloud computing industry as we know it. So buckle up and get ready to discover a game-changing tool that can take your cloud operations to the next level.
Ready to take your data processing to the next level? Look no further than our Ultimate Databricks Performance Optimization Guide! In this comprehensive guide, we’ll show you how to turbocharge your data and achieve lightning-fast processing speeds with Databricks. From optimizing your clusters to fine-tuning your queries and leveraging cutting-edge performance optimization techniques, we’ll cover everything you need to know to unlock the full potential of Databricks. Whether you’re a seasoned big data pro or just starting out, our expert tips and tricks will help you achieve peak performance and take your data processing to new heights. So buckle up and get ready for the ultimate ride through the world of Databricks performance optimization!
Are you tired of waiting for your big data processing to finish? Do you want to unlock the full potential of Databricks and take your performance from zero to hero? Look no further! In this guide, we’ll take you on a fast-paced journey through the world of Databricks performance optimization. We’ll show you how to fine-tune your queries, optimize your clusters, and leverage cutting-edge features like External shuffling to achieve lightning-fast processing speeds. With our expert tips and tricks, you’ll be well on your way to mastering Databricks performance optimization and achieving big data success in record time. Get ready to hit the fast lane and leave sluggish performance behind!
Are you tired of waiting around for your big data to process? It’s time to take matters into your own hands and optimize your Databricks performance like a pro! With the right tips and tricks, you can transform sluggish data processing into lightning-fast insights. In this guide, we’ll show you how to go from slow to go with Databricks performance optimization. Get ready to supercharge your big data processing and unlock the full potential of your business’s data-driven decisions!
Do you want to supercharge your data processing and analytics with Databricks? Are you tired of slow and inefficient Spark jobs that waste your valuable time and resources? Look no further, because, in this blog, we’ll show you how to boost your Databricks performance for maximum results! Whether you’re a data scientist, engineer, or analyst, you’ll learn practical tips and best practices to optimize your Databricks cluster, tune your Spark jobs, and leverage advanced features to accelerate your data pipeline. With the tips provided in this blog, you can take your data processing to the next level and achieve lightning-fast results that will wow your stakeholders. Let’s dive in and turbocharge your Databricks performance today!
Are you struggling with the speed and performance of your Azure Web App? Don’t let slowdowns keep you from getting the most out of your Azure web application. There are many ways to solve this issue and speed up your site. This blog will show you how to troubleshoot performance issues and identify the root causes that are slowing down your Azure Web App. The techniques discussed in this blog will help you drastically improve your Azure Web App performance. Learn these tips and apply them in your project to optimize your web app performance quickly and easily.
Are you looking to set up a CI/CD pipeline for AKS (Azure Kubernetes Service) but don’t know where to start? Look no further. In this article, we will cover the basics of setting up a CI/CD pipeline with Azure DevOps – from creating builds and releases, deploying resources, automating deployment processes with Azure Pipelines, and best practices for configuring pipelines. By the end of this guide, you’ll have everything you need to get your CI/CD pipeline up and running in no time! So let’s dive right in!
Do you want to maximize efficiency when scaling containers and applications? Horizontal autoscaling on Azure Kubernetes Service (AKS) provides a powerful, efficient way of keeping up with changing workloads. Not only is it quick and easy to set up, but it allows for near-instant responses to any changes in demand, so your application remains consistent regardless of how many users you have accessing the system. Let’s take a look at some of the key features horizontal autoscaling offers developers on AKS, as well as best practices for configuring and managing these resources.
Do you have a big data workload that needs to be managed efficiently and effectively? Are the current SQL workflows falling short? Writing robust Databricks SQL workflows is key to getting the most out of your data and ensuring maximum efficiency. Getting started with writing these powerful workflows can appear daunting, but it doesn’t have to be. This blog post will provide an introduction to leveraging the capabilities of Databricks SQL in your workflow and equip you with best practices for developing powerful Databricks SQL workflows.
Are you considering using Kubernetes to manage containerized applications in the cloud? If so, one of the key challenges you may face is ensuring that your applications can scale rapidly and efficiently to meet demand. Thankfully, with Azure’s automated scaling solution for Kubernetes cluster service—Azure Kubernetes Service Autoscaler (AKSA)—you can set up flexible autoscaling rules quickly and easily so all containers are automatically scaled up or down as needed. In this blog post, we’ll dive deeper into AKSA and explore why it’s such a powerful tool for managing workloads within an increasingly dynamic IT landscape.
Databricks Workflows is a powerful tool that enables data engineers and scientists to orchestrate the execution of complex data pipelines. It provides an easy-to-use graphical interface for creating, managing, and monitoring end-to-end workflows with minimal effort. With Databricks Workflows, users can design their own custom pipelines while taking advantage of features such as scheduling, logging, error handling, security policies, and more. In this blog, we will provide an introduction to Databricks Workflows and discuss how it can be used to create efficient data processing solutions.
As a data and AI engineer, you are tasked with ensuring that all operations run smoothly. But how do you ensure that the information stored in Azure Databricks is managed correctly? The answer lies in Unity Catalog, which provides users with a central catalog of tables, views, and files for easy retrieval. In this blog post, we’ll demystify what Unity Catalog in Azure Databricks really does and discuss best practices for using it for governance within your organization’s data and analytics environment.
Microsoft’s Azure Synapse Analytics platform is a powerful tool for storing, analyzing, and reporting on data. But as with any cloud-based service, you need to keep an eye on your costs. Fortunately, you can use Azure Automation to optimize your cost by automating certain tasks. Let’s take a closer look at how this works.
Databricks has recently created a lot of buzz in the industry. It lays a strong foundation for data engineering, AI and ML, and streaming capabilities under one umbrella. Databricks Lakehouse is essential for a large enterprise that wants to simplify its data estate without vendor lock-in. In this blog, we will learn what Databricks Lakehouse is and why it is important to understand this platform if you want to streamline your data engineering and AI workloads.
ServiceNow is an excellent tool for IT service management. But have you come across a situation where your most precious time is wasted raising ServiceNow tickets (change tickets, incidents, and service tickets)? This becomes boring and inefficient, especially when you have to go through the ordeal often because your work depends on other teams. Have you ever imagined being happier if you could offload this boring work to somebody else? Sounds familiar?
If you want to automate this monotonous stuff and become more productive, then this blog is for you.
In this blog, we will learn how to automate ServiceNow tickets with Microsoft Power Automate and Power Virtual Agents.
If you want to develop an intelligent chatbot in Azure Bot Service, then this blog is for you. In this Azure AI chatbot tutorial, we will learn how to integrate natural language processing capabilities into the chatbot. Good chatbot use cases can change the way we do business. If you are new to Azure AI, you will learn concepts like intents, utterances, and entities, and how they are used in a chatbot. We will not only learn how to develop a chatbot but also how to make it more intelligent, and we will explore how an intelligent chatbot can authenticate against Azure and execute commands remotely.
This is part two of a series of blogs on Databricks Delta Live Tables. In part one, we discussed the basic concepts and terminology related to Databricks Delta Live Tables. In this blog, we will learn how to implement Databricks Delta Live Tables in three easy steps.
In this blog, I discuss the Databricks Lakehouse platform and its architecture, the challenges involved in building data pipelines, and how Databricks Delta Live Tables solves them.
We will see how Delta Live Tables offers ease of development and treats your data as code. With Delta Live Tables, you can build reliable, maintenance-free pipelines with excellent workflow capabilities.
We will learn the different concepts and terminology used in Delta Live Tables and its unique monitoring capabilities.
In this blog, I discuss how to implement lineage, insights (reporting), and monitoring capabilities in Microsoft Purview.
First, we will understand what lineage is and why it is important. Then, we will look at Purview’s insights capabilities and how Purview provides reporting for assets, scans, glossary, classification, and sensitivity labels.
Finally, we will learn why it is important to monitor the Purview environment and how to monitor it based on best practices.
In this blog, we discuss Microsoft Purview search, glossary, and classification capabilities through three demo scenarios.
In this blog, we will learn how to register and scan ADLS Gen2 and Azure SQL Database in Microsoft Purview.
Azure Kubernetes Service (AKS) is the fastest way to use Kubernetes on Azure. AKS manages the hosted Kubernetes environment, making it easy to deploy and manage containerized applications without requiring container orchestration expertise. It also improves the agility, scalability, and availability of your containerized workloads. Azure DevOps streamlines AKS operations by providing continuous build and deployment capabilities.
In this blog, we will use Azure DevOps to deploy a containerized ASP.NET Core web application to an AKS cluster. The steps used in this blog can be used to deploy any application to AKS. The entire end-to-end demo is available in the video link provided in the blog.
When you want to develop and implement a container application in Azure, the first step is to build the images and push them into Azure Container Registry. In this article, I will explain how to achieve this.
In any large-scale implementation of AKS (Azure Kubernetes Service), we need an image repository to store container images securely. Whenever you deploy to the Kubernetes cluster, you deploy the images stored in that repository. In this article, we will learn how to integrate the Azure-based image repository, Azure Container Registry (ACR), with Azure Kubernetes Service (AKS) in the simplest possible manner.
Docker is an integral part of DevOps, and a great deal of development today relies on it. In this article, we will discuss how we can use Azure DevOps Pipelines to build and push images to Azure Container Registry.
This blog discusses Azure security design and considerations for securing access to Azure services.
I have been using Azure Data Factory to ingest files into ADLS Gen2 for processing. Lately, I have run into many challenges when using ADF for file ingestion. So let’s resolve these challenges with Databricks Auto Loader.
While designing the Azure landing zone, we need to ensure that our network is secure. The VNet protects inbound traffic flow (from users) and outbound traffic flow (to the Internet). The question is how to secure this traffic. Azure provides services such as Azure Firewall and Azure Application Gateway, and it can be confusing to know when to use one or the other. Other combinations can make the design even more complex. This article provides a definitive, scenario-based guide to which design to use, when to use it, and how.
In this blog, we will discuss how to troubleshoot user-defined routes in Azure, an issue I faced in one of my projects. Typically, to test the traffic from a specific VM, you have to log in to the VM and inspect the output of the traceroute command. This becomes cumbersome when you have many routes, because you have to log in to each VM to verify whether the routes are working correctly. Another problem is that even when a route is not working, traceroute does not show why, and if you do not know why a route is failing, you cannot fix it. To overcome this, I wrote a small script that can be used as is by changing its parameters. It displays the connectivity status (success or failure) and, if there is an issue, shows what is causing it.
Recently, I came across a requirement to design the Azure landing zone for a customer who wants to migrate their workloads from on-premises to Azure. This article explains the best practices implemented in Azure landing zone design.
In this blog post, we will learn how to automate Azure workloads with Ansible. We will do the end-to-end automation for an Azure virtual machine.
In this post, I provide a useful, configurable script to tag multiple Azure resources or resource groups.
In this blog, we will learn how to deploy a SQL Server Always On availability group on Azure Kubernetes Service.
In this blog, we will learn how to deploy a SQL Server container on Azure Kubernetes Service with high availability. We will use the persistent storage feature of Kubernetes to add resiliency to the solution. In this scenario, if the SQL Server instance fails, Kubernetes automatically re-creates it in a new pod and attaches it to the persistent volume. It also protects against node failure by re-creating the pod. If you are new to Kubernetes, we will start by understanding its basic terminology and architecture.
Suppose you have built a large environment in Azure with more than 1000 virtual machines. Now you need to provide the virtual machine details to the customer (or raise a SNOW ticket), and it is very difficult to collect each VM’s details manually from the Azure portal. Another use case is comparing VMs with each other to ensure they were all created the same way. For example, the cache setting for all the VMs should be Read/Write. You may also want to grab details of all the data disks and OS disks, including their size, name, and cache settings. This script grabs all the info in one shot and exports it into a CSV file for further manipulation. Let’s dive in.
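The consistency check described above can be sketched in a few lines of Python. This is not the script from the post: it assumes the VM and disk details have already been collected into plain dictionaries, and the sample records, field names, and the `find_cache_mismatches` helper are all hypothetical.

```python
def find_cache_mismatches(vms, expected="ReadWrite"):
    """Return (vm name, disk name, caching) for every disk whose caching differs."""
    mismatches = []
    for vm in vms:
        for disk in vm["disks"]:
            if disk["caching"] != expected:
                mismatches.append((vm["name"], disk["name"], disk["caching"]))
    return mismatches


# Hypothetical inventory; in practice these records would come from Azure.
vms = [
    {"name": "vm-app-01", "disks": [
        {"name": "vm-app-01-os", "size_gb": 128, "caching": "ReadWrite"},
        {"name": "vm-app-01-data0", "size_gb": 512, "caching": "ReadWrite"},
    ]},
    {"name": "vm-app-02", "disks": [
        {"name": "vm-app-02-os", "size_gb": 128, "caching": "ReadWrite"},
        {"name": "vm-app-02-data0", "size_gb": 512, "caching": "ReadOnly"},
    ]},
]

for vm_name, disk_name, caching in find_cache_mismatches(vms):
    print(f"{vm_name}: {disk_name} uses {caching}")
```

Keeping the check as a pure function over already-collected records means the same logic works whether the data comes from a CSV export or a live query.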
The default installation of Databricks creates its own virtual network, over which you have no control. If you want to deploy Databricks into your own private network for security reasons, this blog is for you. We will learn how to deploy Databricks into your own private VNet. Let’s dive in.
In this blog, we will learn about Azure Container Services and how to deploy SQL Server 2019 on Azure Container Services.
Azure Synapse (formerly Azure SQL Data Warehouse) is a massively parallel processing (MPP) database system. The data within each Synapse instance is spread across 60 underlying databases, referred to as “distributions”. Because the data is distributed, it needs to be organized in a way that makes querying faster and more efficient. In this blog, we will learn how to choose the right distribution strategy.
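To make the idea of distributions concrete, here is a minimal Python sketch of how a hash-distributed table could spread rows across the 60 distributions, and how a poor key choice shows up as skew. It is an illustration only: the hash function Synapse actually uses is internal, so `zlib.crc32` is a stand-in, and the function names and sample keys are hypothetical.

```python
import zlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # number of distributions mentioned above


def assign_distribution(key: str) -> int:
    """Map a distribution-key value to one of the 60 distributions.

    zlib.crc32 is a stand-in; the hash Synapse actually uses is internal.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_DISTRIBUTIONS


def rows_per_distribution(keys) -> Counter:
    """Count how many rows land in each distribution for the given key values."""
    return Counter(assign_distribution(k) for k in keys)


def skew_ratio(counts: Counter) -> float:
    """Largest distribution's row count divided by the ideal even share."""
    total = sum(counts.values())
    return max(counts.values()) / (total / NUM_DISTRIBUTIONS)


# Hypothetical data: a high-cardinality key versus a three-value key.
order_ids = [f"order-{i}" for i in range(60_000)]
regions = [("EU", "US", "APAC")[i % 3] for i in range(60_000)]

print("order_id skew:", round(skew_ratio(rows_per_distribution(order_ids)), 2))
print("region skew:", round(skew_ratio(rows_per_distribution(regions)), 2))
```

A ratio close to 1 means the rows are spread evenly. The three-value key can reach at most three distributions, so most of the 60 sit idle while a few carry all the rows.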
In this blog, we will discuss a real-world scenario of deploying software on multiple Linux and Windows virtual machines simultaneously. Suppose you have 500 virtual machines, both Windows and Linux, and you want to push software to them. Installing it manually on 500 VMs is obviously not workable. The good news is that Azure provides a custom script extension for remote command execution. Let’s learn how to use it.
In this blog, I will share a script to retrieve the resources inside an Azure subscription. The script iterates through each resource group in the subscription, retrieves each resource’s name, type, and tags, and dumps the information into a CSV file. So let’s dive in.
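The core loop of such an inventory script can be sketched in Python. This is not the script from the post: it assumes the resources have already been fetched into plain dictionaries (the sample records, field names, and the `write_inventory` helper are hypothetical) and shows only the step of walking each resource group and flattening name, type, and tags into CSV rows.

```python
import csv
import io


def write_inventory(resources_by_group, out):
    """Write one CSV row per resource: group, name, type, and flattened tags."""
    writer = csv.writer(out)
    writer.writerow(["ResourceGroup", "ResourceName", "ResourceType", "Tags"])
    rows = 0
    for group, resources in resources_by_group.items():
        for res in resources:
            # Tags are key/value pairs; join them into a single cell.
            tags = ";".join(f"{k}={v}" for k, v in sorted(res.get("tags", {}).items()))
            writer.writerow([group, res["name"], res["type"], tags])
            rows += 1
    return rows


# Hypothetical sample data standing in for the output of an Azure query.
sample = {
    "rg-data": [
        {"name": "stlake01", "type": "Microsoft.Storage/storageAccounts",
         "tags": {"env": "dev", "owner": "data-team"}},
    ],
    "rg-web": [
        {"name": "vm-web-01", "type": "Microsoft.Compute/virtualMachines", "tags": {}},
    ],
}

buffer = io.StringIO()
count = write_inventory(sample, buffer)
print(buffer.getvalue())
```

Writing to any file-like object keeps the sketch easy to try in memory; passing an opened file instead of `io.StringIO` would produce the CSV on disk.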
This blog discusses a step-by-step approach to mounting a storage account to Azure Databricks.
In this blog post, we will learn about VNet peering, hub-and-spoke architecture, and service chaining in Azure.
In this blog, we will learn how Azure manages network traffic by using system routes and user-defined routes. Let’s dive in.
In this blog, we will learn how to set up and configure Azure Load Balancer in the quickest possible way and test some of its features. We will develop an Azure CLI script to do so. I have also created a video to showcase the Azure Load Balancer functionality.
In this blog, we will go through step-by-step instructions to host Python Flask APIs on the Apache web server.
In my earlier post, I provided step-by-step instructions to host a website on an Apache web server and secure it through HTTPS. I found that the site works fine in IE and Chrome, but Mozilla Firefox throws a SEC_ERROR_UNKNOWN_ISSUER certificate error. In this blog post, we will explore the solution to this problem.
In this post, we will go through step-by-step instructions for installing the Apache web server. After the installation, we will create a certificate signing request to obtain certificates from a certificate authority and then deploy the SSL certificates on the Apache web server. We will also learn how to modify browser settings so that the certificate works when the site is accessed from outside the corporate intranet, on machines where the root certificates are not installed.
A common concern with resources provisioned in Azure is the ease with which they can be deleted. A careless administrator can accidentally erase months of work with a few wrong clicks. Azure Resource Manager locks can help here. Let’s learn how.
The Power BI service connects to on-premises data sources through the Power BI gateway, so you do not have to expose those data sources directly. The gateway can be installed on a server or VM in the on-premises environment. If you deploy the enterprise gateway in the on-premises network, however, your network team may not be happy about opening the firewall so that the gateway can connect to the Internet. The good news is that the enterprise gateway requires only certain ports to be open in order to function properly. Let’s learn how to configure the environment so that it stays secure.
In my last blog, I described the PowerShell script to register an app in Azure AD. In this blog, we will discuss the PowerShell script to assign the necessary permissions to the app.
Recently, I had to register an app in Azure AD for multiple environments. Doing this through the Azure UI felt cumbersome, so I decided to create a script for it.
Recently, I implemented the reader-writer cube scenario with Power BI reporting. In this blog, we will discuss its implementation.
In this blog, we will learn how to create an Azure Key Vault-backed secret scope in Databricks. So let’s dive in.
In this article, we will learn how to create a Databricks-backed secret scope. So let’s dive in.
In this blog, we will learn some useful Databricks CLI commands, tips, and tricks.
Databricks is a platform built on the popular open-source Apache Spark analytics and data processing engine. Azure Databricks is the fully managed version of Databricks on Azure, a premium offering that brings you an enterprise-grade, secure, cloud-based big data and machine learning platform.
Data can be ingested into Azure Databricks in a variety of ways. For real-time machine learning projects, you can ingest data through a wide range of technologies, including Kafka, Event Hubs, or IoT Hub. In addition, you can ingest batches of data using Azure Data Factory from a variety of data stores, including Azure Blob Storage, Azure Data Lake Storage, Azure Cosmos DB, or Azure SQL Data Warehouse, which can then be used in the Spark-based engine within Databricks.
In this article, we are going to connect Databricks to Azure Data Lake.
I recently found myself in a situation where I had completely forgotten my Remote Desktop Manager password. I thought about recovering it through the help desk, but I knew that would take a long time because of regulatory and compliance requirements. So I found a quick alternative: I wrote a PowerShell script to decrypt the lost password. Here are the steps: