The Azure Data Architecture Guide
The guide is structured around a basic pivot: the distinction between relational data and non-relational data.
Relational data is generally stored in a traditional RDBMS or a data warehouse. It has a pre-defined schema (“schema on write”) with a set of constraints to maintain referential integrity. Most relational databases use Structured Query Language (SQL) for querying. Solutions that use relational databases include online transaction processing (OLTP) and online analytical processing (OLAP).
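To make "schema on write" and referential integrity concrete, here is a minimal sketch that uses Python's standard-library sqlite3 module as a stand-in for a relational database. The table and column names (customers, orders) and the function name are hypothetical, chosen only for illustration. The database checks each row against the declared schema and its constraints at write time, and rejects a row that refers to a customer that doesn't exist.

```python
import sqlite3


def demo_schema_on_write():
    """Return True if the database rejects a row that violates a foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    # The schema is declared up front, before any data is written.
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " total REAL NOT NULL)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Contoso')")
    # Valid: customer 1 exists.
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 25.0)")
    try:
        # Invalid: customer 99 does not exist, so the write is rejected.
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (99, 10.0)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    finally:
        conn.close()
    return rejected
```

Calling demo_schema_on_write() exercises both the valid and the invalid insert. A production RDBMS enforces the same kind of constraint, although the syntax and configuration differ by product.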
Non-relational data is any data that does not use the relational model found in a traditional RDBMS. This may include key-value data, JSON data, graph data, time series data, and other data types. The term NoSQL refers to databases that are designed to hold various types of non-relational data. However, the term is not entirely accurate, because many non-relational data stores support SQL-compatible queries. Non-relational data and NoSQL databases often come up in discussions of big data solutions. A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional database systems.
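By contrast, a non-relational store often accepts records whose structure varies from one record to the next, and leaves interpretation to the code that reads them (an approach commonly called "schema on read", a term this guide doesn't use). The following sketch models a key-value store of JSON documents with a plain Python dictionary. The documents, field names, and helper functions are hypothetical, and a real data store would add persistence, indexing, and a query language.

```python
import json

# Hypothetical documents: each record carries its own structure.
RAW_DOCUMENTS = [
    '{"id": "1", "name": "sensor-a", "readings": [20.5, 21.0]}',
    '{"id": "2", "name": "sensor-b", "location": {"city": "Seattle"}}',
]


def load_documents(raw_documents):
    """Parse JSON strings into a key-value mapping keyed by document id."""
    store = {}
    for raw in raw_documents:
        doc = json.loads(raw)
        # No schema check on write: any well-formed document is accepted.
        store[doc["id"]] = doc
    return store


def city_of(store, doc_id):
    """Interpret the structure at read time; missing fields yield None."""
    return store[doc_id].get("location", {}).get("city")
```

Here, load_documents(RAW_DOCUMENTS) accepts both documents even though they share only the id and name fields, and city_of handles the missing location field when the data is read rather than when it is written.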
Within each of these two main categories, the Data Architecture Guide contains the following sections:
- Concepts. Overview articles that introduce the main concepts you need to understand when working with this type of data.
- Scenarios. A representative set of data scenarios, including a discussion of the relevant Azure services and the appropriate architecture for the scenario.
- Technology choices. Detailed comparisons of various data technologies available on Azure, including open source options. Within each category, we describe the key selection criteria and a capability matrix, to help you choose the right technology for your scenario.
This guide is not intended to teach you data science or database theory; you can find entire books on those subjects. Instead, the goal is to help you select the right data architecture or data pipeline for your scenario, and then select the Azure services and technologies that best fit your requirements. If you already have an architecture in mind, you can skip directly to the technology choices.
Traditional RDBMS
Concepts
Scenarios
- Online analytical processing (OLAP)
- Online transaction processing (OLTP)
- Data warehousing and data marts
- ETL
Big data and NoSQL
Concepts
- Non-relational data stores
- Working with CSV and JSON files
- Big data architectures
- Advanced analytics
- Machine learning at scale
Scenarios
- Batch processing
- Real-time processing
- Free-form text search
- Interactive data exploration
- Natural language processing
- Time series solutions
Cross-cutting concerns