Data Pipelines
A data pipeline, sometimes referred to as an ETL pipeline, is a sequence of ETL jobs that work together to transform data into a form that is consumable by one or more data products.
Generally, each ETL job in a data pipeline will extract data from one or more data sources, transform it for some particular purpose, then load it into a new data store. Subsequent ETL jobs consume data from that store, transform it in turn, and load the result into their own data stores, and so on.
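As a concrete illustration, the sketch below chains two ETL jobs in Python: the first extracts raw records, cleans them, and loads them into an intermediate store; the second consumes that store, aggregates it, and loads a report for a downstream data product. Everything here is hypothetical: the names (raw_store, etl_clean, etl_report) are made up, and plain in-memory lists stand in for real data stores. A production pipeline would typically run on a platform such as Apache Spark or Apache Kafka, linked below.

# A minimal sketch of a two-job data pipeline. Plain Python lists stand
# in for real data stores, and all names here are illustrative, not a
# real framework API.

raw_store = [
    {"user": "alice", "amount": "42.50"},
    {"user": "bob", "amount": "17.00"},
]

clean_store = []   # written by the first ETL job, read by the second
report_store = []  # final store consumed by a downstream data product


def etl_clean():
    """First job: extract raw records, normalize types, load clean records."""
    for record in raw_store:                    # extract
        cleaned = {
            "user": record["user"],
            "amount": float(record["amount"]),  # transform: string -> float
        }
        clean_store.append(cleaned)             # load


def etl_report():
    """Second job: consume the first job's store, aggregate, load a report."""
    total = sum(r["amount"] for r in clean_store)  # extract + transform
    report_store.append({"total_spend": total})    # load


if __name__ == "__main__":
    etl_clean()    # each job reads the store the previous job loaded
    etl_report()
    print(report_store)  # [{'total_spend': 59.5}]

Note how the only coupling between the two jobs is the intermediate store itself; this is what lets each stage of a pipeline be developed, scheduled, and rerun independently.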
Deeper Knowledge on Data Pipelines
Apache Kafka
A distributed event streaming platform for data pipelines and analytics
Data Products
Ways of making data available
Extract Transform Load (ETL)
Ways to extract, transform, and load data
Apache Spark
A data processing engine for batch processing, stream processing, and machine learning
Broader Topics Related to Data Pipelines
Data Products
Ways of making data available
Data Engineering
Engineering approaches to data management