Building an ETL Pipeline with Open Source Tools

What Is ETL

ETL stands for Extract, Transform, Load. Extraction is the process by which data from many sources and formats is collected. The data is then processed to allow for ease of storing and future processing. This can include data cleaning, or format normalization into file structures such as JSON. From here the data can then be persisted for storage and access by interested stakeholders.

Continue reading