The Data Scientist Certificate is for users who want to bring Keboola into their data science practice or workflow.

It teaches data scientists the basics of Keboola Connection, how to prepare data and build a feature store, how to develop data science models, and how to deploy those models with MLflow.

Prior experience with data preparation and data science is useful for this course.

The certificate requires completion of Introduction, Best Practices, Common Components and Processors, and Data Science.

If you have already completed any of the assignments, you do not need to resubmit them to qualify for completion of this course.

Course curriculum

  • 1

    Certificate Course Introduction

    • Certificate Course Introduction

  • 2

    Introduction

    • Architecture Overview

    • Extractors

    • Storage

    • Transformations and Workspaces

    • Transformations

    • Jobs

    • Writers

    • Orchestrations

    • Additional Transformations

    • Scaffolds

    • Support

    • Trash

    • Resources

  • 3

    Best Practices

    • Introduction and Architecture Explanation

    • Storage Part 1

    • Storage Part 2

    • Development Branches

    • Snowflake Transformation Tips

    • ** Note

    • Dependencies and Phases

    • Data Science Transformations

    • Variables

    • Shared Code

    • Workspaces

    • Input & Output Mapping Incremental

    • Orchestrations

    • Assignment Overview

    • Assignment

    • Resources

  • 4

    Common Components and Processors

    • Intro and Components Overview

    • Intro - Public vs Private Components

    • Intro - Component Developers

    • Intro - Component Configurations

    • Processors

    • Processors - Processor Example

    • Processors - S3 Processors Deep Dive

    • Common Components - FTP, Email, HTTP

    • Common Components - KBC, Geocoder, Apify, Selenium

    • Common Components - Generic Extractor

    • Common Components - Text Analytics

    • Common Components - Mailgun

    • Common Components - Common Processors

    • Assignment - Assignment Overview

    • Assignment

    • Assignment - S3 Intro

    • Assignment - How to Perform a Debug Job via API

    • Resources

  • 5

    Data Science

    • Introduction - Shared Project

    • Introduction

    • Workspaces

    • Workspaces Demo - Introduction

    • Workspaces Demo - Creating a Workspace

    • Workspaces Demo - Loading Data and Connecting to a Workspace

    • Workspaces Demo - Additional Features

    • Experiments and Development - Workflow

    • Workspaces Demo - JupyterLab Tour

    • Experiments and Development - MLFlow

    • MLFlow - Running Experiments

    • MLFlow - Register a Model

    • MLFlow - Deploy and Use a Model

    • Assignment - Assignment Overview

    • Assignment

    • Presentation

  • 6

    Certificate Submission

    • Certificate Submission