Data Engineering 1: SQL and Different Shapes of Data
Graduate Program (& Advanced Certificate) Status
Course Description

As of August 2023, SQL-based data stores still dominate the data persistence landscape. Based on the popularity index, their market share is more than 70% (dbengines.com).

From an analyst's perspective, SQL enables data professionals to efficiently extract, wrangle, and transfer data from the most popular data sources. Knowledge of SQL and Relational Database Management Systems (RDBMS) is therefore essential for every analyst working with data. As for the level of the course, knowledge beyond intermediate is usually required only of DB Admins and DB Engineers, so this course aims to take students from complete beginner to intermediate level.

The material presented also serves as a bootcamp for using MySQL. As the most popular open-source Database Management System, and second only to Oracle DB in the overall rankings, MySQL is a natural choice to introduce students to the world of databases.

In the second part of the course, we dive deep into advanced data storage topics relevant to Data Analysts: we go beyond the tabular format and discover different shapes of data and the tools supporting these shapes. By the end of the course, we aim to build comprehensive analytical pipelines with new technologies.

 

Learning Outcomes

By the end of the course, students should be able to:


- do exploratory analysis on a SQL database using the main SQL commands
- install and configure MySQL on a basic level
- use MySQL Workbench
- build simple data structures using techniques such as replication, dumping, or loading from external sources
- create an analytical data layer using data warehouse architecture
- create simple ETL jobs
- get to know the basics of NoSQL technologies
- acquire practical knowledge in some NoSQL solutions
- have a broad understanding of how to choose the right technical solution among the thousands of technologies available these days
- understand the tradeoffs of different data architectures
- work with different data formats and files
- work with an API as a data source
- model data structures
- build a data pipeline for analytics

Assessment

Grading will be based on a total score out of 100, in line with CEU’s standard grading guidelines.

  • Homework (10%)
  • Project 1 (30%)
  • Project 2 (30%)
  • Questionnaire as Exam (30%)
Prerequisites

Mathematics and Informatics Pre-session for Business Analytics

     TECHNICAL/LAPTOP REQUIREMENT

- Personal laptop computer with administrative privileges to install open-source software
- Operating system: Windows 10 or Mac OS X
- Latest Chrome browser
- Internet access
- KNIME Analytics Platform (https://www.knime.com/downloads/download-knime)
- The latest community edition of MySQL and MySQL Workbench, installed before the course starts (https://dev.mysql.com/downloads/)
- Make sure you remember the root password you set during the MySQL installation

File attachments
Course Syllabus (170.91 KB)
Course Level: Master’s
Academic Year: 2023-2024
Term: Fall
US Credits: 2
ECTS Credits: 4
Course Code: ECBS5146