As of August 2023, SQL-based data stores still dominate the data persistence landscape. Based on the DB-Engines popularity index, their share is more than 70% (db-engines.com).
From an analyst's perspective, SQL enables data professionals to efficiently extract, wrangle, and transfer data from the most popular data sources. Knowledge of SQL and Relational Database Management Systems (RDBMS) is therefore essential for every analyst working with data. Knowledge beyond the intermediate level is usually required only of database administrators and database engineers, so this course aims to take students from complete beginner to intermediate level.
The material also serves as a bootcamp for MySQL. As the most popular open-source database management system, and second only to Oracle Database in the overall rankings, MySQL is a natural choice for introducing students to the world of databases.
In the second part of the course, we dive deep into advanced data storage topics relevant to data analysts. We go beyond the tabular format and explore different shapes of data and the tools that support them. By the end of the course, we aim to build comprehensive analytical pipelines with new technologies.
By the end of the course, students should be able to:
- perform exploratory analysis on a SQL database using the main SQL commands
- install and configure MySQL at a basic level
- use MySQL Workbench
- build simple data structures using techniques such as replication, dumping, and loading from external sources
- create an analytical data layer using data warehouse architecture
- create simple ETL jobs
- understand the basics of NoSQL technologies
- gain practical experience with selected NoSQL solutions
- choose the right technical solution from among the thousands of technologies available today
- understand the tradeoffs of different data architectures
- work with different data formats and files
- work with an API as a data source
- model data structures
- build a data pipeline for analytics
Grading will be based on a total score out of 100, in line with CEU’s standard grading guidelines.
- Homework (10%)
- Project 1 (30%)
- Project 2 (30%)
- Questionnaire as Exam (30%)
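
As a rough illustration of how the weights above combine into a total out of 100, here is a minimal Python sketch. It assumes each component is itself scored on a 0-100 scale, which the grading breakdown does not state explicitly. The names `WEIGHTS` and `total_score`, as well as the example scores, are hypothetical and used for illustration only.

```python
# Weights copied from the grading breakdown above (they sum to 100%).
WEIGHTS = {
    "homework": 0.10,
    "project1": 0.30,
    "project2": 0.30,
    "exam": 0.30,
}


def total_score(scores: dict[str, float]) -> float:
    """Combine per-component scores into a weighted total out of 100.

    Assumption: every component score is on a 0-100 scale.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, made up for illustration.
example = {"homework": 90, "project1": 80, "project2": 70, "exam": 85}
print(round(total_score(example), 2))
```

The resulting total would then be mapped to a letter grade according to CEU's standard grading guidelines, which are not reproduced here.
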
- Personal laptop computer with administrative privileges to install open-source software.
- Operating system: Windows 10 or Mac OS X
- Latest Chrome browser
- Internet access
- KNIME Analytics Platform (https://www.knime.com/downloads/download-knime)
- Latest community edition of MySQL and MySQL Workbench, preinstalled (https://dev.mysql.com/downloads/)
- Make sure you remember the root password you set