Friday, February 24, 2023

CST363 Week 8

Briefly describe the what you consider to be the three (3) most important things about this course.

SQL: Structured Query Language is an extremely important part of interfacing with a relational database system. By using SQL and employing the various intricacies of its syntax, (such as SELECT, JOIN, WITH, HAVING, etc.), one can execute complex queries on a database efficiently and safely. Personally, my first few days learning SQL took some effort. However, after using MongoDB's map() and reduce() functions, I can say I much prefer writing the equivalent SQL statements.

ERD: Entity relationship diagrams help visualize and design the schema of a relational database. Tables, comprised of their column names, are usually depicted as connected by arrows, commonly drawn in crows-foot notation. These connections visualize the primary and foreign keys of the tables, and how they relate the entities of the database to each other. Without entity relationship diagrams, designing and comprehending a relational database would be significantly more difficult.

MongoDB: A NoSQL database implementation. Unlike relational databases, NoSQL systems have flexible schemas. This flexibility allows NoSQL systems to represent non-uniform data, which is critical in storing many common dataset types. Additionally, NoSQL systems tend to be optimized for horizontal scaling, which, in most cases, allows them to handle greater amounts of data than a comparable relational system. SQL and NoSQL databases each have their own respective benefits and drawbacks; projects need to individually determine which system best suits their end goals.

Tuesday, February 21, 2023

CST363 Week 7

In your own words what is a data warehouse and why is it important to have another copy of the data in a data warehouse?

A data warehouse is a database that copies and aggregates data from one or more operational databases to facilitate analytical operations. For example, an international electronics company with many regional branches might aggregate its local sales figures into one data warehouse to gain insights into the global operation.

It is important to maintain a separate copy of the source data for many reasons — some examples include:

  • Preventing analytical operations from impacting the function of the operational database(s).
  • Allowing the data to be transformed during the ETL process to further facilitate the desired analysis.
  • Storing a primarily read-only dataset to keep a historical archive of important records.