M320 Data Modeling


You can take the course for free at the link below.

https://university.mongodb.com/courses/M320/about

 


What You'll Learn

  • After completing this course, you should have a good understanding of how to create data models for MongoDB.
  • We will go over a few techniques, from a very simple process for simple schemas to more complex ones for large teams and large projects.

Agenda

  • Chapter 1: Introduction to Data Modeling
  • Chapter 2: Relationships
  • Chapter 3: Patterns (Part 1)
  • Chapter 4: Patterns (Part 2)
  • Chapter 5: Conclusion

Chapter 1: Introduction to Data Modeling

Lecture: Introduction to Data Modeling
Lecture: Course Prerequisites

Here are some of the terms and references for your benefit:

MongoDB Concepts and Vocabulary

Relational Database Concepts and Vocabulary

General Database Concepts and Definitions

MongoDB Compass and Atlas

Lecture: Data Modeling in MongoDB

  • usage pattern
  • how you access your data
  • which queries are critical to your application
  • ratios between reads and writes

Lecture: The Document Model in MongoDB

Lecture: Using Compass and Atlas

Download Compass Here

  • M320 Atlas Cluster Connection Information
    • Hostname : m320-96m7e.mongodb.net
    • SRV Record : ON
    • Authentication : Username / Password
    • Username : m320-student
    • Password : m320-password

Lecture: Using Compass to Analyze a Schema

  • Lab: Using Compass to Analyze a Schema

Lecture: Constraints in Data Modeling

  • Recap
    • The nature of your dataset and hardware defines the need to model your data.
    • It is important to identify those exact constraints and their impact to create a better model.
    • As your software and the technological landscape change, your model should be re-evaluated and updated accordingly.
  • To learn more about Transactions with MongoDB, please consult the MongoDB Documentation on Transactions and some videos explaining their implementation.

Lecture: The Data Modeling Methodology

  • Recap
    • Workload
      • data size, important reads and writes

    • Relationships

      • identify them, link or embed the related entities

    • Patterns
      • apply the ones for needed optimizations

Lecture: Model for Simplicity or Performance

  • Modeling for Simplicity Diagram

 

  • Modeling for Performance Diagram

  • Modeling for a Mix of Simplicity and Performance Diagram

  • Summary of Modeling Approaches

  • Recap
    • First of all, recognize the trade-off between modeling for simplicity and modeling for performance.
    • Secondly, use the methodology in a flexible fashion.
    • Finally, regardless of how much you will model, you need to start by describing the workload of your project.

Lecture: Identifying the Workload

  • Recap
    • Quantify and Qualify the queries as much as you can
    • Few CRUD operations will drive the design
  • Lab: Identify and Quantify the Workload

Chapter 2: Relationships

Lecture: Introduction to Relationships
Lecture: Relationship Types and Cardinality

  • Recap
    • one-to-one, one-to-many, many-to-many are the usual cardinalities
    • one-to-zillions is useful in the Big Data World
    • even better, use "maximum" and "most likely" values using a tuple of the form: [min, likely, max]
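
As a small sketch of the [min, likely, max] notation, assuming a hypothetical `Cardinality` helper and made-up numbers (neither comes from the course):

```python
from typing import NamedTuple


class Cardinality(NamedTuple):
    """Hypothetical helper: a relationship cardinality as [min, likely, max]."""
    minimum: int
    likely: int
    maximum: int


# Illustrative numbers only.
children_per_mother = Cardinality(minimum=0, likely=2, maximum=20)
log_lines_per_host = Cardinality(minimum=0, likely=1_000_000, maximum=1_000_000_000)


def is_zillions(c: Cardinality, threshold: int = 1_000_000) -> bool:
    """Arbitrary rule of thumb: treat a very large maximum as one-to-zillions."""
    return c.maximum >= threshold
```

Writing the three numbers down makes it easier to see whether a relationship can safely be embedded or is closer to the one-to-zillions case.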

Lecture: One-to-Many Relationship

One-to-Many: embed, in the "one" side

  • Recap 
    • There are a lot of choices: embed or reference, and choose the side between "one" and "many".
    • Duplication may occur when embedding on the many side. However, it may be OK or even preferable.
    • Prefer embedding over referencing for simplicity, or when there is a small number of referenced documents, as all related information is kept together.
    • Embed on the side of the most queried collection.
    • Prefer referencing when the associated documents are not always needed with the most often queried documents.
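
The two main options can be sketched with plain Python dictionaries standing in for documents. The collection and field names (`addresses`, `person_id`) are assumptions for illustration, not the course's schema:

```python
# Embedded: the "many" side lives inside the "one" side document.
person_embedded = {
    "_id": 1,
    "name": "Ada",
    "addresses": [
        {"city": "London", "zip": "N1"},
        {"city": "Paris", "zip": "75001"},
    ],
}

# Referenced: each "many" document points back to the "one" side.
person = {"_id": 1, "name": "Ada"}
addresses = [
    {"_id": 10, "person_id": 1, "city": "London", "zip": "N1"},
    {"_id": 11, "person_id": 1, "city": "Paris", "zip": "75001"},
]


def addresses_for(person_id, address_docs):
    """Stands in for the second query that referencing requires."""
    return [a for a in address_docs if a["person_id"] == person_id]
```

With embedding, one read returns everything. With referencing, the addresses are fetched only when they are actually needed.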

Lecture: Many-to-Many Relationship

  • Recap 
    • Ensure it is a "many-to-many" relationship that should not be simplified.
    • A "many-to-many" relationship can be replaced by two "one-to-many" relationships, but does not have to be with the document model.
    • Prefer embedding on the most queried side.
    • Prefer embedding for information that is primarily static over time and may profit from duplication.
    • Prefer referencing over embedding to avoid managing duplication.
  • Lab: Many-to-Many Relationship
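
A minimal sketch of a many-to-many relationship kept as an array of references on the most queried side, with no separate join collection. The movie and actor documents and the `actor_ids` field are hypothetical:

```python
actors = {
    1: {"_id": 1, "name": "Actor A"},
    2: {"_id": 2, "name": "Actor B"},
}

# Each movie holds an array of references to its actors.
movies = [
    {"_id": 100, "title": "Movie X", "actor_ids": [1, 2]},
    {"_id": 101, "title": "Movie Y", "actor_ids": [2]},
]


def movies_with_actor(actor_id, movie_docs):
    """Traverses the relationship from the other side by scanning the arrays."""
    return [m["title"] for m in movie_docs if actor_id in m["actor_ids"]]
```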

Lecture: One-to-One Relationship

  • Recap
    • Prefer embedding over referencing for simplicity

    • Use subdocuments to organize the fields
    • Use a reference for optimization purposes
  • Lab: One-to-One Relationship

Lecture: One-to-Zillions Relationship

  • Crow's Foot Notation Definition
  • Crow's Foot Notation and ERD
  • Recap
    • It is a particular case of the one-to-many relationship.

    • The only available representation is to reference a document on the one side of the relationship from the zillion side.

    • Pay extra attention to queries and code that handles zillions of documents
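
One way to read the last point: code that touches the "zillions" side should stream documents in batches instead of loading them all. A minimal sketch, assuming hypothetical log documents that each reference their host:

```python
from itertools import islice


def log_events(host_id, count):
    """Lazily yields 'zillions'-side documents, each referencing its host."""
    for i in range(count):
        yield {"host_id": host_id, "seq": i, "message": f"event {i}"}


def in_batches(docs, size):
    """Groups any iterable into lists of at most `size` items, lazily."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Only one batch is held in memory at a time.
batch_sizes = [len(b) for b in in_batches(log_events("host-1", 10), 4)]
```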

Chapter 3: Patterns (Part 1)

Lecture: Introduction to Patterns
Lecture: Guide to Homework Validation

Lecture: Handling Duplication, Staleness and Integrity
Lecture: Handling Duplication
Lecture: Handling Staleness

Lecture: Handling Referential Integrity

  • Recap
    • Should or could the information be duplicated or not?
      • Resolve the duplication with bulk updates.
    • What is the tolerated or acceptable staleness?
      • Resolve with updates based on change streams.
    • Which pieces of data require referential integrity?
      • Resolve or prevent inconsistencies with change streams or transactions.

Lecture: Attribute Pattern

  • Summary
    • Orthogonal Pattern to Polymorphism
    • Add organization for:
      • common characteristics
      • rare/unpredictable fields
    • Reduces number of indexes
    • Transpose keys/values as:
      • Array of sub-documents of form
  • Lecture Notes
    • With the release of the Wildcard Index functionality in MongoDB 4.2, some use cases of the Attribute Pattern can be replaced by this new index type.
    • To learn more about the Wildcard Index, please consult our documentation or watch the lessons in our MongoDB 4.2 online course, M042.
  • Lab: Apply the Attribute Pattern
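
The key/value transposition in the summary can be sketched as follows. The `k`/`v` field names follow a common convention and are an assumption here, as are the helper name and the sample movie:

```python
def to_attribute_pattern(doc, attribute_fields, target="attributes"):
    """Moves the named fields into a list of {"k": ..., "v": ...} sub-documents."""
    kept = {k: v for k, v in doc.items() if k not in attribute_fields}
    kept[target] = [{"k": k, "v": doc[k]} for k in attribute_fields if k in doc]
    return kept


movie = {
    "title": "Example Movie",
    "release_USA": "2020-01-10",
    "release_France": "2020-02-05",
}

transposed = to_attribute_pattern(
    movie, ["release_USA", "release_France"], target="releases"
)
```

After the transposition, a single index over the key and value sub-fields can serve queries on any release date, instead of one index per original field.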

Lecture: Extended Reference Pattern

 

  • Lecture Notes
    • Additional information on left outer join can be found here
    • At 1:50, we discuss a Many-to-One relationship. This is basically a One-to-Many relationship that we traverse from the Many side to the One side.
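
A minimal sketch of the idea behind the Extended Reference Pattern: alongside the reference, copy only the fields that are read together with the referencing document, so the join can often be skipped. The order and customer documents and the `extended_reference` helper are hypothetical:

```python
import copy

customer = {
    "_id": 7,
    "name": "Ada",
    "email": "ada@example.com",
    "shipping_address": {"city": "London", "zip": "N1"},
    "loyalty_points": 120,
}


def extended_reference(source, fields, id_field="_id"):
    """Builds a reference that also carries copies of a few chosen fields."""
    ref = {id_field: source[id_field]}
    for field in fields:
        # Deep copy so later changes to the source do not leak into the copy.
        ref[field] = copy.deepcopy(source[field])
    return ref


order = {
    "_id": 5001,
    "items": ["book"],
    "customer": extended_reference(customer, ["name", "shipping_address"]),
}
```

The copied fields are duplicated data, so this works best for fields that rarely change.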

Lecture: Subset Pattern

  • Lab: Apply the Subset Pattern

Chapter 4: Patterns (Part 2)

Lecture: Computed Pattern

  • Summary
    • Avoid performing similar operations
  • Lab: Apply the Computed Pattern
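
As a sketch of the Computed Pattern, the totals below are updated once per write so that reads do not have to repeat the same aggregation. The screening data and field names are made up for illustration:

```python
def record_screening(movie_summary, screening):
    """Updates the precomputed totals at write time."""
    movie_summary["screenings"] += 1
    movie_summary["total_viewers"] += screening["viewers"]
    movie_summary["total_revenue"] += screening["revenue"]
    return movie_summary


summary = {
    "title": "Example Movie",
    "screenings": 0,
    "total_viewers": 0,
    "total_revenue": 0,
}

for s in [{"viewers": 100, "revenue": 800}, {"viewers": 50, "revenue": 400}]:
    record_screening(summary, s)
```

This trade pays off when reads of the totals are far more frequent than writes.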

Lecture: Bucket Pattern

  • Summary
    • Alternative to fully embedding or fully linking a 1-to-Many relationship
    • Advanced pattern that requires a good understanding of the workload
  • Lab: Apply the Bucket Pattern
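
A minimal sketch of bucketing, assuming hypothetical sensor readings grouped into one document per sensor per hour. The bucket size is a workload decision, and one hour is an arbitrary choice here:

```python
from datetime import datetime


def bucket_by_hour(readings):
    """Groups readings into one bucket document per (sensor, hour)."""
    buckets = {}
    for r in readings:
        hour = r["ts"].replace(minute=0, second=0, microsecond=0)
        key = (r["sensor_id"], hour)
        bucket = buckets.setdefault(
            key,
            {"sensor_id": r["sensor_id"], "start": hour, "count": 0, "measurements": []},
        )
        bucket["measurements"].append({"ts": r["ts"], "value": r["value"]})
        bucket["count"] += 1
    return list(buckets.values())


readings = [
    {"sensor_id": "s1", "ts": datetime(2020, 1, 1, 10, 5), "value": 20.5},
    {"sensor_id": "s1", "ts": datetime(2020, 1, 1, 10, 35), "value": 21.0},
    {"sensor_id": "s1", "ts": datetime(2020, 1, 1, 11, 0), "value": 21.5},
]

buckets = bucket_by_hour(readings)
```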

Lecture: Schema Versioning Pattern

  • Summary
    • Avoid downtime while performing schema upgrades
  • Lab: Apply the Schema Versioning Pattern
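
One way to avoid downtime is to let documents of several versions coexist and upgrade them lazily when they are read. A sketch, assuming a hypothetical `schema_version` field and an invented v1-to-v2 change that groups phone fields under `contacts`:

```python
def upgrade_to_v2(doc):
    """Hypothetical migration from flat phone fields (v1) to a contacts list (v2)."""
    # Documents without the version field are treated as version 1.
    if doc.get("schema_version", 1) >= 2:
        return doc
    old_fields = (("home", "home_phone"), ("work", "work_phone"))
    upgraded = {
        k: v for k, v in doc.items() if k not in ("home_phone", "work_phone")
    }
    upgraded["contacts"] = [
        {"method": method, "value": doc[field]}
        for method, field in old_fields
        if field in doc
    ]
    upgraded["schema_version"] = 2
    return upgraded


v1 = {"_id": 1, "name": "Ada", "home_phone": "555-0100"}
v2 = upgrade_to_v2(v1)
```

The application can handle both shapes during the transition, and each document is rewritten whenever it is convenient.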

Lecture: Tree Patterns

  • Model Tree Structures
    • Parent References
    • Child References

  • Array of Ancestors
  • Materialized Paths

  • Recap 
    • Documents are good data structures to represent hierarchical data
    • Several different patterns to represent trees
    • Focus on the most common queries and operations to select the most effective tree pattern
  • Lecture Notes
    • Note that this pattern is different from most of the other patterns, as it offers four different variants for its implementation.
    • In the video, we use a check mark to emphasize that the pattern does support the query.
    • When we use an exclamation mark, we want to denote that it is possible to support this query, however, it may take more work in the application code, such as the use of multiple queries.
    • Documentation page on Model Tree Structures
  • Lab: Tree Patterns
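
The four variants can be compared on a small category tree. The field names and categories are assumptions for illustration. Finding all ancestors with Parent References takes one lookup per level, while Array of Ancestors and Materialized Paths store the answer directly:

```python
# The same kind of node in each of the four variants.
parent_ref = {"name": "Laptops", "parent": "Computers"}
child_refs = {"name": "Computers", "children": ["Laptops", "Desktops"]}
with_ancestors = {"name": "Laptops", "ancestors": ["Electronics", "Computers"]}
materialized = {"name": "Laptops", "path": ",Electronics,Computers,"}

nodes = {
    "Electronics": {"name": "Electronics", "parent": None},
    "Computers": {"name": "Computers", "parent": "Electronics"},
    "Laptops": {"name": "Laptops", "parent": "Computers"},
}


def ancestors_via_parent(name, by_name):
    """Walks parent references; each step stands in for one extra query."""
    chain = []
    parent = by_name[name]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = by_name[parent]["parent"]
    return list(reversed(chain))


def ancestors_from_path(path):
    """Reads the ancestors straight out of a materialized path string."""
    return [part for part in path.strip(",").split(",") if part]
```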

Lecture: Polymorphic Pattern

  • Summary
    • Basic Pattern
    • Base of many other Patterns
  • Lab: Apply the Polymorphic Pattern
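
A minimal sketch of the Polymorphic Pattern: documents of different shapes share one collection, and a type field tells the application which shape to expect. The `vehicle_type` field and the sample documents are hypothetical:

```python
vehicles = [
    {"vehicle_type": "car", "owner": "Ada", "wheels": 4},
    {"vehicle_type": "boat", "owner": "Bob", "sails": 2},
]


def describe(doc):
    """Branches on the type field to handle each document shape."""
    kind = doc["vehicle_type"]
    if kind == "car":
        return f"{doc['owner']}'s car with {doc['wheels']} wheels"
    if kind == "boat":
        return f"{doc['owner']}'s boat with {doc['sails']} sails"
    raise ValueError(f"unknown vehicle_type: {kind}")


descriptions = [describe(v) for v in vehicles]
```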

Lecture: Other Patterns

  • Approximation Pattern

 

  • Outlier Pattern

  • Recap: 
    • These are some other notable patterns
    • Approximation pattern
      • avoiding performing an operation too often
    • Outlier pattern 
      • keeping the focus on the most frequent use cases
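
The Approximation Pattern can be sketched as a counter that is written on roughly one call in `step`, adding `step` each time, so the stored value stays statistically close while most writes are skipped. The function name, the `views` field, and the injectable `rand` parameter are assumptions for illustration:

```python
import random


def record_view(page, step=10, rand=random.random):
    """Writes on roughly 1 of every `step` calls, adding `step` each time."""
    if rand() < 1 / step:
        page["views"] += step
        return True  # a write happened
    return False  # skipped, no write


page = {"_id": "home", "views": 0}
```

Passing a custom `rand` makes the behaviour easy to test. In normal use the default `random.random` is enough.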

Lecture: Summary of Patterns

  • Chapter Summary

    • Patterns are Powerful Transformations for your Schema.
    • Provide a common Language for the team.
    • More Predictable Methodology.

Chapter 5: Conclusion

Epilogue

Congratulations on completing M320: Data Modeling!

The knowledge you have acquired will help you create more robust data models and efficient queries using MongoDB.

There are many more advanced topics and additional subjects we did not explore in this course. If you want to begin learning more about these, consult the following list of resources.

Sharding

Sharding is an extremely important topic for large-scale systems that will impact your design decisions. Most systems do not reach sizes that require Sharding. However, if your system is already sharded or you are sure that your system will need to be, you should get familiar with the main concepts of Sharding. Here are some important reads on Sharding:

Query Effectiveness

We taught you to think early about your queries and to model based on your system's workload.

Once you've implemented your schema design, how do you assess the effectiveness of your queries? Consult the following resources to validate that your queries are working as expected, using the right indexes, and are not running too slowly:

Document and Schema Validation

We mentioned that although MongoDB uses a flexible schema, you can still enforce constraints on your data models. You can add many different kinds of validation, such as field type, value, and presence.

To know more about Schema Validation, please refer to the following resources:

Transactions in MongoDB

We mentioned a few times that MongoDB now supports transactions. To know more about them, please refer to the following resources:

Schema Design Patterns

To see additional information on our Schema Design Patterns, please refer to the following resources:
