Why do I need database?
• The primary goal of a DBMS is to provide an environment that is both convenient
and efficient for people to use in retrieving and storing information.
What is database?
• A database-management system (DBMS) consists of a collection of interrelated
data and a collection of programs to access that data. The data describe
one particular enterprise.
History of database / Purpose of database
Keeping organizational information in a file-processing system has a number of major disadvantages:
• Data redundancy and inconsistency.
• Difficulty in accessing data.
• Data isolation.
• Integrity problems.
• Atomicity problems.
• Concurrent-access anomalies.
• Security problems.
These difficulties, among others, prompted the development of database systems.
In what follows, we shall see the concepts and algorithms that enable
database systems to solve the problems with file-processing systems.
Database는 programming을 이용해 데이터의 create, read, update, delete를 자동화함으로써 데이터의 관리 측면에서도 큰 효율을 얻었다.
참고: http://avant.org/project/history-of-databases/
교재: Database System Concept 저자: Silberschatz, Korth, Sudarshan 출판사: McGraw-Hill
Review of this book:
Intended for a first course in databases at the junior or senior undergraduate, or first-year graduate, level, this book covers concepts and algorithms based on those used in commercial or experimental database systems. It aims to present these concepts and algorithms in a general setting that is not tied to one particular database system.
Other good textbooks on Database Systmes
"Database Management Systems, Third Edition”, by Ramakrishnan and Gehrke
“Database Systems: The Complete Book”, by Garcia-Molina, Ullman, and Widom
Goals of database course
Our primary focus is the design of the database with considering aspects of software design(UML)
Features of the Book
In most of this book, we use a university organization as a running example of a typical
data-processing application.
Each chapter has a list of review terms, in addition to a summary, which can help
readers review key topics covered in the chapter.
The exercises are divided into two sets: practice exercises and exercises. The
solutions for the practice exercises are publicly available on the Web site of the
book. Students are encouraged to solve the practice exercises on their own, and
later use the solutions on the Web site to check their own solutions. Solutions
to the other exercises are available only to instructors (see “Instructor’s Note,”
below, for information on how to get the solutions).
Many chapters have a tools section at the end of the chapter that provides
information on software tools related to the topic of the chapter; some of these
tools can be used for laboratory exercises. SQL DDL and sample data for the
university database and other relations used in the exercises are available on the
Web site of the book, and can be used for laboratory exercises.
Steps in designing database
The database design process can be divided into six steps. The ER model is most relevant to the first three steps.
Our primary focus is the design of the database, however,
and we will not discuss other aspects of software design(UML) in any detail.
There are many approaches to end-to-end software system design, covering all
the steps from identifying the business requirements to the final specifications
for a complete application, including workflow, user interfaces, and many aspects
of software systems that go well beyond databases and the data stored in
them. In this section, we briefly discuss an approach that is becoming popular,
called the unified modeling language (UML) approach.
UML, like the ER model, has the attractive feature that its constructs can be
drawn as diagrams. It encompasses a broader spectrum of the software design
process than the ER model:
(Omitted)
The database design process can be divided into six steps.
The ER model is most relevant to the first three steps:
1. Requirements Analysis
The very first step in designing a database
application is to understand what data is to be stored in the database,
what applications must be built on top of it, and what operations are
most frequent and subject to performance requirements. In other words,
we must find out what the users want from the database. This is usually
an informal process that involves discussions with user groups, a study
of the current operating environment and how it is expected to change,
analysis of any available documentation on existing applications that are
expected to be replaced or complemented by the database, and so On.Several methodologies have been proposed for organizing and presenting
the information gathered in this step, and some automated tools have been
developed to support this process.
as well as third-party vendors. For example! see the following link for
details on design and analysis tools from Sybase:
http://www.sybase.com/products/application_tools
The following provides details on Oracle's tools:
http://www.oracle.com/tools
2. Conceptual Design
The information gathered in the requirements
analysis step is used to develop a high-level description of the data
to be stored in the database, along with the constraints known to hold over
this data. This step is often carried out using the ER model and is discussed
in the rest of this chapter. The ER model is one of several high-level,
or semantic, data models used in database design. The goal is to create
a simple description of the data that closely matches how users and developers
think of the data (and the people and processes to be represented in
the data). This facilitates discussion among all the people involved in the
design process, even those who have no technical background. At the same
time, the initial design must be sufficiently precise to enable a straightforward
translation into a data model supported by a commercial database
system (which, in practice, means the relational model).
3. Logical Database Design
We must choose a DBMS to implement
our databctse design, and convert the conceptual database design into a
database schema in the data model of the chosen DBMS. We will consider
only relational DBMSs, and therefore, the task in the logical design step
is to convert an ER schema into a relational database schema. We discuss
this step in detail in Chapter 3; the result is a conceptual schema,
sometimes called the logical schema, in the relational data model.
4. Schema Refinement
The fourth step ill database design is to analyze
the collection of relations in our relational database schema to identify potential
problems, and to refine it. In contrast to the requirements analysis
and conceptual design steps, which are essentially subjective, schema refinement
can be guided by some elegant and powerful theory. \Ve discuss
the theory of normalizing relations-restructuring them to ensure some
desirable properties-in Chapter 19.
5. Physical Database Design
In this step, we consider typical expected
workloads that our database must support and further refine the database
design to ensure that it meets desired performance criteria. This step may
simply involve building indexes on some tables and clustering some tables,
or it may involve a substantial redesign of parts of the database schema
obtained from the earlier design steps. We discuss physical design and
database tuning in Chapter 20.
6. Application and Securiy Design
Any software project that involves
a DBMS must consider aspects of the application that go beyond the
database itself. Design methodologies like UML (Section 2.7) try to address
the complete software design and development cycle. Briefly, we must
identify the entities (e.g., users, user groups, departments) and processes
involved in the application. We must describe the role of each entity in every
process that is reflected in some application task, as part of a complete
workflow for that task. For each role, we must identify the parts of the
database that must be accessible and the parts of the database that must
not be accessible, and we must take steps to ensure that these access rules
are enforced. A DBMS provides several mechanisms to assist in this step,
and we discuss this in Chapter 21.
There are many kinds of diagrams in UML. Use case diagrams describe the
actions performed by the system in response to user requests, and the people
involved in these actions. These diagrams specify the external functionality
that the system is expected to support.
In the implementation phase, we must code each task in an application language
(e.g., Java), using the DBMS to access data. We discuss application development in Chapters 6 and 7.
In general, our division of the design process into steps should be seen as a
classification of the kinds of steps involved in design. Realistically, although
we might begin with the six step process outlined here, a complete database
design will probably require a subsequent tuning phase in which all six kinds
of design steps are interleaved and repeated until the design is satisfactory.
(Database Management Systems - 3rd Edition, 2.Introduction to database design, p26)
Contents in Context of Six steps above
PART 1.
2. INTRODUCTION TO DATABASE DESIGN
ER model used to create an initial design
Relationship between UML and ER model
3. THE RELATIONAL MODEL
How is data represented in the relational model(most used model in DBMS, but it's not only a model in DBMS)?
How can we create, modify, and query tables using SQL?
How do we obtain a relational database design from an ER diagram?
What is the foundation for relational query languages like SQL?
What is the difference between procedural and declarative languages?
4. RELATIONAL ALGEBRA AND CALCULUS
What is relational algebra, and why is it important?
What subset of mathematical logic is used in relational calculus, and how is it used to write queries?
5. SQL: QUERIES, CONSTRAINTS, TRIGGERS
How are queries expressed in SQL? How is the meaning of a query specified in the SQL standard?
How does SQL build on and extend relational algebra and calculus?
Part 2.
6. DATABASE APPLICATION DEVELOPMENT
How do application programs connect to a DBMS?
How can applications manipulate data retrieved from a DBMS?
How can applications modify data in a DBMS?
What is JDBC, SQLJ and how is it used?
What are stored procedures?
7. INTERNET APPLICATIONS
How do we name resources on the Internet?
How do Web browsers and webservers communicate?
How do we present documents on the Internet? How do we differentiate between formatting and content?
What is a three-tier application architecture? How do we write threetiered applications?
Why do we have application servers?
PART 3.
8. OVERVIEW OF STORAGE AND INDEXING
How does a DBMS store and access persistent data?
How does a DBMS organize files of data records on disk to minimize I/O costs?
What is the relationship between a file of data records and any indexes on this file of records?
9. STORING DATA: DISKS AND FILES
What are the different kinds of memory in a computer system?
What are the physical characteristics of disks and tapes, and how do they affect the design of database systems?
How does a DBMS keep track of space on disks? How does a DBMS access and modify data on disks?
What is the significance of pages as a unit of storage and transfer?
How does a DBMS create and maintain files of records?
How are records arranged on pages, and how are pages organized within a file?
10. TREE-STRUCTURED INDEXING
11.HASH-BASED INDEXING
PART 4.
12. OVERVIEW OF QUERY EVALUATION
What descriptive information does a DBMS store in its catalog?
What alternatives are considered for retrieving rows from a table?
Why does a DBMS implement several algorithms for each algebra operation?
What factors affect the relative performance of different
algorithms?
What are query evaluation plans and how are they represented?
Why is it important to find a good evaluation plan for a query? How is this done in a relational DBMS?
13. EXTERNAL SORTING
Why is sorting important in a DBMS?
Why is sorting data on disk different from sorting in-memory data?
How does external merge-sort work?
How do techniques like blockecl I/O and overlapped I/O affect the design of external sorting algorithms?
When can we use a B+ tree to retrieve records in sorted order?
14. EVALUATING RELATIONAL OPERATORS
What are the alternative algorithms for selection?
Which alternatives are best under different conditions? How are complex selection conditions handled?
How can we eliminate duplicates in projection? How do sorting and hashing approaches -compare?
What are the alternative join evaluation algorithms? Which alternatives are best under different conditions?
How are the set operations (union, inter;section, set-difference, crossproduct) implemented?
How are aggregate operations and grouping handled?
How does the size of the buffer pool and the buffer replacement policy affect algorithms for evaluating relational operators?
15. A TYPICAL RELATIONAL QUERY OPTIMIZER
How are SQL queries translated into relational algebra?
As a consequence, what class of relation algebra queries does a query optimizer concentrate on?
What information is stored in the system catalog of a DBMS and how is it used in query optimization?
How does an optimizer estimate the cost of a query evaluation plan?
How does an optimizer generate alternative plans for a query?
What is the space of plans considered? What is the role of relational algebra equivalences in generating plans?
How are nested SQL queries optimized?
16. OVERVIEW OF TRANSACTION MANAGEMENT
What four properties of transactions does a DBMS guarantee?
Why does a DBMS interleave transactions?
What is the correctness criterion for interleaved execution?
What kinds of anomalies can interleaving transactions cause?
How does a DBMS use locks to ensure correct interleavings?
What is the impact of locking on performance?
What SQL commands allow programmers to select transaction characteristics and reduce locking overhead?
How does a DBMS guarantee transaction atomicity and recoveryfrom system crashes?
17. CONCURRENCY CONTROL
How does Strict 2PL ensure serializability and recoverability?
How are locks implemented in a DBMS?
What are lock conversions and why are they important?
How does a DBMSresolve deadlocks?
How do current systerns deal with the phantom problerrl?
Why are specialized locking techniques used on tree indexes?
How does multiple-granularity locking work?
What is Optimistic concurrency control?
What is Timestarrlp-Ba..')ed concurrency control?
What is Multiversion concurrency control?
18. CRASH RECOVERY
What steps are taken in the ARIES method to recover fronl a DBMS crash?
How is the log rnaintained during nonnal operation?
How is the log used to recover frorn a crash?
What infonnation in addition to the log is used during recovery?
What is a checkpoint and why is it used?
What happens if repeated crashes occur during recovery?
How is media failure handled?
How does the recovery algorithnl interact with concurrency control?
PART 5.
19. SCHEMA REFINEMENT AND NORMAL FORMS
What problems are caused by redundantly storing information?
What are functional dependencies?
What are nornlal forms and what is their purpose?
What are the benefits of BCNF and 8NF?
What are the considerations in decolllposing relations into appropriate normal forms?
Where does normalization fit in the process of database design?
Are luore general dependencies useful in database design?
20. PHYSICAL DATABASE DESIGN AND TUNING
What is physical database design?
What is a query workload?
How do we choose indexes? What tools are available?
What is co-clustering and how is it used?
What are the choices in tuning a database?
How do we tune queries and view?
What is the impact of concurrency on perforrnance?
How can we reduce lock contention and hotspots?
That are popular database benchnlarks and how are they used?
21. SECURIY AND AUTHORIZATION
What are the rnain security considerations in designing a database application?
What IIlechanisms does a DBNIS provide to control a user's access to data?
What is discretionary access control and how is it supported in SQL?
What are the weaknesses of discretionary access control?
How are these addressed in lnandatory access control?
What are covert channels and how do they cornpromise lnandatory access control?
What must the DBA do to ensure security?
What is the added security threat when a database is accessed remotely?
What is the role of encryption in ensuring secure access?
How is it used for certifying servers and creating digital sig11atures?
PART 6.
22. PARALLEL AND DISTRIBUTED DATABASES
23. OBJECT-DATABASE SYSTEMS
24. DEDUCTIVE DATABASES
25. DATA WAREHOUSING AND DECISION SUPPORT
26. DATA MINING
27. INFORMATION RETRIEVAL AND XML DATA
28. SPATIAL DATA MANAGEMENT
29. FURTHER READING
30. THE MINIBASE SOFTWARE
10 Things you absolutiely need to do
There are 10 essentials:
1. Understand the database’s purpose.
Despite the popularity of Scrum, rapid application development (RAD), and numerous other rapid-development techniques, you still need to know why you’re creating the database. The first essential is that you discover and understand the database’s purpose before doing anything else. Saying that the database’s purpose is “to store records” isn’t good enough. You need to understand the business reason for this database’s existence before it goes into development, much less production. The business reason needs to map not only to the technical specifications but also
to the business purpose and the business processes that this database will be supporting. For example, will it be a retail point of sale (POS) database, an HR database, or part of a customer relationship management (CRM) package that will help you track customers and manage the sales cycle? You need to know this and many more details about why you’re creating the database before you start the design process.
2. Get the right tool.
You can’t create a viable database without software tools, no more than you could build a house without construction tools. So, you’re going to need data modeling software. Data modeling software, also called a CASE tool, is the data modeler’s and data designer’s best friend. A good CASE tool can help you quickly create different types of data models—such as functional decompositions, system and context diagrams, business process and data flow diagrams, and entity relationship models—each of which will help you communicate to your colleagues the visual components of the proposed system you’re building. Some CASE tools have features that enable sharing, coordination, merging, and version control for design teams. Last, but certainly not least, these tools effectively document what you’re doing and, by implication, why you’re doing it.
The cost of entry into the CASE tool market is not insignificant. However, the initial investment will be paid back in terms of shortened time-to-market for database projects and increased knowledge of corporate data and processes. You can read about six different CASE tools in "Comparative Review: Sizing Up Data Modeling Software". The comparison is based on capability and price, with some indication of how long it might take you to learn the package.
There is one thing to remember, though: No modeling tool will ever replace a person who understands how the business works. The CASE tool will only assist in creating visual representations of the business processes and structure.
3. Gather the requirements for the database.
Once you understand the overarching reason why you’re doing the database project and you’ve selected a tool that will help you visualize the as-is and to-be environments, you need to do a deep dive into requirements gathering. You need to understand the requirements well enough to be able to create visual representations of data, processes, and systems.
When gathering the requirements, don’t limit yourself to disk drives, virtual machines (VMs), and other technical requirements. Although you need to know the technical requirements, they aren’t pivotal to good database design. The database that you’re creating has to support business operations, so you need to interview company staff members and managers to understand how the business runs. Only then can you get a handle on the requirements that you’ll need to meet in order to create a viable database.
When you’re gathering requirements, you’re going to encounter conflicting business needs—count on it. You’ll have to wear your diplomat hat to get the parties involved to agree on some sort of compromise so that the project can move forward.
4. Be conscientious when modeling the data.
My favorite part of database design is modeling the data—that is, creating structures that will hold distinctly different data sets (entities or tables) and representing the relationships between pairs of these data sets. When you’re modeling the data, you’ll have a chance to reaffirm or correct what you found when gathering the requirements.
There’s not enough space here to do a deep dive into the details of how to model data, so I’ll point out the highlights and some constructs that you shouldn’t avoid simply because they seem a little complicated.
Point 1. Based on the answer to why you’re building the database, you need to use either transactional modeling techniques (for an OLTP database) or dimensional modeling techniques (for a relational data warehouse). In the SQL Server Magazine archive, you can find a wealth of information about how to do both types of data modeling. The Learning Path box lists some of those articles.
Point 2. You need to perform data modeling at multiple levels. You should create the following three models:
- Conceptual data model (CDM). The CDM contains the semantics of the information domain that you’re modeling. (Taken from the Greek word semantiká, semantics refers to the study of meaning.) The CDM defines the scope of the database. Entity classes—the distinct data sets I mentioned earlier—represent things of interest to the organization, and the relationships between the entity pairs assert business rules and regulations. The CDM expresses both the as-is and to-be states and is often created during the Discovery stage.
- Logical data model (LDM). The LDM describes the structure of the information domain. It includes criteria such as attribute set details, data types, key status, and individual attribute nullability.
- Physical data model (PDM).The PDM defines how the data will be physically stored. You need to map the LDM to a specific database management system (DBMS) platform.
Point 3. This point is relevant for transactional data models only: You need to normalize the data and model to third normal form (3NF). Failure to model to 3NF in the early stages of design restricts your understanding of how data items interact with each other and most likely will inhibit a full understanding of the business requirements that the database must support. Once you move to later levels of design (e.g., the PDM), you can denormalize back to second normal form (2NF) if you must. Denormalizing a transactional data model isn’t recommended, however. It results in data duplication, which causes insert, update, and deletion anomalies and data synchronization problems. But if you must denormalize (for production efficiencies, perhaps), you’ll at least know exactly where the data duplication will occur and where to expect data synchronization problems. For more information about 2NF, 3NF, and data normalization, see "SQL By Design: Why You Need Database Normalization".
Point 4. Every transactional database contains four integrities: primary key integrity, referential integrity, domain integrity, and business rules integrity. Whether these integrities are maintained will determine if your data integrity will last beyond the first data load into the database. It will also determine whether the database can support the business rules and requirements that you gathered. For more information about the four integrities, see the Learning Path box.
Point 5. Never use a printed report as a template for a table. This is a common mistake that database design novices often make. Take, for example, an invoice. An invoice contains three distinct data sets:
- Sales data. This data set includes information such as the date, time, and location of the sale, and the total invoice amount (with or without discounts, taxes, and shipping and handling).
- Items sold. This data set includes information such as quantity, per-item cost, total cost of items, and references to items’ descriptions.
- Product in inventory. This data set includes a complete description of each product and other information that’s necessary to maintaining an inventory.
Yet, in production databases, I’ve seen tables named Invoice that mix components of all three data sets. This is detrimental to data integrity and confusing for the office staff. A printed report is a business requirement and one of the outcomes of your database design. You can create a view, indexed or otherwise, to model a business report, but a business report shouldn’t be used as a template for a database table.
Point 6. Don’t be afraid to include supertype and subtype entities in your design in the CDM and onward. The subtypes represent classifications or categories of the supertype, such as employees who are staff members and employees who are authors. They’re both employees, and they’re both entitled to the employee benefits package. Yet some staff members are authors and some aren’t; authors have additional relationships and attributes that the staff members don’t have. Entities are represented as subtypes when it takes more than a single word or phrase to categorize the entity.
If a category has a life of its own, with separate attributes that describe how the category looks and behaves and separate relationships with other entities, then it’s time to invoke the supertype/subtype structure. Failure to do so will inhibit a complete understanding of the data and the business rules that drive data collection.
When it comes to implementing the supertype/subtype structure, the way you’ve modeled the business requirements and the relationships will determine whether you create one or many tables in the PDM. In this example, if you create an Employee table and an Author table, you could be duplicating data for the individual who is both a staff member and an author. This duplication would lead to nonkey data redundancy within the database, which inevitably causes insert, update, and deletion anomalies, unsynchronized data, and loss of data integrity. For more information about how to properly use supertypes and subtypes, see "Implementing Supertypes and subtypes."
Point 7. You must look out for multivalued attributes (MVAs), both expressed and implied. In its latest rendering, Microsoft Access has introduced the MVA structure. In doing so, it has veered away from the principles of relational data modeling and has truly become a nonrelational database. I recently reviewed a real estate system for a business that wanted to upgrade from Access to SQL Server. The database schema was riddled with MVAs. For instance, the building entity had an array of attributes named Picture1, Picture2, Deed of Record, Title, and so on—all within the same table. The owner wasn’t too surprised when he was told that SQL Server wouldn’t support the MVAs because his programmers had been having problems trying to access the attributes that were stored as MVA structures.
An MVA contains a collection of values for an attribute, as shown by the list of disks (e.g., Disk1Capacity, Disk2Capacity) in the SERVER table in Figure 1. This figure shows the schema for a server room inventory database. The solution to resolving this MVA is to flatten the structure so that each disk has its own row in a newly created table named DISK_DEVICES, as Figure 2 shows. The only constraint on the number of disks that the server can have is the number of bays in the server.
MVAs make retrieving summary data extremely difficult. They also cause problems when inserting, updating, and deleting data in a database. You can read more about MVAs in "Multivalued Attributes."
5. Enforce relationships.
The whole idea of a relational database is to identify and enforce relationships between entity pairs so that data integrity, as defined in the business rules and regulations and represented in the data models, is maintained. If you have the rule "each order must be associated with a current customer," you don’t want someone to write a program that lets a person place an order but not get a customer number. No application-level code can circumvent the rule to maintain data integrity.
The three types of relationships—one-to-one (1:1), one-to-many (1:M), and many-to-many (M:N)—are easily transitioned from the data model to the implemented database. For information about how to do this, see "Logical Modeling."
6. Use the appropriate data types.
There are a myriad of reasons why you want to use the proper data type for an attribute. Reduced disk storage space, data accuracy, the ability to join or union two tables together, attribute synchronization with the application layer, and internationalization capability are just a few of the reasons why you want to make sure that you’re defining each attribute’s data type correctly.
Technically, using the correct data type for an attribute speaks to domain integrity, which is one of the four database integrities I mentioned previously. When enforced, domain integrity ensures that an attribute contains only the set of valid values that are relevant and meaningful to that attribute. Many data modeling software packages let you create data domains and use them throughout the model. For example, you can create a Zip+4 domain that’s defined as char(5)+'-'+char(4). As a result of using the Zip+4 domain, every zip code instance in the database is defined with the same data type, instead of some instances defined as char(), some as varchar(), and some as int. Adding a NOT NULL characteristic to the Zip+4 domain further enforces the domain integrity rule that states each occurrence of a zip code in the database must contain a value.
Implementing domain integrity in SQL Server is a manual effort if you don’t have a data modeling package that lets you create and enforce data domains. The tools at your disposal include
- Defining appropriate data types and lengths, nullability, defaults, and check constraints when creating tables
- Using user-defined types, user-defined functions, and stored procedures that evaluate data when it’s being entered or used
- Using Declarative Referential Integrity (DRI) or triggers to enforce relationships between pairs of tables
- Using lookup tables to restrict data values when they’re inputted or edited
For additional ideas on how to enforce domain integrity, see "SQL By Design: The Four Integrities."
7. Include indexes when modeling.
When you’re building the PDM, you should include indexes. This is a bit of a chicken-and-egg situation because until you know how the data is going to be used, you can’t predict with certainty which columns will need indexing. But you’ve already gathered the requirements, so you should be able to make an intelligent guess as to which data should be indexed and how to index it. SQL Server automatically indexes (and clusters, by default) primary key constraints, but it doesn’t automatically index foreign key constraints. Therefore, the first set of indexes you want to create are those for the foreign keys in child tables. The next set would be for those columns on which you’re going to sort. In other words, you want to create indexes for columns that would be in ORDER BY clauses. From the gathered requirements, you should be able to determine which columns will be used for filtering (i.e., in the WHERE clause), so based on your understanding of how many rows are in each table, you can make an intelligent decision regarding which columns to index. For additional information about indexing, see “Indexing Dos and Don'ts”.
8. Standardize the naming convention.
If you don’t already have one, adopt a naming convention for all database objects, including tables, columns, indexes, keys, named constraints and relationships, and code. Not having a naming convention can lead to errors in data retrieval and can even inhibit productivity. A database schema is rarely documented and evolves over time; an established naming convention for the objects helps modulate schema confusion. Here are a few rules for creating useful database object names:
- Use names that are meaningful to the entire organization and clearly identify the database object.
- Don’t get too wordy. Use the minimum number of characters necessary to identify the database object.
- Don’t use names that implicitly or explicitly refer to more than one topic (for tables) or characteristic (for columns).
- Use the singular form of a name wherever possible, especially for entities and tables. This helps to correctly distinguish the entity-to-entity relationship (i.e., 1:1, 1:M, or M:N).
- Don’t use spaces in database object names (e.g., Employee ID). Other database management systems might not support the use of spaces in object names, thereby limiting transportability and cross-database integration.
When you choose a naming convention, remember that no one convention is perfect, but almost any naming convention is better than none at all. For additional information about naming conventions, see “Seven Deadly Sins”.
9. Store the code that touches the data in SQL Server.
Repeat after me, “I will not embed dynamic SQL code in my applications.” Doing so puts a database at risk for inconsistent and possibly inaccurate application of business rules and regulations. It also puts a database at risk for a SQL injection attack. In case you haven’t heard the term SQL injection before, it’s a security vulnerability that occurs whenever one programming or scripting language (in this case, T-SQL) is embedded inside another. A hacker can literally hijack the embedded T-SQL code, modify it on the fly, and gain access to the data in the database. The technique to do this is widely known and publicly available. For additional information about SQL injection, see the web-exclusive article "SQL Injection Attacks on the Rise."
The best way to avoid inconsistent rule application and SQL injection is to store code that touches a database’s data as a stored procedure or CLR object. When you need to add or modify data, you then call the appropriate stored procedure; the operation will execute the same way every time it’s called. For an introduction to the CLR, see the web-exclusive article "What's the CLR and Should I care?"
10.Document the work.
Documenting the work you do is probably you're least favorite activity. But if you're following through with the essentials that I've discussed, you're already documenting. Now, that wasn't so bad, was it?
Reap the Benefits
If you did a good job discovering the database’s purpose and its requirements, and if you were conscientious while you worked through the myriad steps of the modeling process, then your database design will likely stand up to everything that the organization can throw at it. The database will support all the business requirements and comply with all the integrity, security, and privacy rules and regulations that govern your industry. It’ll perform well and be easy to use and maintain. And as the organization evolves into different lines of business, you’ll be able to easily extend the database’s functionality and easily scale it up and out to store and serve up many times more data than it was originally designed for
(Copied from: http://www.itprotoday.com/software-development/database-design-essentials)
'데이터베이스' 카테고리의 다른 글
데이터 무결성 (0) | 2021.07.14 |
---|---|
데이터베이스의 설계 단계 (0) | 2021.07.14 |
스키마 정의 및 종류 (0) | 2021.07.14 |
병렬 데이터베이스의 분할 기법 (0) | 2021.07.13 |
django - mariadb on AWS EC2 연동 (0) | 2021.06.06 |