|
Files and database topics
|
> > |
|
|
File processing
|
> > |
- File structures
- Record
- Fixed
- Variable length
- Field
- File access
- performance issues
- Index
- Primary
- Secondary
- Clustering
- Indexing techniques
- B-trees
- VSAM
- Dynamic, multilevel indexes
- Hashing
- Indexing challenges
- Files with dense index
- Files with variable length records
- Storage requirements for different atomic data types
|
|
- I/O operations
- physical and logical files
- buffer management
|
< < |
- file structures
- file access
- performance issues
- indexed files
- B-trees
- indexed sequential access
- B+ trees
- VSAM files
- hashing
|
> > |
|
|
Database systems
- Database vs. flat files
- Data independence
|
|
|
< < |
|
|
- Database architecture
- Types of databases
|
< < |
-
- Relational databases (reference)
|
> > |
|
|
-
- Object-oriented databases
- Rule-based databases
|
< < |
-
- Multimedia databases (reference)
- Textual databases
|
> > |
|
|
|
< < |
Data and information modeling
- Types of data models: conceptual, logical, physical
- Scalability
- Conceptual models
- Entity-relationship models
- Enhanced entity-relationship models
- Semantic data models
- Object-oriented data models
- Analysis of data requirements
- Identification of business rules
- Logical data models (including relational and object-relational data model)
- Physical data models
- Traditional data models: hierarchical, network
- Data models for data integration (data warehousing, data marts)
- Dimensional models
- Star schema
- Standardized modeling
|
> > |
- Data and information modeling
- Data model
- Conceptual data model / Semantic data model
- Entity-relationship model
- entity type
- relationship type
- attribute type
- Enhanced entity-relationship model
- Object-oriented data model
- Specific modeling languages
|
|
|
< < |
|
|
-
- Patterns and standard models
|
< < |
- CASE tools in data modeling
- Meta-modeling
|
> > |
-
-
-
- CASE Tools in data modeling
- Analysis of data requirements
- Identification of business rules
|
|
|
< < |
- Schema Architecture
- fact based
- relational
- hierarchical
- network
Relational databases
- Mapping conceptual schema to a relational schema
- Representing relationships
- Integrity
- Referential
- Data item
- Entity
- Intra relation???
- Relational algebra and relational calculus
|
> > |
-
-
- Logical data model
- Hierarchical data model
- Network data model
- Relational data model
- Database schema
- Relation
- base relation
- virtual relation
- Relational structure
- attribute
- domain
- constraint
- entity integrity
- referential integrity
- domain integrity
- functional dependency
- Database constraint
- Content
- Relational manipulation operations
- relational algebra operations
|
|
-
- union, intersection, difference, Cartesian product
- select, project, join, division
|
> > |
|
|
- Relational database design
|
< < |
-
- Functional dependency
- Normal forms (1NF, 2NF, 3NF, BCNF)
- Multivalued dependency (4NF)
- Join dependency (PJNF, 5NF)
- Domain Key NF???
- Second order relations???
- Representation theory???
Database languages
|
> > |
-
-
-
-
-
- Mapping conceptual schema to a relational schema
- Normalization
- Normal form
- Anomaly
- Multivalued dependency
- Join dependency
- Physical data model
- Data model for data integration (data warehousing, data marts)
- Dimensional model
- Star schema
- Database languages
|
|
|
< < |
|
|
|
< < |
|
> > |
|
|
- Data definition languages (DDL)
- Data manipulation languages (DML)
- SQL
|
< < |
-
- Data definition language??? (should this be here, too)
- Data manipulation language??? (should this be here, too)
- Query formulation
- Constraints and integrity enforcement
- Database (persistent) programming languages??? (is this still relevant?)
- Query optimization
- SQL performance tuning/optimization
|
> > |
-
-
- Data definition language
- Constraints
- Integrity enforcement
- Data manipulation language
- Optimization techniques
|
|
- QBE and 4th-generation environments
|
< < |
|
|
- Reporting languages and tools
|
< < |
- XML
- Schema definition languages
|
> > |
-
- Persistent programming languages
- Object Query Language
- XQuery
- XPath
|
|
- Stored procedures
- Triggers
- Embedding non-procedural queries in a procedural language
|
< < |
Transaction processing
|
> > |
|
|
|
> > |
|
|
-
- Failure and recovery
- Concurrency control
|
< < |
Distributed databases
|
> > |
|
|
- Distributed data storage
- Data fragmentation
- Data replication
|
|
- Distributed concurrency control
- Distinguished copy technique
- Voting method
|
< < |
- Homogeneous, heterogeneous, and federated databases
|
> > |
-
- Homogeneous
- Heterogeneous
|
|
-
- Data translation
- Program translation
|
> > |
|
|
|
< < |
Physical database design
- Storage and file structure
- Record
- Fixed and variable length
- Record type
- File
- Indexed files
- Primary, secondary, and clustering indexes
- Hashed files
- External and internal hashing techniques
- Signature files
- B-trees
- Dynamic, multilevel indexes
- Files with dense index
- Files with variable length records
|
> > |
|
|
- Database efficiency and tuning
|
< < |
- Storage requirements for different data types: characters, numbers, strings, text, sound, and video
|
|
- Characteristics of physical storage devices
- Data compression
|
< < |
|
> > |
|
|
Decision support
- On-line analytical processing
|
|
|
< < |
-
- Associative and sequential patterns
- Data clustering
- Market basket analysis
|
> > |
-
- Patterns
- Association rules
- Clustering
- Frequent sets
|
|
-
- Data cleaning
- Data visualization
- Effects of data problems on data mining results
|
< < |
-
-
- Noise, redundancy, and outliers
|
> > |
-
-
- Noise
- Redundancy
- Outliers
|
|
|
< < |
Storage and retrieval of unstructured and semistructured information
|
> > |
Storage and retrieval of unstructured information
|
|
- Content analysis and indexing
- Classification and categorization
|
> > |
-
-
- Classification techniques
|
|
-
-
- Metadata
- Thesauri
- Ontologies
|
|
-
- Morphological analysis, stemming, phrases, stop lists
- Term frequency distributions, uncertainty, fuzziness, weighting
- Vector space, probabilistic, logical, and advanced models
|
< < |
-
- Textual information summarization and visualization
|
> > |
-
- Summarization and visualization
|
|
-
- Abstracting methods
- Dictionaries
- Information search and retrieval
- Effectiveness: precision and recall
- Clustering
- Information filtering
|
< < |
|
|
-
- Relevance feedback
- Retrieval process
|
> > |
|
|
-
- Search process and strategy
- Selection process
|
< < |
-
- Information seeking behavior
|
|
|
< < |
-
- Information needs, relevance, evaluation, effectiveness
- Characters, strings, coding, text
|
> > |
-
-
- Information seeking behavior
- Information need analysis
|
|
- Documents, electronic publishing
|
< < |
- Concept of document markup and markup languages
|
|
- Routing and (community) filtering
- Protocols and systems (including Z39.50, OPACs, WWW engines, research systems)
|
> > |
Storage and retrieval of semistructured information
- Web data
- Markup language
- HTML
- SGML
- XML
- tagging
- document nodes
- element nodes
- attribute nodes
- text nodes
- document order
- well-formedness
- namespace
- DTD
- XML Schema
- Validity
- Simple types
- Complex types
- Anonymous types
- Key
- Keyref
- Query and restructuring language
|
|
Hypertext and hypermedia
- Hypertext models (early history, web, Dexter, Amsterdam, HyTime?)
- Link services, engines, and (distributed) hypertext architectures
|
|
- Dimensions, units, locations, spans
- Browsing, navigation, views, zooming
- Automatic link generation
|
< < |
- Presentation, transformations, synchronization (of what)???
|
> > |
- Presentation, transformations, synchronization
|
|
- Authoring, reading, and annotation
|
< < |
- Protocols and systems (including web, HTTP)???
|
> > |
- Protocols and systems (including web, HTTP)
|
|
Multimedia information and systems (should multimedia information and systems be separated???)
- Devices, device drivers, control signals and protocols, DSPs
|
|
Managing the Database Environment
- Roles and responsibilities of database administrator function
- Database administration
|
< < |
|
> > |
-
- Transaction management and concurrency
|
|
|
> > |
|
|
-
- Backup, recovery, and restart
- Redundancy
- Replication
- Logging
|
> > |
|
|
|
> > |
|
|
-
-
- Prevention of unauthorized access
- Protection against malware
- Privacy
|
< < |
-
- Ownership and access control
|
> > |
-
- Ownership and access control; authorization techniques
|
|
- Data management audits
- Data management architectures
|
|
|
< < |
|
|
Database Application Interface
|
< < |
|
|
|
< < |
-
- Web services (??? Does this belong here)
|
> > |
|
|
|
> > |
[needs to be completed]
|
|
Special purpose databases
- Temporal databases
- Spatial databases and GIS
- Scientific databases
- Statistical databases
|
> > |
|
|
|
> > |
|
|
Knowledge management
|
|
- Object representation
- Primitive data items
|
> > |
|