Information Management

From CS2001 Wiki

IM/InformationModels [core]

IM1. Information models and systems [core]

Minimum core coverage time: 4 hours

Topics:

  • Information storage and retrieval (IS&R)
  • Information management applications
  • Information capture and representation
  • Metadata/schema association with data
  • Analysis and indexing
  • Search, retrieval, linking, navigation
  • Declarative and navigational queries
  • Information privacy, integrity, security, and preservation
  • Scalability, efficiency, and effectiveness
  • Concepts of Information Assurance (data persistence, integrity)

Learning objectives:

  1. Compare and contrast information with data and knowledge.
  2. Critique/defend a small- to medium-size information application with regard to its satisfying real user information needs.
  3. Show uses of explicitly stored metadata/schema associated with data.
  4. Explain uses of declarative queries.
  5. Give a declarative version for a navigational query.
  6. Describe several technical solutions to the problems related to information privacy, integrity, security, and preservation.
  7. Explain measures of efficiency (throughput, response time) and effectiveness (recall, precision).
  8. Describe approaches to ensure that information systems can scale from the individual to the global.
  9. Identify issues of data persistence to an organization.
  10. Describe vulnerabilities to data integrity in specific scenarios.
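
Objective 7 names recall and precision as measures of effectiveness. The following is a minimal sketch of how the two could be computed for a single query, assuming the retrieved results and the relevance judgments are both available as sets of document identifiers. The function name and the identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved & relevant| / |retrieved|
    recall    = |retrieved & relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 of them relevant,
# out of 6 relevant documents in the whole collection.
p, r = precision_recall({"d1", "d2", "d3", "d4"},
                        {"d1", "d2", "d3", "d7", "d8", "d9"})
print(p, r)  # 0.75 0.5
```

Returning 0.0 for an empty result set or an empty relevance set is a convention of this sketch, not a standard definition.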

IM/DatabaseSystems [core]

Minimum core coverage time: 3 hours

Topics:

  • History and motivation for database systems
  • Components of database systems
  • DBMS functions
  • Database architecture and data independence
  • Use of a declarative query language

Learning objectives:

  1. Explain the characteristics that distinguish the database approach from the traditional approach of programming with data files.
  2. Cite the basic goals, functions, models, components, applications, and social impact of database systems.
  3. Describe the components of a database system and give examples of their use.
  4. Identify major DBMS functions and describe their role in a database system.
  5. Explain the concept of data independence and its importance in a database system.
  6. Use a declarative query language to elicit information from a database.

IM/DataModeling [core]

Minimum core coverage time: 4 hours

Topics:

  • Data modeling
  • Conceptual models (such as entity-relationship or UML)
  • Object-oriented model
  • Relational data model
  • Semistructured data model (expressed using DTD or XMLSchema, for example)

Learning objectives:

  1. Categorize data models based on the types of concepts that they provide to describe the database structure—that is, conceptual data model, physical data model, and representational data model.
  2. Describe the modeling concepts and notation of the entity-relationship model and UML, including their use in data modeling.
  3. Describe the main concepts of the OO model such as object identity, type constructors, encapsulation, inheritance, polymorphism, and versioning.
  4. Define the fundamental terminology used in the relational data model.
  5. Describe the basic principles of the relational data model.
  6. Illustrate the modeling concepts and notation of the relational data model.
  7. Describe the differences between relational and semistructured data models.
  8. Give a semistructured equivalent (e.g., in DTD or XMLSchema) for a given relational schema.

IM/Indexing [elective]

Topics:

  • The massive impact of indexes on query performance
  • The basic structure of an index
  • Keeping a buffer of data in memory
  • Creating indexes with SQL
  • Indexing text
  • Indexing the web (how search engines work)

Learning objectives:

  1. Generate an index file for a collection of resources.
  2. Explain the role of an inverted index in locating a document in a collection.
  3. Explain how stemming and stop words affect indexing.
  4. Identify appropriate indices for a given relational schema and query set.
  5. Estimate the time to retrieve information when indices are used, compared with when they are not.
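
Objectives 1 to 3 can be illustrated together with a small inverted index. The sketch below is deliberately simplified. The stop list is illustrative, `naive_stem` is a toy suffix stripper written for this example (it is not Porter's algorithm or any library stemmer), and the three documents are made up.

```python
from collections import defaultdict

STOP_WORDS = {"a", "an", "and", "of", "the", "to", "in"}  # illustrative stop list


def naive_stem(word):
    """Toy stemmer (an assumption for this sketch): strip a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_inverted_index(docs):
    """Map each stemmed, non-stop-word term to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            token = token.strip(".,;:!?")
            if token and token not in STOP_WORDS:
                index[naive_stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


docs = {
    1: "Indexing the web",
    2: "The indexes of a database",
    3: "Searching text",
}
index = build_inverted_index(docs)
print(index["index"])  # [1, 2]
```

Stemming maps "Indexing" and "indexes" to the same term, so a lookup for "index" finds both documents. Stop words such as "the" never enter the index, which keeps it smaller but makes those words unsearchable.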

IM/RelationalDatabases [elective]

Topics:

  • Mapping conceptual schema to a relational schema
  • Entity and referential integrity
  • Relational algebra and relational calculus

Learning objectives:

  1. Prepare a relational schema from a conceptual model developed using the entity-relationship model.
  2. Explain and demonstrate the concepts of entity integrity constraint and referential integrity constraint (including definition of the concept of a foreign key).
  3. Demonstrate use of the relational algebra operations from mathematical set theory (union, intersection, difference, and cartesian product) and the relational algebra operations developed specifically for relational databases (select (restrict), project, join, and division).
  4. Demonstrate queries in the relational algebra.
  5. Demonstrate queries in the tuple relational calculus.
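
As a rough illustration of objectives 3 and 4, the select, project, and join operations can be sketched over relations held as lists of dictionaries. This representation, the function names, and the sample relations are assumptions made for the example. A real DBMS evaluates these operators very differently.

```python
def select(relation, predicate):
    """Selection (restrict): keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attrs):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def natural_join(r, s):
    """Natural join: combine tuples that agree on all shared attribute names."""
    result = []
    for t in r:
        for u in s:
            shared = set(t) & set(u)
            if all(t[a] == u[a] for a in shared):
                result.append({**t, **u})
    return result


# Hypothetical relations for illustration.
student = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bob"}]
enrolled = [
    {"sid": 1, "course": "DB"},
    {"sid": 2, "course": "OS"},
    {"sid": 1, "course": "OS"},
]

# Names of students enrolled in "OS": project[name](select[course='OS'](student JOIN enrolled))
os_names = project(
    select(natural_join(student, enrolled), lambda t: t["course"] == "OS"),
    ["name"],
)
print(os_names)  # [{'name': 'Ann'}, {'name': 'Bob'}]
```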

IM/QueryLanguages [elective]

Topics:

  • Overview of database languages
  • SQL (data definition, query formulation, update sublanguage, constraints, integrity)
  • QBE and 4th-generation environments
  • Embedding non-procedural queries in a procedural language
  • Introduction to Object Query Language
  • Stored procedures

Learning objectives:

  1. Create a relational database schema in SQL that incorporates key, entity integrity, and referential integrity constraints.
  2. Demonstrate data definition in SQL and retrieving information from a database using the SQL SELECT statement.
  3. Evaluate a set of query processing strategies and select the optimal strategy.
  4. Create a non-procedural query by filling in templates of relations to construct an example of the desired query result.
  5. Embed object-oriented queries into a stand-alone language such as C++ or Java (e.g., SELECT Col.Method() FROM Object).
  6. Write a stored procedure that takes parameters and uses some control flow to provide a given functionality.
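
Objectives 1 and 2 might be exercised with a sketch like the following, which uses Python's standard `sqlite3` module and an in-memory database. The `department` and `employee` tables are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, and other database systems differ in such details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Data definition: primary keys plus a foreign key for referential integrity.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Research');
INSERT INTO employee VALUES (10, 'Ann', 1);
""")

# An employee that points at a missing department violates referential integrity.
try:
    conn.execute("INSERT INTO employee VALUES (11, 'Bob', 99)")
    violated = False
except sqlite3.IntegrityError:
    violated = True

# Declarative retrieval with SELECT.
rows = conn.execute(
    "SELECT e.name, d.name FROM employee e "
    "JOIN department d ON e.dept_id = d.dept_id"
).fetchall()
print(violated, rows)  # True [('Ann', 'Research')]
```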

IM/RelationalDatabaseDesign [elective]

Topics:

  • Database design
  • Functional dependency
  • Decomposition of a schema; lossless-join and dependency-preservation properties of a decomposition
  • Candidate keys, superkeys, and closure of a set of attributes
  • Normal forms (1NF, 2NF, 3NF, BCNF)
  • Multivalued dependency (4NF)
  • Join dependency (PJNF, 5NF)
  • Representation theory

Learning objectives:

  1. Determine the functional dependency between two or more attributes that are a subset of a relation.
  2. Connect constraints expressed as primary keys and foreign keys with functional dependencies.
  3. Compute the closure of a set of attributes under given functional dependencies.
  4. Determine whether or not a set of attributes forms a superkey and/or candidate key for a relation with given functional dependencies.
  5. Evaluate a proposed decomposition to determine whether it has the lossless-join and dependency-preservation properties.
  6. Describe what is meant by 1NF, 2NF, 3NF, and BCNF.
  7. Identify whether a relation is in 1NF, 2NF, 3NF, or BCNF.
  8. Normalize a 1NF relation into a set of 3NF (or BCNF) relations and denormalize a relational schema.
  9. Explain the impact of normalization on the efficiency of database operations, especially query optimization.
  10. Describe what a multivalued dependency is and what type of constraints it specifies.
  11. Explain why 4NF is useful in schema design.
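
Objectives 3 and 4 lend themselves to a short worked example. The sketch below computes an attribute closure by repeatedly applying functional dependencies until nothing changes, then uses it to test for superkeys and candidate keys. The relation R(A, B, C, D) and its dependencies are hypothetical, and attributes are single characters only to keep the example compact.

```python
def closure(attrs, fds):
    """Closure of `attrs` under the functional dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, relation_attrs, fds):
    """A superkey functionally determines every attribute of the relation."""
    return closure(attrs, fds) >= set(relation_attrs)


def is_candidate_key(attrs, relation_attrs, fds):
    """A candidate key is a superkey from which no attribute can be removed."""
    attrs = set(attrs)
    return is_superkey(attrs, relation_attrs, fds) and not any(
        is_superkey(attrs - {a}, relation_attrs, fds) for a in attrs
    )


# Hypothetical relation R(A, B, C, D) with A -> B and B -> C.
R = "ABCD"
fds = [("A", "B"), ("B", "C")]
print(sorted(closure("A", fds)))        # ['A', 'B', 'C']
print(is_superkey("AD", R, fds))        # True
print(is_candidate_key("ABD", R, fds))  # False: AD alone is already a superkey
```

Removing one attribute at a time is enough for the minimality check, because any subset of a non-superkey is also not a superkey.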

IM/TransactionProcessing [elective]

Topics:

  • Transactions
  • Failure and recovery
  • Concurrency control

Learning objectives:

  1. Create a transaction by embedding SQL into an application program.
  2. Explain the concept of implicit commits.
  3. Describe the issues specific to efficient transaction execution.
  4. Explain when and why rollback is needed and how logging assures proper rollback.
  5. Explain the effect of different isolation levels on the concurrency control mechanisms.
  6. Choose the proper isolation level for implementing a specified transaction protocol.
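
Objectives 1 and 4 can be sketched with SQL embedded in a small Python program using the standard `sqlite3` module. The `account` table and the `transfer` function are hypothetical. The example shows only that a failed statement causes the whole transaction to be rolled back. It does not show the logging a DBMS uses internally to make that possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute("SELECT balance FROM account ORDER BY id").fetchall()


failed = transfer(conn, 1, 2, 500)   # the debit breaks the CHECK, so the credit is undone
after_failed = balances(conn)
ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
print(failed, after_failed, ok, after_ok)
```

In the failed transfer the credit to account 2 has already executed when the debit violates the CHECK constraint, so the rollback is what keeps the two balances consistent.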



IM/DistributedDatabases [elective]

Topics:

  • Distributed data storage
  • Distributed query processing
  • Distributed transaction model
  • Concurrency control
  • Homogeneous and heterogeneous solutions
  • Client-server

Learning objectives:

  1. Explain the techniques used for data fragmentation, replication, and allocation during the distributed database design process.
  2. Evaluate simple strategies for executing a distributed query to select the strategy that minimizes the amount of data transfer.
  3. Explain how the two-phase commit protocol is used to deal with committing a transaction that accesses databases stored on multiple nodes.
  4. Describe distributed concurrency control based on the distinguished copy techniques and the voting method.
  5. Describe the three levels of software in the client-server model.


IM/PhysicalDatabaseDesign [elective]

Topics:

  • Storage and file structure
  • Indexed files
  • Hashed files
  • Signature files
  • B-trees
  • Files with dense index
  • Files with variable length records
  • Database efficiency and tuning

Learning objectives:

  1. Explain the concepts of records, record types, and files, as well as the different techniques for placing file records on disk.
  2. Give examples of the application of primary, secondary, and clustering indexes.
  3. Distinguish between a nondense index and a dense index.
  4. Implement dynamic multilevel indexes using B-trees.
  5. Explain the theory and application of internal and external hashing techniques.
  6. Use hashing to facilitate dynamic file expansion.
  7. Describe the relationships among hashing, compression, and efficient database searches.
  8. Evaluate costs and benefits of various hashing schemes.
  9. Explain how physical database design affects database transaction efficiency.


IM/DataMining [elective]

Topics:

  • The usefulness of data mining
  • Associative and sequential patterns
  • Data clustering
  • Market basket analysis
  • Data cleaning
  • Data visualization


Learning objectives:


  1. Compare and contrast different conceptions of data mining as evidenced in both research and application.
  2. Explain the role of finding associations in commercial market basket data.
  3. Characterize the kinds of patterns that can be discovered by association rule mining.
  4. Describe how to extend a relational system to find patterns using association rules.
  5. Evaluate methodological issues underlying the effective application of data mining.
  6. Identify and characterize sources of noise, redundancy, and outliers in presented data.
  7. Identify mechanisms (on-line aggregation, anytime behavior, interactive visualization) to close the loop in the data mining process.
  8. Describe why the various close-the-loop processes improve the effectiveness of data mining.
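
The role of associations in market basket data can be made concrete with the two usual measures for an association rule, support and confidence. The following is a minimal sketch over a made-up set of four baskets. The function names are hypothetical, and no frequent-itemset algorithm such as Apriori is implemented here.

```python
def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one basket.
    """
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Hypothetical market basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support({"bread", "milk"}, baskets))       # 0.5
print(confidence({"bread"}, {"milk"}, baskets))  # 2 of the 3 bread baskets contain milk
```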


IM/InformationStorageAndRetrieval [elective]

Topics:

  • Characters, strings, coding, text
  • Documents, electronic publishing, markup, and markup languages
  • Tries, inverted files, PAT trees, signature files, indexing
  • Morphological analysis, stemming, phrases, stop lists
  • Term frequency distributions, uncertainty, fuzziness, weighting
  • Vector space, probabilistic, logical, and advanced models
  • Information needs, relevance, evaluation, effectiveness
  • Thesauri, ontologies, classification and categorization, metadata
  • Bibliographic information, bibliometrics, citations
  • Routing and (community) filtering
  • Search and search strategy, information seeking behavior, user modeling, feedback
  • Information summarization and visualization
  • Integration of citation, keyword, classification scheme, and other terms
  • Protocols and systems (including Z39.50, OPACs, WWW engines, research systems)

Learning objectives:

  1. Explain basic information storage and retrieval concepts.
  2. Describe what issues are specific to efficient information retrieval.
  3. Give applications of alternative search strategies and explain why the particular search strategy is appropriate for the application.
  4. Perform Internet-based research.
  5. Design and implement a small- to medium-size information storage and retrieval system.

IM/Hypermedia [elective]

Topics:

  • Hypertext models (early history, web, Dexter, Amsterdam, HyTime)
  • Link services, engines, and (distributed) hypertext architectures
  • Nodes, composites, and anchors
  • Dimensions, units, locations, spans
  • Browsing, navigation, views, zooming
  • Automatic link generation
  • Presentation, transformations, synchronization
  • Authoring, reading, and annotation
  • Protocols and systems (including web, HTTP)

Learning objectives:

  1. Summarize the evolution of hypertext and hypermedia models from early versions up through current offerings, distinguishing their respective capabilities and limitations.
  2. Explain basic hypertext and hypermedia concepts.
  3. Demonstrate a fundamental understanding of information presentation, transformation, and synchronization.
  4. Compare and contrast hypermedia delivery based on protocols and systems used.
  5. Design and implement web-enabled information retrieval applications using appropriate authoring tools.


IM/MultimediaSystems [elective]

Topics:

  • Devices, device drivers, control signals and protocols, DSPs
  • Applications, media editors, authoring systems, and authoring
  • Streams/structures, capture/represent/transform, spaces/domains, compression/coding
  • Content-based analysis, indexing, and retrieval of audio, images, and video
  • Presentation, rendering, synchronization, multi-modal integration/interfaces
  • Real-time delivery, quality of service, audio/video conferencing, video-on-demand

Learning objectives:

  1. Describe the media and supporting devices commonly associated with multimedia information and systems.
  2. Explain basic multimedia presentation concepts.
  3. Demonstrate the use of content-based information analysis in a multimedia information system.
  4. Critique multimedia presentations in terms of their appropriate use of audio, video, graphics, color, and other information presentation concepts.
  5. Implement a multimedia application using a commercial authoring system.


IM/DigitalLibraries [elective]

Topics:

  • Digitization, storage, and interchange
  • Digital objects, composites, and packages
  • Metadata, cataloging, author submission
  • Naming, repositories, archives
  • Spaces (conceptual, geographical, 2/3D, VR)
  • Architectures (agents, buses, wrappers/mediators), interoperability
  • Services (searching, linking, browsing, and so forth)
  • Intellectual property rights management, privacy, protection (watermarking)
  • Archiving and preservation, integrity

Learning objectives:

  1. Explain the underlying technical concepts in building a digital library.
  2. Describe the basic service requirements for searching, linking, and browsing.
  3. Critique scenarios involving appropriate and inappropriate use of a digital library, and determine the social, legal, and economic consequences for each scenario.
  4. Describe some of the technical solutions to the problems related to archiving and preserving information in a digital library.
  5. Design and implement a small digital library.

Copyright © 2008, ACM, Inc. and IEEE, Inc.
