Fundamentals of Multimedia
Ze-Nian Li, Mark S. Drew and Jiangchuan Liu

Preface to the Third Edition

In the seventeen years since the first edition of Fundamentals of Multimedia, the field and applications of multimedia have flourished and are undergoing ever more rapid growth and evolution in various emerging interdisciplinary areas. However, a comprehensive textbook to aid the continuous learning and mastering of the fundamental concepts and knowledge in multimedia remains essential.

While the original edition was published by Prentice-Hall, starting from the second edition we have chosen Springer, a prestigious publisher that has a superb and rapidly expanding array of computer science textbooks, particularly the high-quality, dedicated and established textbook series: Texts in Computer Science, of which this textbook forms a part. The second edition included considerable added depth to the networking aspect of the book. To this end Dr. Jiangchuan Liu was added to the team of authors.

This third edition again constitutes a significant revision: the textbook has been thoroughly revised and updated to include recent developments in the field. For example, we updated the introduction to some of the current multimedia tools, and we included current topics such as 360° video and the video coding standard H.266, new-generation social, mobile and cloud computing for human-centric interactive multimedia, augmented reality and virtual reality, deep learning for multimedia processing, and their attendant technologies.

Multimedia is associated with a rich set of core subjects in Computer Science and Engineering, and we address those here. The book is not an introduction to simple design considerations and tools - it serves a more advanced audience than that. On the other hand the book is not a reference work - it is more a traditional textbook. While we perforce may discuss multimedia tools, we would like to give a sense of the underlying issues at play in the tasks those tools carry out. Students who undertake and succeed in a course based on this text can be said to really understand fundamental matters in regard to this material, hence the title of the text.

In conjunction with this text, a full-fledged course should also allow students to make use of this knowledge to carry out interesting or even wonderful practical projects in multimedia, interactive projects that engage and sometimes amuse and, perhaps, even teach these same concepts.

Who should read this book?

This text aims to introduce the basic ideas used in multimedia to an audience that is comfortable with technical applications, e.g., Computer Science students and Engineering students. The book aims to cover an upper-level undergraduate multimedia course, but could also be used in more advanced courses. Indeed, a (quite long) list of courses making use of the first two editions of this text includes many undergraduate courses, as well as graduate courses that use the text as a pertinent point of departure for students who may not have encountered these ideas before in a practical way. As well, the book would be a good reference for anyone, including those in industry, who is interested in current multimedia technologies. The selection of material in the text addresses real issues that these learners will be facing as soon as they show up in the workplace. Some topics are simple, but new to the students; some are somewhat complex, but unavoidably so in this emerging area.

The text mainly presents concepts, not applications. A multimedia course, on the other hand, teaches these concepts, and tests them, but also allows students to utilize skills they already know, in coding and presentation, to address problems in multimedia. The accompanying website materials for the text include some code for multimedia applications along with some projects students have developed in such a course, plus other useful materials best presented in electronic form.

Have the authors used this material in a real class?

Since 1996, we have taught a third-year undergraduate course in Multimedia Systems based on the introductory materials set out in this book. A one-semester course very likely could not include all the material covered in this text, but we have usually managed to consider a good many of the topics addressed, with mention made of a selected number of issues in Parts 3 and 4, within that time frame.

As well, over the same time period and again as a one-semester course, we have also taught a graduate-level course using notes covering topics similar to the ground covered by this text, as an introduction to more advanced materials. A fourth-year or graduate-level course would do well to discuss material from the first three Parts of the book and then consider some material from the last Part, perhaps in conjunction with some of the original research references included here along with results presented at topical conferences.

We have attempted to fill both needs, concentrating on an undergraduate audience but including more advanced material as well. Sections that can safely be omitted on a first reading are marked with an asterisk in the Table of Contents.

What is covered in this text?

In Part 1, Introduction and Multimedia Data Representations, we introduce some of the notions included in the term Multimedia, and look at its present as well as its history. Practically speaking, we carry out multimedia projects using software tools, so in addition to an overview of multimedia software tools we get down to some of the nuts and bolts of multimedia authoring. The representation of data is critical in the study of multimedia, and we look at the most important data representations for use in multimedia applications. Specifically, graphics and image data, video data, and audio data are examined in detail. Since color is vitally important in multimedia programs, we see how this important area impacts multimedia issues.

In Part 2, Multimedia Data Compression, we consider how we can make all this data fly onto the screen and speakers. Multimedia data compression turns out to be a very important enabling technology that makes modern multimedia systems possible. Therefore we look at lossless and lossy compression methods, supplying the fundamental concepts necessary to fully understand these methods. For the latter category, lossy compression, arguably JPEG still-image compression standards, including JPEG 2000, are the most important, so we consider these in detail. But since a picture is worth 1,000 words, and video is therefore worth more than a million words per minute, we examine the ideas behind the MPEG standards MPEG-1, MPEG-2, MPEG-4, MPEG-7, and beyond into modern video coding standards H.264, H.265, and H.266. Audio compression is treated separately and we consider some basic audio and speech compression techniques and take a look at MPEG Audio, including MP3 and AAC.

In Part 3, Multimedia Communications and Networking, we consider the great demands that multimedia communication and content sharing place on networks and systems. The Internet, however, was not initially designed for multimedia content distribution and there are significant challenges to be addressed. We discuss the wired Internet and wireless mobile network technologies and protocols, and the enhancements to them that make multimedia communications possible. We further examine state-of-the-art multimedia content distribution mechanisms, as well as modern cloud computing for highly scalable multimedia data processing. The discussion also includes the latest edge computing and serverless computing solutions towards fine-grained and flexible real-time multimedia.

In Part 4, Human-Centric Interactive Multimedia, we examine a number of technologies that form the heart of enabling the new Web 2.0 paradigm, with rich user interactions. Such popular Web 2.0-based social media sharing websites as YouTube, Facebook, Twitter, Twitch, and TikTok have drastically changed the content generation and distribution landscape, and indeed have become an integral part of people's daily lives. Developments in the coding algorithms and hardware for sensing, communication, and interaction also empower Virtual Reality (VR) and Augmented Reality (AR), providing better immersive experiences beyond 3D. This Part examines these new-generation interactive multimedia services and discusses their potential and challenges. The huge amount of multimedia content also militates for multimedia-aware search mechanisms, and we therefore consider the challenges and mechanisms for multimedia content search and retrieval.

Textbook website

The book website is http://www.cs.sfu.ca/mmbook. There, the reader will find general information about the book including previous editions, an errata sheet updated regularly, programs that help demonstrate concepts in the text, and a dynamic set of links for the "Further Exploration" section in some of the chapters. Since these links are regularly updated, and of course URLs change quite often, the links are online rather than within the printed text.

Instructors' resources

The main text website has no ID and password, but access to sample student projects is at the instructor's discretion and is password-protected. For instructors, with a different password, the website also contains Course Instructor resources for adopters of the text. These include an extensive collection of online slides, solutions for the exercises in the text, sample assignments and solutions, sample exams, and extra exam questions.

Acknowledgements

We are most grateful to colleagues who generously gave of their time to review this text, and we wish to express our thanks to Edward Chang, Shu-Ching Chen, Qianping Gu, Mohamed Hefeeda, Rachelle S. Heller, Gongzhu Hu, S. N. Jayaram, Tiko Kameda, Joonwhoan Lee, Xiaobo Li, Jie Liang, Siwei Lu, Jiebo Luo, and Jacques Vaisey.

The writing of this text has been greatly aided by a number of suggestions and contributions from present and former colleagues and students. We would like to thank Mohamed Athiq, James Au, Yi Ching David Chou, Chad Ciavarro, Hossein Hajimirsadeghi, Hao Jiang, Mehran Khodabandeh, Steven Kilthau, Michael King, Tian Lan, Chenyu Li, Haitao Li, Cheng Lu, Minlong Lu, You Luo, Xiaoqiang Ma, Hamidreza Mirzaei, Peng Peng, Haoyu Ren, Ryan Shea, Chantal Snazel, Wenqi Song, Yi Sun, Dominic Szopa, Zinovi Tauber, Malte von Ruden, Fangxin Wang, Jian Wang, Jie Wei, Edward Yan, Osmar Zaïane, Cong Zhang, Lei Zhang, Miao Zhang, Wenbiao Zhang, Yuan Zhao, Ziyang Zhao, William Zhong, Qiang Zhu, and Yifei Zhu for their assistance. As well, Dr. Ye Lu made great contributions to Chapters 8 and 9; Andy Sun contributed Chapter 20. Their valiant efforts are particularly appreciated. We are also most grateful to the students who generously made their course projects available for instructional use for this book.