CS 885: Future Multicore Architectures and their Software
John L. Hennessy and David A. Patterson,
Computer Architecture: A Quantitative Approach,
Morgan Kaufmann Publishers, Fourth Edition, 2008.
- Introduction to Multiprocessors (May 9th week : 4.1)
- Advanced Optimization of Cache (May 16th week : 5.2, 5.3)
- AMD Opteron Memory Hierarchy (Optional)
- Symmetric Shared-Memory Architectures (May 16th week : 4.2)
- Distributed Shared-Memory Architectures (optional)
- Models of Memory Consistency: An Introduction (May 23rd week : 4.6)
- Synchronization (May 23rd week : 4.5)
- Multithreading: Exploiting ILP to support TLP (Section 3.5)
- Putting It All Together: Sun T1 (optional)
Related sections from Chapter 3, Chapter 4, Chapter 5 (Online PDF for SFU CS 885 students only)
Introduction
Herb Sutter,
The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software,
Dr. Dobb's Journal, 30(3), March 2005.
Html.
Introduction to Parallel Computing
LLNL Web Site
(html). Reference.
David A. Wood and Mark D. Hill,
Cost-Effective Parallel Computing,
IEEE Computer,
February 1995.
PDF
Herb Sutter and James Larus,
Software and the Concurrency Problem,
ACM Queue,
September 2005,
PDF
Queue - Multiprocessors,
ACM Queue,
September 2005
PDF Reference.
Multicore Processors
Poonacha Kongetira, Kathirgamar Aingaran,
Kunle Olukotun,
SUN Niagara : A 32-Way Multithreaded Sparc Processor,
IEEE Micro
PDF (http://www-hydra.stanford.edu/~kunle/publications/niagra_micro.pdf)
Kunle Olukotun, Basem Nayfeh, Lance Hammond, Ken Wilson, and
Kunyung Chang, The Case for a Single-Chip Multiprocessor,
Proceedings of the Seventh ACM Conference on Architectural Support
for Programming Languages and Operating Systems, ASPLOS 1996.
PDF.
Luiz Andre Barroso, et al.,
DEC's Multicore in 2000 (Piranha),
Proc. International Symposium on Computer Architecture, June 2000.
PDF Reference.
Programming
POSIX Threads Programming,
Web Site
(html). Reference.
LLNL OpenMP Tutorial,
Web Site
(html).
OpenMP: Simple, Portable, Scalable SMP Programming,
Web Site
(html). Reference.
CILK Programming,
Web Site
(html). Reference.
Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh,
and Anoop Gupta,
The SPLASH-2 Programs: Characterization and Methodological Considerations,
Proc. International Symposium on Computer Architecture,
June 1995.
PDF
John M. Mellor-Crummey and Michael L. Scott,
Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors
ACM Trans. on Computer Systems.
February 1991, pp. 21-65.
PDF
Rob von Behren, Jeremy Condit, Feng Zhou, George C. Necula, Eric Brewer,
Capriccio: Scalable Threads for Internet Services
Proc. Symposium on Operating System Principles,
October 2003.
PDF.
Suggested Weekly Meetings (Class: at least 2 of 3. Presenters:
all papers; you are welcome to include other papers. Weekly meeting topics
need to be presented in order. Individual papers can be presented in
any order.)
Instructions for presentation here
Multicore Processors
How much parallelism is available?
Emily Fortuna, Owen Anderson, Luis Ceze, Susan Eggers,
A Limit Study of JavaScript Parallelism
Proc. Intl. Symposium on Workload Characterization, Dec 2010
PDF
Geoffrey Blake, Ronald G. Dreslinski, Trevor Mudge and Krisztian
Flautner, Evolution of Thread-Level Parallelism in Desktop
Applications. The 37th ISCA, June 2010
PDF
Milind Kulkarni, Martin Burtscher, Rajasekhar Inkulu, Keshav Pingali
and Calin Cascaval, How Much Parallelism is There in Irregular
Applications? In PPoPP 2009
PDF
Caches
Changkyu Kim, Doug Burger and Stephen W. Keckler
An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated
On-Chip Caches, In ASPLOS 2002.
PDF
Lisa R. Hsu, Steven K. Reinhardt, Ravishankar Iyer and Srihari Makineni
Communist, Utilitarian, and Capitalist Cache Policies on CMPs: Caches as a Shared Resource
In PACT 2006
PDF
Niti Madan, Li Zhao, Naveen Muralimanohar et al.
Optimizing Communication and Capacity in a 3D Stacked Reconfigurable Cache Hierarchy
PDF
Hardware Coherence
Alan Charlesworth
The Sun Fireplane System Interconnect
PDF
Virtual Hierarchies to Support Server Consolidation,
Michael R. Marty and Mark D. Hill
International Symposium on Computer Architecture (ISCA), June 2007
PDF
Arun Raghavan, Colin Blundell, and Milo M. K. Martin
Token Tenure: PATCHing Token Counting Using Directory-Based Cache
Coherence. In MICRO 2008
PDF
Software-based Multiprocessors
Robert Stets, Sandhya Dwarkadas, Nikolaos Hardavellas et al.,
CASHMERE-2L: Software Coherent Shared Memory on a Clustered
Remote-Write Network. In SOSP'97
PS
Amza, C.; Cox, A.L.; Dwarkadas, S.; Keleher, P.; Honghui Lu;
Rajamony, R.; Weimin Yu; Zwaenepoel, W. TreadMarks: shared memory
computing on networks of workstations. In IEEE Computer
PDF
Steven K. Reinhardt, Robert W. Pfile, David A. Wood.
Decoupled Hardware Support for Distributed Shared Memory. In ISCA 1996
PDF
Deterministic Computing
Joseph Devietti, Brandon Lucia, Luis Ceze and Mark Oskin.
DMP: Deterministic Shared Memory Multiprocessing
In ASPLOS 2009
PDF
Marek Olszewski, Jason Ansel, Saman Amarasinghe.
Kendo: Efficient Deterministic Multithreading in Software.
In ASPLOS 2009
PDF
Derek R. Hower, Pablo Montesinos, Luis Ceze, Mark D. Hill, and Josep Torrellas
Two Hardware-based Approaches for Deterministic Multiprocessor Replay
PDF
Overview paper
Tom Bergan, Joseph Devietti, Nicholas Hunt and Luis Ceze.
The Deterministic Execution Hammer: How Well Does it Actually Pound Nails?
In Workshop on Determinism (WoDET w/ ASPLOS) 2012.
PDF
Energy-Efficient Processors
Hadi Esmaeilzadeh, Emily Blem, Renee St. Amant, Karu Sankaralingam, Doug Burger
Modeling Dark Silicon and the Limits of CMP Parallelism, In ISCA 2012
PDF
Rehan Hameed, Wajahat Qadeer, Megan Wachs, Omid Azizi, Ben Lee,
Stephen Richardson, Christos Kozyrakis, Mark Horowitz.
Understanding Sources of Inefficiency in General-Purpose Chips. In
ISCA 2010
PDF
Ganesh Venkatesh, John Sampson, Nathan Goulding et al.
Conservation Cores: Reducing the Energy of Mature Computations.
In ASPLOS 2010
PDF
Gaming Processors
XBOX 360 SYSTEM ARCHITECTURE
PDF
Jacob Leverich, Hideho Arakida, Alex Solomatnikov, Amin
Firoozshahian, Mark Horowitz, Christos Kozyrakis. Comparing Memory
Systems for Chip Multiprocessors, In ISCA 2007
PDF
Venkatraman Govindaraju, Peter Djeu, Karthikeyan Sankaralingam, Mary
Vernon, and William R. Mark. Toward a Multicore Architecture for
Real-time Ray-tracing. In MICRO 2008
PDF
Johns, C. R.; Brokenshire, D. A.
Introduction to the Cell Broadband Engine Architecture
PDF.
Entire Journal
GPU/Accelerator Architectures
GPU Tutorial
Webpage
Larrabee: A Many-Core x86 Architecture for Visual Computing
Intel
PDF
Carbon: architectural support for fine-grained parallelism on chip
multiprocessors. In ISCA 2007
PDF
System-Level Energy Management
ECOSystem: Managing Energy as a First Class Operating System
Resource. In ASPLOS 2002
PDF
David Meisner, Brian T. Gold, and Thomas F. Wenisch.
PowerNap: Eliminating Server Idle Power.
PDF
Kai Shen, Arrvindh Shriraman, Sandhya Dwarkadas and Xiao Zhang
Energy/Power Containers for Multicores
PDF
Energy-Efficient Memory/Storage
Benjamin C. Lee, Engin Ipek, Onur Mutlu and Doug Burger
Architecting Phase Change Memory as a Scalable DRAM Alternative. In
ISCA 2009
PDF
Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, Carla Ellis
Power Aware Page Allocation
PDF
Adrian Sampson, Werner Dietl, Emily Fortuna, Danushen Gnanapragasam, Luis Ceze, Dan Grossman.
EnerJ: Approximate Data Types for Safe and General Low-Power
Computation.
PDF
Energy Management in Cellphones
Aaron Carroll and Gernot Heiser
An Analysis of Power Consumption in a Smartphone, In USENIX 2010
PDF
Alex Shye, Ben Scholbrock, Gokhan Memik.
Into the Wild: Studying Real User Activity Patterns to Guide Power
Optimizations for Mobile Architectures.
PDF
Jason Flinn and M. Satyanarayanan.
Energy-Aware Adaptation for Mobile Applications. In SOSP 1999.
PDF