Conference papers

Scalability and the Real World: Lessons Learned Optimizing ATLAS Reconstruction and Simulation Performance on Multicore CPUs

Abstract : Modern computer architectures have evolved towards multi-core, multi-socket CPUs. Making optimal use of the CPU resources of these machines is a major challenge for complex scientific applications. The software of the ATLAS experiment at the LHC [1] is an example of a large, complex and resource-hungry scientific application. Given the amount of existing ATLAS code (millions of lines of code in thousands of libraries), any optimization that requires major code overhauls is impractical. AthenaMP [2] is a non-intrusive approach to coarse-grained parallelism, designed primarily to reduce memory usage (ATLAS applications require up to 2 GB of memory to run). AthenaMP uses Linux process forking to create a farm of event-processing workers and relies on the Copy-on-Write mechanism to share physical memory among the workers and the parent. The parent process also acts as work scheduler and merger of the output. The use of multiprocessing instead of multithreading is what makes our approach non-intrusive: since each process exists in its own memory address space, the existing code can run in parallel without the notorious synchronization problems of multi-threaded applications. Having demonstrated the feasibility of AthenaMP, we have studied its performance on various many-core platforms. A major effort has been made to boost the performance of AthenaMP on Intel quad-core Nehalem CPU systems, which appear to be the platform of choice for the next round of ATLAS hardware procurement. In this study, we show that significant scaling improvements are possible for large-scale scientific applications without major changes in the software design. Major improvements are also available from exploiting new architectural and OS features. A thorough understanding of non-ideal scaling can be reached by using tools like Intel's Performance Tuning Utility (PTU) to profile and analyze all aspects of the complex software on many-core machines.
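The fork-based worker farm described in the abstract can be illustrated with a minimal sketch. This is not AthenaMP code: the function name `run_worker_farm`, the `shared_table` argument (a stand-in for large read-only state initialised before forking), the round-robin event assignment and the pipe-based merging are all assumptions made for illustration. It requires a POSIX system, since it calls `os.fork`.

```python
import os
import pickle


def run_worker_farm(events, n_workers, shared_table):
    """Fork n_workers children, give each a slice of events, merge results.

    shared_table is built in the parent before forking, so the children
    inherit it through the kernel's copy-on-write page sharing instead of
    each loading a private copy.
    """
    children = []
    for w in range(n_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Worker: process a round-robin slice (hypothetical scheduling).
            status = 1
            try:
                os.close(read_fd)
                out = {e: shared_table[e % len(shared_table)] * 2
                       for e in events[w::n_workers]}
                with os.fdopen(write_fd, "wb") as pipe:
                    pickle.dump(out, pipe)
                status = 0
            finally:
                # Leave immediately without running the parent's cleanup code.
                os._exit(status)
        # Parent: keep only the read end of this worker's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    # Parent acts as merger: collect each worker's output, then reap it.
    merged = {}
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            merged.update(pickle.load(pipe))
        os.waitpid(pid, 0)
    return merged
```

Calling `run_worker_farm(list(range(8)), 2, [10, 20, 30, 40])` processes eight toy events in two forked workers. Because each worker has its own address space, the per-event code needs no locking. Note that in CPython, reference counting writes to object headers, so pages holding Python objects may still be copied when they are read; the sketch illustrates the process structure rather than the memory saving.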
Document type :
Conference papers
Contributor : Sabine Starita
Submitted on : Friday, November 19, 2010 - 11:29:55 AM
Last modification on : Wednesday, September 16, 2020 - 4:21:36 PM


  • HAL Id : in2p3-00537763, version 1




M. Tatarkhanov, S. Binet, P. Calafiura, K. Jackson, W. Lavrijsen, et al. Scalability and the Real World: Lessons Learned Optimizing ATLAS Reconstruction and Simulation Performance on Multicore CPUs. 2010 Nuclear Science Symposium and Medical Imaging Conference, Oct 2010, Knoxville, United States. ⟨in2p3-00537763⟩


