|   | 
				
					
	
		  | 
	 
	
		| Paper: | 
		Benchmarking the CRBLASTER  Computational Framework on the 350-MHz 49-core Maestro Development Board | 
	 
	
		| Volume: | 
		461, Astronomical Data Analysis Software and Systems XXI | 
	 
	
		| Page: | 
		13 | 
	 
	
		| Authors: | 
		Mighell, K. J. | 
	 
	
	
		| Abstract: | 
		I describe the performance of the CRBLASTER computational framework on a 350-MHz 49-core Maestro Development Board (MBD). The 49-core Interim Test Chip (ITC) was developed by the U.S. Government and is based on the intellectual property of the 64-core TILE64 processor of the Tilera Corporation. The Maestro processor is intended for use in the high radiation environments found in space; the ITC was fabricated using IBM 90-nm CMOS 9SF technology and Radiation-Hardening-by-Design (RHDB) rules. CRBLASTER is a parallel-processing cosmic-ray rejection application based on a simple computational framework that uses the high-performance computing industry standard Message Passing Interface (MPI) library. CRBLASTER was designed to be used by research scientists to easily port image-analysis programs based on embarrassingly-parallel algorithms to a parallel-processing environment such as a multi-node Beowulf cluster or multi-core processors using MPI. I describe my experience of porting CRBLASTER to the 64-core TILE64 processor, the Maestro simulator, and finally the 49-core Maestro  processor itself. Performance comparisons using the ITC are presented between emulating all floating-point operations in software and doing all floating point operations with hardware assist from an IEEE-754 compliant Aurora FPU (floating point unit) that is attached to each of the 49 cores. Benchmarking of the CRBLASTER computational framework using the memory-intensive L.A.COSMIC cosmic ray rejection algorithm and a computational-intensive Poisson noise generator reveal subtleties of the Maestro hardware design. Lastly, I describe the importance of using real scientific applications during the testing phase of next-generation computer hardware; complex real-world scientific applications can stress hardware in novel ways that may not necessarily be revealed while executing simple applications or unit tests. | 
	 
	
		| 
			
			
		 | 
	 
	
		  | 
	 
 
					 
				 | 
				  |