		| Paper: | 
		The Distributed Cloud Based Engine for Knowledge Discovery in Massive Archives of Astronomical Spectra | 
	 
	
		| Volume: | 
		512, Astronomical Data Analysis Software and Systems XXV | 
	 
	
		| Page: | 
		689 | 
	 
	
		| Authors: | 
		Škoda, P.; Koza, J.; Palička, A.; Lopatovský, L.; Peterka, T. | 
	 
	
	
		| Abstract: | 
		The current archives of large-scale spectroscopic surveys, such as SDSS or
 LAMOST, contain millions of spectra. Because some interesting objects (e.g.
 emission-line stars or quasars) can be identified only by checking the shapes of
 certain spectral lines, machine learning techniques have to be applied,
 complemented by flexible visualisation of results.
 
 We present VO-CLOUD, a distributed cloud-based engine that provides the
 user with a comfortable web-based environment for conducting machine
 learning experiments with different algorithms running on multiple nodes. It
 allows visual backtracking of the individual input spectra at different stages
 of preprocessing, which is important for checking the nature of outliers or
 the precision of classification.
 
 The engine consists of a single master server, which serves as the user portal,
 and several workers, which run various types of machine learning tasks. The
 master holds the database of users and their experiments, predefined
 configuration parameters for individual machine learning models, and a
 repository for data to be preprocessed. The workers have different
 capabilities, depending on the installed libraries and the hardware configuration of their host
 (e.g. number of CPU cores or GPU card type), and more workers may be added dynamically to
 provide new machine learning methods. | 
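The master/worker arrangement described in the abstract can be illustrated with a minimal sketch. It is not VO-CLOUD's actual code or API: the class names `Worker` and `Master`, their fields, and the selection rule (prefer the host with the most CPU cores) are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Worker:
    """A worker node, described by what its host can run (hypothetical fields)."""
    name: str
    methods: frozenset  # machine learning methods its installed libraries support
    cpu_cores: int = 1
    gpu: Optional[str] = None  # GPU card type, or None if the host has no GPU


@dataclass
class Master:
    """Single master: keeps a registry of workers and routes jobs to them."""
    workers: dict = field(default_factory=dict)

    def register(self, worker: Worker) -> None:
        # Workers can be added at any time, which may make new methods available.
        self.workers[worker.name] = worker

    def available_methods(self) -> set:
        # Union of the methods offered by all registered workers.
        return {m for w in self.workers.values() for m in w.methods}

    def pick_worker(self, method: str, needs_gpu: bool = False) -> Optional[Worker]:
        # Keep workers that support the method and meet the hardware requirement.
        candidates = [
            w for w in self.workers.values()
            if method in w.methods and (w.gpu is not None or not needs_gpu)
        ]
        # Assumed policy: prefer the most CPU cores; None if nobody qualifies.
        return max(candidates, key=lambda w: w.cpu_cores, default=None)


master = Master()
master.register(Worker("w1", frozenset({"random_forest"}), cpu_cores=8))
master.register(Worker("w2", frozenset({"neural_net"}), cpu_cores=4, gpu="example-gpu"))
chosen = master.pick_worker("neural_net", needs_gpu=True)
print(chosen.name if chosen else "no suitable worker")
```

In this sketch, registering a further worker with a method that no existing worker offers is enough to extend `available_methods()`, which mirrors the idea that new machine learning methods arrive with new workers.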
	 
	