Abstract: Monte Carlo simulations (MCS) play a key role in medical applications, especially in emission imaging (EI) and radiotherapy (RT). However, MCS require long calculation times and for this reason are not currently used in routine clinical practice. Relying on computer clusters to meet this computational demand is not realistic in a routine clinical environment. Graphics processing units (GPUs) have recently become, in many domains, a cheap way to obtain high computational power. The objective of this work was to develop an efficient framework for implementing MCS on GPU architectures. Geant4 was used as the MCS toolkit, targeting medical applications in imaging and radiotherapy. We propose a global strategy and an associated structure for such a GPU-based simulation. The different steps of a Geant4 simulation were implemented on the GPU. The simulation was designed to use one thread per particle, which is equivalent to processing a stack of particles in parallel. For efficiency, the stack is processed in successive stages, corresponding to particle generation, navigation, and physical interaction. This "stacking" approach synchronizes all particles on the same process and hence reduces the impact of the conditional branching caused by the stochastic nature of MCS. The first validations have shown that the underlying physics processes are equivalent in Geant4 and in the GPU code. Based on these simplified simulations, we expect a speedup factor of over 200 for a complete simulation in emission tomography or in radiotherapy dosimetry.
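The stage-by-stage stack processing described above can be illustrated with a minimal sketch. It is not the authors' GPU code and does not use Geant4: it is a plain Python, CPU-only toy in which each stack slot stands in for one GPU thread and each stage function stands in for one kernel launch. The 1D slab geometry, the constants `SLAB_THICKNESS_CM`, `MU_PER_CM`, and `ABSORB_PROB`, and all function names are hypothetical choices made only for illustration.

```python
import math
import random

# Hypothetical toy parameters (assumptions, not values from the source).
SLAB_THICKNESS_CM = 10.0   # uniform 1D slab
MU_PER_CM = 0.2            # total attenuation coefficient
ABSORB_PROB = 0.3          # chance that an interaction absorbs the particle


def make_stack(n):
    """Structure-of-arrays particle stack; slot i plays the role of thread i."""
    return {
        "x": [0.0] * n,        # position along the slab axis (cm)
        "dir": [1.0] * n,      # direction cosine along that axis
        "alive": [False] * n,  # True while the slot holds an active particle
        "fate": [None] * n,    # final outcome once the particle stops
    }


def generate(stack):
    """Stage 1: every slot creates a particle at the slab entrance."""
    for i in range(len(stack["x"])):
        stack["x"][i] = 0.0
        stack["dir"][i] = 1.0
        stack["alive"][i] = True


def navigate(stack, rng):
    """Stage 2: every active slot samples a free path and moves along it."""
    for i in range(len(stack["x"])):
        if not stack["alive"][i]:
            continue
        # Exponential free path; 1 - random() lies in (0, 1], so log is defined.
        step = -math.log(1.0 - rng.random()) / MU_PER_CM
        x = stack["x"][i] + stack["dir"][i] * step
        if x >= SLAB_THICKNESS_CM:
            stack["alive"][i], stack["fate"][i] = False, "transmitted"
        elif x < 0.0:
            stack["alive"][i], stack["fate"][i] = False, "backscattered"
        else:
            stack["x"][i] = x


def interact(stack, rng):
    """Stage 3: every active slot is absorbed or scattered to a new direction."""
    for i in range(len(stack["x"])):
        if not stack["alive"][i]:
            continue
        if rng.random() < ABSORB_PROB:
            stack["alive"][i], stack["fate"][i] = False, "absorbed"
        else:
            stack["dir"][i] = rng.uniform(-1.0, 1.0)


def simulate(n, seed=0):
    """Run all slots through the same stage at the same time until none is active."""
    # One shared generator is a simplification; a GPU version would typically
    # keep an independent random state per thread.
    rng = random.Random(seed)
    stack = make_stack(n)
    generate(stack)
    while any(stack["alive"]):
        navigate(stack, rng)
        interact(stack, rng)
    return stack


def tally(stack):
    """Count how many particles ended in each outcome."""
    counts = {"absorbed": 0, "transmitted": 0, "backscattered": 0}
    for fate in stack["fate"]:
        counts[fate] += 1
    return counts
```

Calling `tally(simulate(1000, seed=1))` returns the number of absorbed, transmitted, and backscattered particles. The point of the layout is that within each stage every slot runs the same function, so the only divergence is the active/inactive check, rather than each particle following its own sequence of generation, navigation, and interaction code.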