<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
<title>Gene D. Cooperman</title>
<copyright>Copyright (c) 2012  All rights reserved.</copyright>
<link>http://works.bepress.com/gcooperman</link>
<description>Recent documents in Gene D. Cooperman</description>
<language>en-us</language>
<lastBuildDate>Sat, 24 Nov 2012 05:27:51 PST</lastBuildDate>
<ttl>3600</ttl>








<item>
<title>TOP-C: a task-oriented parallel C interface</title>
<link>http://works.bepress.com/gcooperman/13</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/13</guid>
<pubDate>Fri, 21 Jan 2011 08:37:07 PST</pubDate>
<description>
	<![CDATA[
	<p>The goal of this work is to simplify parallel application development, and thus ease the learning barriers faced by non-experts. It is especially useful where there is little data-parallelism to be recognized by a compiler. The applications programmer need learn the intricacies of only one primary subroutine in order to get the full benefits of the parallel interface. The applications programmer defines a high level concept, the task, that depends only on his application, and not on any particular parallel library. The task is defined by its three phases: (a) the task input, (b) sequential code to execute the task, and (c) any modifications of global variables that occur as a result of the task. In particular, side effects (which change global variable values) must not occur in phase (b). Forcing the user to re-organize his computation in these terms allows us to present the applications programmer with a single global environment visible to all processors (whether on an SMP or a NOW architecture), in the context of a master-slave architecture.</p>
<p>Both a shared memory implementation (running on an SGI or SUN Solaris architecture) and a NOW memory implementation (running on top of MPI) are described. The implementations were tested by a naive program for integer factorization, and by a more sophisticated Todd-Coxeter coset enumeration. Integer factorization was chosen so as to exercise the major features of TOP-C in an unambiguous context.</p>

	]]>
</description>

<author>Gene Cooperman</author>


</item>






<item>
<title>Transparent adaptive library-based checkpointing for master-worker style parallelism</title>
<link>http://works.bepress.com/gcooperman/12</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/12</guid>
<pubDate>Wed, 22 Dec 2010 14:11:22 PST</pubDate>
<description>
	<![CDATA[
	<p>We present a transparent, system-level checkpointing solution for master-worker parallelism that automatically adapts, upon restart, to the number of processor nodes available. This is important, since nodes in a cluster fail. It also allows one to adapt to using multiple cluster partitions and multiple resources from the Computational Grid, as they become available. Checkpointing a master-worker computation has the additional advantage of needing to checkpoint only the master process. This is both faster and more economical of disk space. This has been demonstrated by checkpointing Geant4, a million-line C++ program. Our solution has been implemented in the context of TOP-C (Task Oriented Parallel C/C++), a free, open-source parallel package, although it can easily be ported to additional master-worker packages.</p>

	]]>
</description>

<author>Gene Cooperman et al.</author>


</item>






<item>
<title>Parallelization of Geant4 using TOP-C and Marshalgen</title>
<link>http://works.bepress.com/gcooperman/11</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/11</guid>
<pubDate>Wed, 22 Dec 2010 14:11:20 PST</pubDate>
<description>
	<![CDATA[
	<p>Geant4 is a very large, highly accurate toolkit for Monte Carlo simulation of particle-matter interaction. It has been applied to high-energy physics, cosmic ray modeling, radiation shields, radiation therapy, mine detection, and other areas. Geant4 is being used to help design some high-energy physics experiments (notably CMS and ATLAS) to be run on the future Large Hadron Collider, the largest particle collider in the world. The parallelization, ParGeant4, represents a challenge due to the unique characteristics of Geant4: (i) complex object-oriented design; (ii) intrinsic use of templates and abstract classes to be instantiated later by the end user; (iii) large program with many developers; and (iv) frequent releases. The key issue for parallelization is not just how to parallelize "correctly" but also how to parallelize "with minimum effort". In addition, the parallelization should make as few assumptions about the source code as possible, due to the frequent release schedule of Geant4. We use TOP-C (Task Oriented Parallel C/C++) for parallelization and Marshalgen for marshaling/serialization. Some examples on a cluster of 100 nodes yielded a speedup of up to 94.4. The code’s portability, scalability and performance are also discussed.</p>

	]]>
</description>

<author>Gene Cooperman et al.</author>


</item>






<item>
<title>Fast query processing by distributing an index over CPU caches</title>
<link>http://works.bepress.com/gcooperman/10</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/10</guid>
<pubDate>Fri, 10 Dec 2010 13:15:32 PST</pubDate>
<description>
	<![CDATA[
	<p>Data-intensive applications on clusters often require that requests be sent quickly to the node managing the desired data. In many applications, one must look through a sorted tree structure to determine the responsible node for accessing or storing the data. Examples include object tracking in sensor networks, packet routing over the internet, request processing in publish-subscribe middleware, and query processing in database systems. When the tree structure is larger than the CPU cache, the standard implementation potentially incurs many cache misses for each lookup: one cache miss at each successive level of the tree. As the CPU-RAM gap grows, this performance degradation will only become worse in the future. We propose a solution that takes advantage of the growing speed of local area networks for clusters. We split the sorted tree structure among the nodes of the cluster. We assume that the structure will fit inside the aggregation of the CPU caches of the entire cluster. We then send a word over the network (as part of a larger packet containing other words) in order to examine the tree structure in another node’s CPU cache. We show that this is often faster than the standard solution, which locally incurs multiple cache misses while accessing each successive level of the tree.</p>
<p>The principle is demonstrated with a cluster configured with Pentium III nodes connected with a Myrinet network. The new approach is shown to be 50% faster on this current cluster. In the future, the new approach is expected to have a still greater advantage as networks grow in speed, and as cache lines grow in length (greater cache miss penalty). This can be used to successfully overcome the inherent memory latency associated with cache misses.</p>

	]]>
</description>

<author>Xiaoqin Ma et al.</author>


</item>






<item>
<title>Self-pulsing and chaos in distributed feedback bistable optical devices</title>
<link>http://works.bepress.com/gcooperman/9</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/9</guid>
<pubDate>Fri, 10 Dec 2010 13:15:31 PST</pubDate>
<description>
	<![CDATA[
	<p>We show that the light transmitted by a nonlinear distributed feedback structure can be steady (time independent), periodic, or chaotic depending on the intensity of the input cw beam. The feasibility of an experimental demonstration of such behavior is discussed.</p>

	]]>
</description>

<author>Herbert G. Winful et al.</author>


</item>






<item>
<title>End of year report : FY 2005</title>
<link>http://works.bepress.com/gcooperman/8</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/8</guid>
<pubDate>Fri, 10 Dec 2010 13:15:30 PST</pubDate>
<description>
	<![CDATA[
	
	]]>
</description>

<author>Gene Cooperman et al.</author>


<category>Corporation reports</category>

</item>






<item>
<title>Air shower simulation using GEANT4 and commodity parallel computing</title>
<link>http://works.bepress.com/gcooperman/7</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/7</guid>
<pubDate>Fri, 10 Dec 2010 13:15:29 PST</pubDate>
<description>
	<![CDATA[
	<p>We present an evaluation of a simulated cosmic ray shower, based on GEANT4 and TOP-C, which tracks all the particles in the shower. TOP-C (Task Oriented Parallel C) provides a framework for parallel algorithm development which makes tractable the problem of following each particle. This method is compared with a simulation program which employs the Hillas thinning algorithm.</p>

	]]>
</description>

<author>L. A. Anchordoqui et al.</author>


</item>






<item>
<title>Geant4 developments and applications</title>
<link>http://works.bepress.com/gcooperman/6</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/6</guid>
<pubDate>Fri, 10 Dec 2010 13:15:28 PST</pubDate>
<description>
	<![CDATA[
	<p>Geant4 is a software toolkit for the simulation of the passage of particles through matter. It is used by a large number of experiments and projects in a variety of application domains, including high energy physics, astrophysics and space science, medical physics and radiation protection. Its functionality and modeling capabilities continue to be extended, while its performance is enhanced. An overview of recent developments in diverse areas of the toolkit is presented. These include performance optimization for complex setups; improvements for the propagation in fields; new options for event biasing; and additions and improvements in geometry, physics processes and interactive capabilities.</p>

	]]>
</description>

<author>J. Allison et al.</author>


</item>






<item>
<title>Corrections to enhanced optical nonlinearity of superlattices</title>
<link>http://works.bepress.com/gcooperman/5</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/5</guid>
<pubDate>Fri, 10 Dec 2010 13:15:27 PST</pubDate>
<description>
	<![CDATA[
	<p>In recent publications, a large enhancement of the third-order nonlinear optical susceptibility was predicted for GaAs-GaAlAs superlattices, as a result of the band nonparabolicities introduced by the additional periodicity of the superlattice. These predictions, based on the tight binding model, are here extended to the more realistic Kronig-Penney model. Results show that corrections to tight binding are non-negligible; however, enhancements of χ⁽³⁾ are still large, but reduced by approximately 30%-50% relative to previous estimates.</p>

	]]>
</description>

<author>G. Cooperman et al.</author>


</item>






<item>
<title>DMTCP: transparent checkpointing for cluster computations and the desktop</title>
<link>http://works.bepress.com/gcooperman/3</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/3</guid>
<pubDate>Fri, 10 Dec 2010 13:15:26 PST</pubDate>
<description>
	<![CDATA[
	<p>DMTCP (Distributed MultiThreaded CheckPointing) is a transparent user-level checkpointing package for distributed applications. Checkpointing and restart are demonstrated for a wide range of over 20 well-known applications, including MATLAB, Python, TightVNC, MPICH2, OpenMPI, and runCMS. RunCMS runs as a 680 MB image in memory that includes 540 dynamic libraries, and is used for the CMS experiment of the Large Hadron Collider at CERN. DMTCP transparently checkpoints general cluster computations consisting of many nodes, processes, and threads, as well as typical desktop applications. On 128 distributed cores (32 nodes), checkpoint and restart times are typically 2 seconds, with negligible run-time overhead. Typical checkpoint times are reduced to 0.2 seconds when using forked checkpointing. Experimental results show that checkpoint time remains nearly constant as the number of nodes increases on a medium-size cluster.</p>
<p>DMTCP automatically accounts for fork, exec, ssh, mutexes/semaphores, TCP/IP sockets, UNIX domain sockets, pipes, ptys (pseudo-terminals), terminal modes, ownership of controlling terminals, signal handlers, open file descriptors, shared open file descriptors, I/O (including the readline library), shared memory (via mmap), parent-child process relationships, pid virtualization, and other operating system artifacts. By emphasizing an unprivileged, user-space approach, compatibility is maintained across Linux kernels from 2.6.9 through the current 2.6.28. Since DMTCP is unprivileged and does not require special kernel modules or kernel patches, DMTCP can be incorporated and distributed as a checkpoint-restart module within some larger package.</p>

	]]>
</description>

<author>Jason Ansel et al.</author>


</item>






<item>
<title>Adaptive checkpointing for master-worker style parallelism (extended abstract)</title>
<link>http://works.bepress.com/gcooperman/4</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/4</guid>
<pubDate>Fri, 10 Dec 2010 13:15:26 PST</pubDate>
<description>
	<![CDATA[
	
	]]>
</description>

<author>Gene Cooperman et al.</author>


</item>






<item>
<title>End of year report : FY 2004</title>
<link>http://works.bepress.com/gcooperman/2</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/2</guid>
<pubDate>Fri, 10 Dec 2010 13:15:25 PST</pubDate>
<description>
	<![CDATA[
	
	]]>
</description>

<author>Gene Cooperman et al.</author>


<category>Corporation reports</category>

</item>






<item>
<title>A new current-voltage relation for duct precipitators valid for low and high current densities</title>
<link>http://works.bepress.com/gcooperman/1</link>
<guid isPermaLink="true">http://works.bepress.com/gcooperman/1</guid>
<pubDate>Fri, 10 Dec 2010 13:15:24 PST</pubDate>
<description>
	<![CDATA[
	<p>A closed-form analytic current-voltage formula for duct electrostatic precipitators is presented. A short discussion of previous theoretical and numerical solutions is given, followed by an explanation of the theoretical formula derived here. A comparison with experimental data is then given, showing that the present formula is accurate over a wide range of conditions, including wide plate spacing.</p>

	]]>
</description>

<author>Gene Cooperman</author>


</item>





</channel>
</rss>
