Scientific Software Days 2007
The Texas Advanced Computing Center, the University of Texas Institute for Geophysics, and the University of Texas Bureau of Economic Geology are organizing the 2007 Scientific Software Days, where the scientific community can share experiences of developing software and learn about new developments in general scientific software.
9:00 – 9:15. The computing resources at TACC – Karl Schultz
In this presentation, we will briefly introduce the computing resources available at TACC for computational researchers. These resources include a large distributed-memory Linux cluster (Lonestar), an IBM Power5 system (Champion) which supports smaller shared-memory programs, and a Sun E25K system (Maverick) for large-scale scientific visualization. Highlights of these systems, along with some select performance indicators, will be presented in conjunction with the baseline scientific applications that are available.
9:15 – 9:45. The PETSc and SLEPc linear algebra libraries – Victor Eijkhout
PETSc, the Portable, Extensible Toolkit for Scientific Computation, is a large library of parallel linear algebra operations. It includes high-level linear and nonlinear system solvers, as well as low-level operations that allow a user to code new algorithms. There is a profiling interface, error checking and support for parallel debugging, and a large number of examples. While PETSc focuses mostly on iterative linear solvers, various direct solver packages can easily be interfaced. SLEPc, the Scalable Library for Eigenvalue Problem Computations, is a library written on top of PETSc. It offers the same simple interface to a large number of eigenvalue methods, including those of external packages that can be interfaced to SLEPc. In this presentation I will give a brief overview of the capabilities of these two packages, and give the flavour of how they are used in application codes.
9:45 – 10:15. LabVIEW as a platform for scientific computing – Jim Nagle (National Instruments)
LabVIEW is well known for data acquisition and instrument control, but has been less commonly applied to scientific computing problems. LabVIEW includes libraries that cover many different areas of signal processing and mathematics, and supports several different "models" of computation, including math script, simulation, state charts, and object-oriented programming, in addition to LabVIEW's dataflow language. This talk will cover LabVIEW's library support and demonstrate the use of math script and dataflow together to solve problems involving nonlinear regression and differential equations.
10:15 – 10:30. coffee break
10:30 – 11:00. Scientific Computing with Python – Eric Jones (Enthought)
Python has emerged as an excellent choice for scientific computing because of its simple syntax, ease of use, and elegant multi-dimensional array arithmetic. Its interpreted evaluation allows it to serve as both the development language and the command line environment in which to explore data. Python also excels as a "glue" language that bonds algorithms from multiple languages such as C, C++, and Fortran into a single tool, a common need in the scientific arena. In this talk, we will look at a subset of the Open Source Python tools that are commonly used in scientific computing. These include:
- NumPy: Multi-dimensional array support including powerful indexing and vector math capabilities.
- SciPy: Algorithms for linear algebra, signal processing, optimization, integration, etc.
- Chaco: Interactive 2D visualization for plotting scientific data.
- Mayavi: Interactive 3D visualization based on VTK.
- Envisage: Plug-in based framework for building scriptable and extensible applications.
Together, this tool suite provides a platform for building applications that are simple for the novice to use, yet flexible enough to serve an expert's needs. Enthought has used them to build commercial tools in the fields of Electromagnetics, Fluid Dynamics, Geophysics, and Quantitative Financial Analysis. We'll demonstrate several of these tools as examples of finished applications.
Founded in 2001, Enthought, Inc. is a scientific computing company based in Austin, Texas. They supply solutions in a wide range of fields such as geophysics, fluid dynamics, financial analysis, and electromagnetics and include multiple Fortune 50 companies among their clients. Enthought has a strong commitment to open source software and contributes heavily to its development.
About Eric Jones
Eric Jones has a broad background in engineering and software development and leads Enthought's product engineering and software design. Prior to co-founding Enthought, Eric worked in the areas of numerical electromagnetics and genetic optimization at the Department of Electrical Engineering at Duke University. He has taught numerous courses on Python and how to use it for scientific computing. He also serves as a member of the Python Software Foundation. Eric holds M.S. and Ph.D. degrees from Duke University in Electrical Engineering and a B.S.E. in Mechanical Engineering from Baylor University.
11:00 – 11:30. Introduction to a new hyper-parallel framework for rapidly developing high-performance data-intensive applications on emerging 4/8/16-way commodity SMP multicore systems – Jim Falgout and Matt Walker (Pervasive Datarush)
Every organization is wrestling with exploding data volumes, and the performance and runtime pain associated with those ever-increasing volumes. While parallelism is the obvious answer, the hardware platforms to achieve this have historically been bottlenecked, brittle and very expensive. Luckily, there is good news on the horizon - the emerging multicore revolution will bring a new generation of unbelievably cheap and powerful systems - bringing significant hardware parallelism within reach of every person or department. Unfortunately, the vast majority of software applications are NOT built to exploit this parallelism. Come hear about DataRush - a completely new (free academic licensing!) 100 systems, hiding the complexity of concurrent programming, and allowing developers to quickly and easily build hyper-parallel data-intensive applications that automatically scale as you add more cores.
11:30 – 12:00. What Is NetBeans? – Gregg Sporar (Sun Microsystems)
Gregg Sporar has been a software developer for over twenty years, working on projects ranging from control software for a burglar alarm to 3D graphical user interfaces. His interests include user interfaces, development tools, and performance profiling. He works for Sun Microsystems as a Technical Evangelist on the NetBeans project.
12:00 – 1:00. lunch
1:00 – 1:30. Computational Chemistry Software at TACC and The GridChem Computational Chemistry Grid – Chona Guiang
Computational chemistry software is used across the many areas of research that rely on an understanding of molecular structure, properties and function. TACC's HPC resources include chemistry software that performs different types of molecular modeling and analysis. In this talk, I will give a brief overview of the quantum mechanical software GAMESS and NWChem, as well as the molecular dynamics applications Amber, NAMD and Gromacs. If there is sufficient time and interest, I will also give a live demonstration of GridChem, a desktop-based application for running quantum chemistry applications on a computational grid.
1:30 – 2:00. The Madagascar software package and the technology of reproducible computational experiments – Sergey Fomel
“Madagascar” is an open-source software project started at UT Austin and publicly released in June 2006. The goal of the project is to provide a convenient computational environment for researchers working with data processing algorithms applied to large datasets. Although the primary usage of the package is in the geophysical community, it contains a number of features, such as a universal data format and universal modules for multidimensional data manipulation, that may make it useful for scientists in other disciplines. The most distinguishing feature is a system for documenting and reproducing computational experiments. The reproducibility feature is implemented with the help of SCons, a Python-based software construction utility. Reproducible experiments serve both as regression tests and as computational recipes exchanged by the users.
2:00 – 2:30. BLAS, LAPACK, and beyond – Robert van de Geijn
We give an overview of recently developed Open Source libraries for dense linear algebra operations, including the widely used GotoBLAS and the recently released libFLAME, both products of UT-Austin research projects. Performance comparisons with the popular LAPACK library from netlib and with vendor scientific libraries show these efforts to be highly competitive, particularly on SMP and multi-core architectures.
The discussed libraries are available from http://www.tacc.utexas.edu/recourses/software/
2:30 – 2:45. coffee break
2:45 – 3:15. MyCluster, Building personal clusters on demand – Edward Walker
This talk will describe MyCluster, a system for building personal clusters on demand. The system is currently a production service on the NSF TeraGrid, as well as on the TACC terascale cluster, Lonestar, which is available to the wider University of Texas scientific computing research community. The current production version of MyCluster builds personal Condor clusters for users by deploying semi-autonomous agents at a host cluster site. These agents are responsible for submitting and managing job proxies through the local scheduler at a site. Job proxies contribute back to the personal cluster when they eventually run, allowing users to submit jobs into a personal cluster that expands and shrinks over time. Running jobs in the job proxies also allows us to provide a system-call virtualized environment for executing jobs, where additional cluster-wide virtual services are provided. These services include a virtual private network and a wide-area distributed filesystem. This talk will focus on the long-term vision of the MyCluster system, and discuss our efforts in iterating towards this vision.
Dr. Edward Walker is currently a research associate and manager of the distributed systems software group at the Texas Advanced Computing Center. Prior to this, he was a senior member of technical staff at Platform Computing Corp, the developers of the LSF (Load Sharing Facility) system. Before that, Dr. Walker was a Research Scientist at the National Supercomputing Research Center, which was then part of the National University of Singapore. Dr. Walker received his PhD from the University of York (UK) in 1996 in the area of parallelizing compiler technologies.
3:15 – 3:45. Eclipse: an Ideal Development Environment – Kent Milfeld
Eclipse is an integrated development environment (IDE) that has enjoyed much success in the Java community for code development. It is portable, extensible, and open-source, and provides a high quality graphical user interface across different platforms. Its initial framework was also designed to be a platform for developing tools that could be easily incorporated into the system as plugins. Eclipse's popularity has grown, as have its community contributions. Eclipse now supports Java, C, C++, Fortran, and other languages. Its tools include access to source control systems, project management, code and class navigators, syntax-sensitive editors, debugger support, and many other feature-rich components. The latest framework provides a Rich Client Platform (RCP) that is even more general. It allows any application to use the framework, interfaces, and support that have been available to the IDE, so that developers can build interactive applications with the rich Eclipse feature set, including the plugin extensions. Some grid architects are even using RCP to build clients for submitting and managing jobs at HPC centers, with plugins for application monitoring as well as run-time and post-processing visualization. Recently, the Parallel Tools Platform (PTP) plugins have integrated parallel application support into Eclipse as an official Eclipse Foundation Technology Project. The layout, operation and utility of Eclipse as an IDE will be presented. Also, an overview of plugins will be given and the capabilities of the Rich Client Platform will be explored.