Scientific Software Days 2007

The Texas Advanced Computing Center, the University of Texas Institute for Geophysics, and the University of Texas Bureau of Economic Geology are organizing the 2007 Scientific Software Days, where the scientific community can share experiences of developing software and learn about new developments in general scientific software.

Schedule:

9:00 – 9:15. The computing resources at TACC

Karl Schultz

In this presentation, we will briefly introduce the computing resources available at TACC for computational researchers. These resources include a large distributed-memory Linux cluster (Lonestar), an IBM Power5 system (Champion) that supports smaller shared-memory programs, and a Sun E25K system (Maverick) for large-scale scientific visualization. Highlights of these systems, along with selected performance indicators, will be presented in conjunction with the baseline scientific applications that are available.

TACC: http://www.tacc.utexas.edu/

9:15 – 9:45. The PETSc and SLEPc linear algebra libraries

Victor Eijkhout

PETSc, the Portable, Extensible Toolkit for Scientific Computation, is a large library of parallel linear algebra operations. This includes high-level linear and nonlinear system solvers, but also low-level operations that allow a user to code new algorithms. There is a profiling interface, error checking and support for parallel debugging, and a large number of examples. While PETSc focuses mostly on iterative linear solvers, various direct solver packages can easily be interfaced. SLEPc, the Scalable Library for Eigenvalue Problem Computations, is a library written on top of PETSc. It offers the same simple interface to a large number of eigenvalue methods, including those of external packages that can be interfaced to SLEPc. In this presentation I will give a brief overview of the capabilities of these two packages, and give the flavour of how they are used in application codes.
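To give a feel for the kind of iterative linear solver mentioned above, here is a minimal sketch of the conjugate gradient method in plain Python. It is not PETSc's API and is serial rather than parallel; the function names `conjugate_gradient` and `laplacian_1d` are hypothetical, chosen only for this illustration. As in many iterative solver libraries, the solver needs only a routine that applies the matrix to a vector.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A, given matvec(v) = A v."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small enough
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d(v):
    """Apply the tridiagonal matrix with 2 on the diagonal and -1 beside it."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


b = [1.0] * 5
x = conjugate_gradient(laplacian_1d, b)
residual = max(abs(bi - axi) for bi, axi in zip(b, laplacian_1d(x)))
```

The matrix is never stored explicitly here; only its action on a vector is supplied, which is one reason iterative methods suit large sparse problems.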

PETSc: http://www-unix.mcs.anl.gov/petsc/

SLEPc: http://www.grycap.upv.es/slepc/

9:45 – 10:15. LabVIEW as a platform for scientific computing

Jim Nagle (National Instruments)

LabVIEW is well known for data acquisition and instrument control, but has been less commonly applied to scientific computing problems. LabVIEW includes libraries that cover many different areas of signal processing and mathematics, and supports several different “models” of computation, including MathScript, simulation, state charts, and object-oriented programming in addition to LabVIEW’s dataflow language. This talk will cover LabVIEW’s library support and demonstrate the use of MathScript and dataflow together to solve problems involving nonlinear regression and differential equations.

LabVIEW: http://www.ni.com/labview/

10:15 – 10:30. coffee break

10:30 – 11:00. Scientific Computing with Python

Eric Jones (Enthought)

Python has emerged as an excellent choice for scientific computing because of its simple syntax, ease of use, and elegant multi-dimensional array arithmetic. Its interpreted evaluation allows it to serve as both the development language and the command line environment in which to explore data. Python also excels as a “glue” language that bonds algorithms from multiple languages such as C, C++, and Fortran into a single tool – a common need in the scientific arena. In this talk, we will look at a subset of the Open Source Python tools that are commonly used in scientific computing. These include:

- NumPy: multi-dimensional array support, including powerful indexing and vector math capabilities.
- SciPy: algorithms for linear algebra, signal processing, optimization, integration, etc.
- Chaco: interactive 2D visualization for plotting scientific data.
- Mayavi: interactive 3D visualization based on VTK.
- Envisage: plug-in based framework for building scriptable and extensible applications.

Together, this tool suite provides a platform for building applications that are simple for the novice to use, yet flexible enough to serve an expert’s needs. Enthought has used them to build commercial tools in the fields of Electromagnetics, Fluid Dynamics, Geophysics, and Quantitative Financial Analysis. We’ll demonstrate several of these applications as examples of finished products.
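The NumPy indexing and vector math capabilities mentioned above can be sketched in a few lines. This is only an illustration, assuming NumPy is installed; the array contents and variable names are hypothetical and are not taken from the talk.

```python
import numpy as np

# Hypothetical data: a 3 x 4 grid of temperature samples in Celsius.
grid = np.arange(12, dtype=float).reshape(3, 4)

second_row = grid[1, :]       # slicing: the second row of the grid
corner = grid[:2, :2]         # slicing: the top-left 2 x 2 block
large = grid[grid > 8.0]      # boolean-mask indexing: values above 8

# Vector math: the whole array is converted without an explicit loop.
fahrenheit = grid * 9.0 / 5.0 + 32.0

# Reduction along an axis: the mean of each column.
column_means = grid.mean(axis=0)
```

Each expression operates on the whole array at once, which is what makes array code in Python both short and fast compared with element-by-element loops.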

About Enthought

Founded in 2001, Enthought, Inc. is a scientific computing company based in Austin, Texas. They supply solutions in a wide range of fields such as geophysics, fluid dynamics, financial analysis, and electromagnetics and include multiple Fortune 50 companies among their clients. Enthought has a strong commitment to open source software and contributes heavily to its development.

About Eric Jones

Eric Jones has a broad background in engineering and software development and leads Enthought’s product engineering and software design. Prior to co-founding Enthought, Eric worked in the areas of numerical electromagnetics and genetic optimization at the Department of Electrical Engineering at Duke University. He has taught numerous courses on Python and how to use it for scientific computing. He also serves as a member of the Python Software Foundation. Eric holds M.S. and Ph.D. degrees from Duke University in Electrical Engineering and a B.S.E. in Mechanical Engineering from Baylor University.

Enthought: http://www.enthought.com/

SciPy: http://www.scipy.org/

11:00 – 11:30. Introduction to a new hyper-parallel framework for rapidly developing high-performance data-intensive applications on emerging 4/8/16-way commodity SMP multicore systems

Jim Falgout and Matt Walker (Pervasive Datarush)

Every organization is wrestling with exploding data volumes, and with the performance and runtime pain associated with those ever-increasing volumes. While parallelism is the obvious answer, the hardware platforms to achieve this have historically been bottlenecked, brittle and very expensive. Luckily, there is good news on the horizon - the emerging multicore revolution will bring a new generation of unbelievably cheap and powerful systems - bringing significant hardware parallelism within reach of every person or department. Unfortunately, the vast majority of software applications are NOT built to exploit this parallelism. Come hear about DataRush - a completely new (free academic licensing!) 100 systems, hiding the complexity of concurrent programming, and allowing developers to quickly and easily build hyper-parallel data-intensive applications that automatically scale as you add more cores.

DataRush: http://www.pervasivedatarush.com/

11:30 – 12:00. What Is NetBeans?

Gregg Sporar (Sun Microsystems)

NetBeans is three things: an award-winning Integrated Development Environment (IDE), a platform for building rich-client applications, and an open-source community. This presentation provides an update on some interesting changes in all three of those areas over the last couple of years. The focus will be on demonstrations of the NetBeans IDE in order to show some of its features for doing Java development, in particular for building applications on top of the NetBeans Platform. Brief demos of the IDE’s support for other programming languages such as C/C++ will also be included. The NetBeans road map will be discussed, including the upcoming support for scripting languages such as JavaScript, Ruby, PHP, and Groovy.

Speaker Bio:

Gregg Sporar has been a software developer for over twenty years, working on projects ranging from control software for a burglar alarm to 3D graphical user interfaces. His interests include user interfaces, development tools, and performance profiling. He works for Sun Microsystems as a Technical Evangelist on the NetBeans project.

NetBeans: http://www.netbeans.org/

12:00 – 1:00. lunch

1:00 – 1:30. Computational Chemistry Software at TACC and The GridChem Computational Chemistry Grid

Chona Guiang

Computational chemistry software is used across the many areas of research that rely on an understanding of molecular structure, properties and function. TACC’s HPC resources include chemistry software that performs different types of molecular modeling and analysis. In this talk, I will give a brief overview of the quantum mechanical software GAMESS and NWChem, as well as the molecular dynamics applications Amber, NAMD and Gromacs. If there is sufficient time and interest, I will also give a live demonstration of GridChem, a desktop application for running quantum chemistry applications on a computational grid.

GridChem: https://www.gridchem.org/

1:30 – 2:00. The Madagascar software package and the technology of reproducible computational experiments

Sergey Fomel

“Madagascar” is an open-source software project started at UT Austin and publicly released in June 2006. The goal of the project is to provide a convenient computational environment for researchers working with data processing algorithms applied to large datasets. Although the package is used primarily in the geophysical community, it contains a number of features, such as a universal data format and universal modules for multidimensional data manipulation, that may make it useful for scientists in other disciplines. Its most distinguishing feature is a system for documenting and reproducing computational experiments. The reproducibility feature is implemented with the help of SCons, a Python-based software construction utility. Reproducible experiments serve both as regression tests and as computational recipes exchanged by the users.
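The idea behind build-tool-driven reproducibility can be sketched in plain Python: a processing step is rerun only when the content of its input changes, so an unchanged experiment reproduces its stored result. This is an illustration only, not Madagascar's or SCons's actual interface; the `Step` class and its attributes are hypothetical, and a SHA-256 content hash stands in for whatever signature a real build tool computes.

```python
import hashlib


class Step:
    """One step of a computational experiment: a function applied to input bytes."""

    def __init__(self, func):
        self.func = func
        self.signature = None   # content hash of the input seen on the last run
        self.result = None      # cached output of the last run
        self.runs = 0           # how many times the function was actually executed

    def build(self, data):
        """Return the step's output, recomputing only if the input content changed."""
        signature = hashlib.sha256(data).hexdigest()
        if signature != self.signature:
            self.result = self.func(data)
            self.signature = signature
            self.runs += 1
        return self.result


# Hypothetical processing step: uppercase the input "trace".
step = Step(lambda data: data.upper())

first = step.build(b"trace")      # executed: no previous signature
second = step.build(b"trace")     # skipped: input content is unchanged
third = step.build(b"trace-v2")   # executed again: input content changed
```

Because the decision to rerun depends on content rather than on who runs the recipe or when, the same recipe can double as a regression test: a changed output for an unchanged input signals a change in the code.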

Madagascar: http://rsf.sourceforge.net/

2:00 – 2:30. BLAS, LAPACK, and beyond

Robert van de Geijn

We give an overview of recently developed Open Source libraries for dense linear algebra operations, including the widely used GotoBLAS and the recently released libFLAME, both products of UT-Austin research. Performance comparisons with the popular LAPACK library from netlib and with vendor scientific libraries show these efforts to be highly competitive, particularly on SMP and multi-core architectures.

The discussed libraries are available from http://www.tacc.utexas.edu/recourses/software/

2:30 – 2:45. coffee break

2:45 – 3:15. MyCluster, Building personal clusters on demand

Edward Walker

This talk will describe MyCluster, a system for building personal clusters on demand. The system is currently a production service on the NSF TeraGrid, as well as on the TACC terascale cluster, Lonestar, which is available to the wider University of Texas scientific computing research community. The current production version of MyCluster builds personal Condor clusters for users by deploying semi-autonomous agents at a host cluster site. These semi-autonomous agents are responsible for submitting and managing job proxies through the local scheduler at a site. Job proxies contribute back to the personal cluster when they eventually run, allowing users to submit jobs into a personal cluster that expands and shrinks over time. Running jobs in the job proxies also allows us to provide a system-call-virtualized environment for executing jobs, where additional cluster-wide virtual services are provided. These additional cluster-wide services include a virtual private network and a wide-area distributed filesystem. This talk will focus on the long-term vision of the MyCluster system, and discuss our efforts in iterating towards this vision.

Bio:

Dr. Edward Walker is currently a research associate and manager of the distributed systems software group at the Texas Advanced Computing Center. Prior to this, he was a senior member of technical staff at Platform Computing Corp, the developers of the LSF (Load Sharing Facility) system. Before that, Dr. Walker was a Research Scientist at the National Supercomputing Research Center, which was then part of the National University of Singapore. Dr. Walker received his PhD from the University of York (UK) in 1996 in the area of parallelizing compiler technologies.

MyCluster: http://www.teragrid.org/userinfo/jobs/gridshell.php

3:15 – 3:45. Eclipse: an Ideal Development Environment

Kent Milfeld

Eclipse is an integrated development environment (IDE) that has enjoyed much success in the Java community for code development. It is portable, extensible, and open-source, and provides a high quality graphical user interface across different platforms. Its initial framework was also designed to be a platform for developing tools that could be easily incorporated into the system as plugins. Eclipse’s popularity has grown, as have its community contributions. Eclipse now supports Java, C, C++, Fortran, and other languages. Its tools include access to source control systems, project management, code and class navigators, syntax-sensitive editors, debugger support, as well as many other feature-rich components. The latest framework provides a Rich Client Platform (RCP) that is even more general. It allows any application to use the framework, interfaces, and support that have been available for the IDE to build interactive applications with the rich Eclipse feature sets, including the plugin extensions. Some grid architects are even using RCP to build clients for submitting and managing jobs at HPC centers, including plugins for application monitoring as well as run-time and post-processing visualization. Recently, the Parallel Tools Platform (PTP) plugins have integrated parallel application support into Eclipse as an official Eclipse Foundation Technology Project. The layout, operation and utility of Eclipse as an IDE will be presented. Also, an overview of plugins will be given and the capabilities of the Rich Client Platform will be explored.

Eclipse: http://www.eclipse.org/