2455 Hilgard Avenue #23, Berkeley, jh@jhauser.us
|
|
I am interested in difficult development work for low-level computer functions
such as operating system kernels, device drivers, graphics rendering, computer
arithmetic, and compiler back-ends.
|
| 1988 - 2000 |
University of California: Ph.D. in Computer Science (specializing in computer architecture, with a minor in mathematics). M.S. in Computer Science. |
|
| 1987 - 1988 |
University of Colorado: One year of graduate study. |
|
| 1982 - 1987 |
North Carolina State University: B.S. in Computer Engineering (mostly electrical engineering). B.S. in Computer Science. |
|
| 2003 - present |
3Plus1 Technology: Design and implementation of most of 3Plus1's CoolEngine, plus various programming tools. |
| 1999 - 2004 |
Berkeley Design Technology, Inc. (BDTI): Senior DSP Engineer (part-time). |
| 1989 - 2000 |
University of California: Various Graduate Student Instructor and Graduate Student Researcher positions. |
| 1994 - 1997 |
International Computer Science Institute (ICSI): Software programming for ICSI's |
| summer 1992 |
Silicon Graphics: Implementation of quadruple-precision floating-point in software. |
| Computer architecture, reconfigurable computing | |
|---|---|
| 2003 - 2009 | Defined and implemented nearly all of 3Plus1's CoolEngine, a multiprocessor subsystem designed primarily for streaming tasks such as video encoding/decoding. Each CoolEngine processor can execute multiple operations (scalar and SIMD) per clock cycle from compressed VLIW machine code. A CoolEngine combines several such processors with an intelligent multichannel DMA for I/O and DRAM access. Was responsible for every aspect of the CoolEngine processors, including the programming architecture, instruction caches, instruction fetch, decode, and dispatch units, datapaths, and all functional units; plus the DMA unit and other CoolEngine components. Implementation was in Verilog and a proprietary language (machine-translated to Verilog). Physical area and timing were optimized using feedback from Synopsys standard-cell synthesis. |
| 2002 | At BDTI, advised a major processor design company on specific SIMD-style extensions that could be made to an existing architecture to improve its performance for fixed-point digital signal processing. (Examples of SIMD features of other processors include MMX/SSE on the Pentium and the AltiVec extensions of the PowerPC architecture.) |
| 1994 - 2000 | For my doctoral dissertation, examined the use of an FPGA-like device as an additional microprocessor functional unit. (FPGAs are field-programmable gate arrays.) Defined a novel processor architecture named Garp, and constructed programming tools and a cycle-accurate simulator. Studied implementation feasibility through SPICE circuit simulations and partial VLSI layout. Rewrote a set of real applications to use the new functional unit and ran them on the simulator; their performance compared favorably against a commercial superscalar processor. My research advisor was John Wawrzynek. |
| Programming languages, compilers, other software development tools | |
| 2006 - 2008 | Defined and implemented libraries and related software tools for interprocessor communication within 3Plus1's CoolEngine multiprocessor subsystem. |
| 2004 - 2008 | Provided a practical alternative platform for testing and debugging software for 3Plus1's CoolEngine by creating tools and libraries to allow CoolEngine source code (in C and assembly language) to be compiled and linked to run efficiently on an ordinary desktop computer. In this foreign environment, the CoolEngine's multiple processors are simulated by transparently spawning multiple processes, and CoolEngine assembly code is handled by transparently invoking a CoolEngine processor simulator whenever an assembly-language function is called. |
| 2004 - 2005 | Invented the complex assembly language of 3Plus1's CoolEngine processors, and created the first assembler. |
| 2000 - 2003 |
Contributed in major part to the ISO
Technical Report on extensions to |
| 1996 - 1997 |
Built a |
| 1995 |
Implemented a basic-block instruction scheduler within the GNU assembler
(gas) for ICSI's
gas, even
for artificially generated basic blocks containing many hundreds of
instructions.
|
| 1991 - 1994 |
For my Master's degree, examined the need for exception-handling features in
programming languages, and critiqued the main kinds of exception mechanisms
that have been implemented or proposed over the years.
Special attention was given to the efficient handling of arithmetic
exceptions on high-speed processors.
My Master's work was supervised by Profs.
Sue Graham and
|
| Computer arithmetic | |
| 2001 |
Created a C++ library for BDTI that fully
implements parameterized fixed-point types.
Fixed-point formats are specified using C++ template parameters, and the
standard arithmetic operators, +, -, *,
etc., are overloaded to permit arbitrary fixed-point operands in addition to
the usual integers and floating-point.
Precise control over rounding and overflow allows bit-identical mimicking of
most fixed-point hardware.
|
| 1996 |
Made the first release of the
SoftFloat and
TestFloat
software packages available on the Web.
Both grew out of work originally done for ICSI (see below).
At the time of its release, TestFloat found small flaws in the floating-point
arithmetic of several commercial processors, including a flaw in the Intel Pentium Pro
that was rediscovered the next year and dubbed
|
| 1994 - 1995 |
Implemented floating-point and other arithmetic functions for ICSI's
|
| 1992 | At Silicon Graphics, coded IEEE-compliant quadruple-precision floating-point in MIPS assembly language. |
| 1992 | Discovered an oversight in Digital Equipment Corporation's Alpha architecture concerning floating-point subnormal numbers. The discovery resulted in a last-minute fix by Digital before the first Alpha machines were shipped. |
| Digital signal processing (DSP) | |
| 2003 - 2009 | Optimized the processors of 3Plus1's CoolEngine for many common DSP functions (such as FFT). |
| 1999 - 2009 | For various BDTI customers, defined and coded numerous DSP functions, and also improved the performance of several DSP applications through profiling and the recoding of critical functions, usually in hand-optimized assembly language. Speed improvements in some cases were as much as a factor of ten. For 3Plus1, performed the same services to demonstrate the CoolEngine's capabilities. |
| 1999 - 2003 | At BDTI, helped evaluate the DSP performance of a number of processors, either with respect to specific customer needs or in accordance with BDTI's proprietary benchmarking methodology. Participated in ongoing efforts within BDTI to refine and extend the company's benchmarking methods. |
| 2001 | Together with a colleague at BDTI, converted a customer-supplied software MP3 decoder entirely from floating-point to fixed-point in order to port the decoder to processors supporting only integer operations in hardware. |
| Other areas | |
| 1996 |
Created a Solaris device driver to interface with ICSI's
|
| 1989 | Helped crack the encoding of Adobe Type 1 fonts before Adobe publicly documented the format. My contributions included expanding the set of known byte-code operators and deducing much of the font hinting mechanism. |
| I have also done some unpublished research on geometric interpolation (splines), on image scaling (changing size and/or resolution), and on dithering color images to a limited palette. | |
| Operating systems: |
Microsoft Windows, Linux, Solaris, |
|
| Programming APIs/libraries: |
Win32, POSIX.1, Single UNIX Specification, Linux, terminfo database,
TWAIN (interface to scanners/cameras on PCs), |
|
| Programming languages: | C, |
|
| Hardware description languages: | Verilog. | |
| Processor architectures and assembly languages: |
Intel 80x86/IA32 (with MMX), MIPS, Motorola 68000, |
|
| Compiler development tools: |
lex/flex, yacc/bison.
|
|
| Document definition languages: | LaTeX, TeX, |
|
``The SFRA: A Corner-Turn FPGA Architecture.''
Nicholas Weaver, John Hauser, and John Wawrzynek.
In
Proceedings of the 2004 ACM/SIGDA 12th International Symposium on
Field-Programmable Gate Arrays
( |
|
``The Garp Architecture and |
|
``A Fixed-Point Recursive Digital Oscillator for Additive Synthesis.''
Todd Hodes, John Hauser, John Wawrzynek, Adrian Freed, and David Wessel.
In
Proceedings of the IEEE International Conference on Acoustics, Speech,
and Signal Processing
( |
|
``Garp: A MIPS Processor with a Reconfigurable Coprocessor.''
|
|
``Handling Floating-Point Exceptions in Numeric Programs.''
|
| 2003 - 2004 |
Participant in the
working group
revising the IEEE standard for floating-point arithmetic
( |
| 2000 - 2002 |
BDTI's representative in
INCITS Technical Committee J11,
the U.S.'s technical advisory group within
ISO/IEC JTC1/SC22/WG14,
which is the ISO working group responsible
for |
| 1994 - 1998 |
Caretaker for the informal library of |
| 1992 - 1997 |
Computer Science Division delegate to |
| 1988 - 1991 | National Science Foundation Graduate Fellowship. |