John Hauser  
2455 Hilgard Avenue #23, Berkeley, CA 94709-1234, USA  / (510) 843-6909
jh@jhauser.us

Objective

  I am interested in difficult development work for low-level computer functions such as operating system kernels, device drivers, graphics rendering, computer arithmetic, and compiler back-ends. I prefer to remain near Berkeley but would consider moving to parts of the U.S. East Coast or Western Europe.

Education

1988 - 2000 University of California / Berkeley, California
Ph.D. in Computer Science (specializing in computer architecture, with a minor in mathematics).
M.S. in Computer Science.
1987 - 1988 University of Colorado / Boulder, Colorado
One year of graduate study.
1982 - 1987 North Carolina State University / Raleigh, North Carolina
B.S. in Computer Engineering (mostly electrical engineering).
B.S. in Computer Science.

Employment

2003 - present 3Plus1 Technology / Saratoga, California
Design and implementation of most of 3Plus1's CoolEngine, plus various programming tools.
1999 - 2004 Berkeley Design Technology, Inc. (BDTI) / Berkeley, California
Senior DSP Engineer (part-time).
1989 - 2000 University of California / Berkeley, California
Various Graduate Student Instructor and Graduate Student Researcher positions.
1994 - 1997 International Computer Science Institute (ICSI) / Berkeley, California
Software programming for ICSI's T0 vector microprocessor.
summer 1992 Silicon Graphics / Mountain View, California
Implementation of quadruple-precision floating-point in software.

Experience

Computer architecture, reconfigurable computing
2003 - 2009 Defined and implemented nearly all of 3Plus1's CoolEngine, a multiprocessor subsystem designed primarily for streaming tasks such as video encoding/decoding. Each CoolEngine processor can execute multiple operations (scalar and SIMD) per clock cycle from a compressed VLIW machine code. A CoolEngine combines several such processors with an intelligent multichannel DMA for I/O and DRAM access. Was responsible for every aspect of the CoolEngine processors, including the programming architecture, instruction caches, instruction fetch, decode, and dispatch units, datapaths, and all functional units; plus the DMA unit and other CoolEngine components. Implementation was in Verilog and a proprietary language (machine-translated to Verilog). Physical area and timing were optimized using feedback from Synopsys standard-cell synthesis.
2002 At BDTI, advised a major processor design company on specific SIMD-style extensions that could be made to an existing architecture to improve its performance for fixed-point digital signal processing. (Examples of SIMD features of other processors include MMX/SSE on the Pentium and the AltiVec extensions of the PowerPC architecture.)
1994 - 2000 For my doctoral dissertation, examined the use of an FPGA-like device as an additional microprocessor functional unit. (FPGAs are field-programmable gate arrays.) Defined a novel processor architecture named Garp, and constructed programming tools and a cycle-accurate simulator. Studied implementation feasibility through SPICE circuit simulations and partial VLSI layout. Rewrote a set of real applications to use the new functional unit and executed them with the simulator; performance compared favorably against a commercial superscalar processor. My research advisor was John Wawrzynek.
Programming languages, compilers, other software development tools
2006 - 2008 Defined and implemented libraries and related software tools for interprocessor communication within 3Plus1's CoolEngine multiprocessor subsystem.
2004 - 2008 Provided a practical alternative platform for testing and debugging software for 3Plus1's CoolEngine by creating tools and libraries to allow CoolEngine source code (in C and assembly language) to be compiled and linked to run efficiently on an ordinary desktop computer. In this foreign environment, the CoolEngine's multiple processors are simulated by transparently spawning multiple processes, and CoolEngine assembly code is handled by transparently invoking a CoolEngine processor simulator whenever an assembly-language function is called.
2004 - 2005 Invented the complex assembly language of 3Plus1's CoolEngine processors, and created the first assembler.
2000 - 2003 Contributed in major part to the ISO Technical Report on extensions to Standard C for embedded microcomputers, TR 18037. Was the author of clause 5 (named address spaces and registers) and much of clauses 4 and 6 (fixed-point arithmetic, I/O addressing).
1996 - 1997 Built a C preprocessor as part of a project to construct a complete ISO-Standard C compiler.
1995 Implemented a basic-block instruction scheduler within the GNU assembler (gas) for ICSI's T0 vector microprocessor. The T0 processor is unusually difficult to schedule for because its vector instructions can hold structural hazards for dozens of clock cycles. Several heuristics were combined to cover different situations. The scheduler added almost no noticeable execution time to gas, even for artificially generated basic blocks containing many hundreds of instructions.
1991 - 1994 For my Master's degree, examined the need for exception-handling features in programming languages, and critiqued the main kinds of exception mechanisms that have been implemented or proposed over the years. Special attention was given to the efficient handling of arithmetic exceptions on high-speed processors. My Master's work was supervised by Profs. Sue Graham and W. Kahan.
Computer arithmetic
2001 Created a C++ library for BDTI that fully implements parameterized fixed-point types. Fixed-point formats are specified using C++ template parameters, and the standard arithmetic operators, +, -, *, etc., are overloaded to permit arbitrary fixed-point operands in addition to the usual integers and floating-point. Precise control over rounding and overflow allows bit-identical mimicking of most fixed-point hardware.
1996 Made the first release of the SoftFloat and TestFloat software packages available on the Web. Both grew out of work originally done for ICSI (see below). At the time of its release, TestFloat found small flaws in the floating-point of several commercial processors, including a flaw in the Intel Pentium Pro that was rediscovered the next year and dubbed Dan-0411 by Robert Collins of the Intel Secrets Home Page.
1994 - 1995 Implemented floating-point and other arithmetic functions for ICSI's T0 vector microprocessor. Functions coded include single- and double-precision IEEE floating-point emulation written in C, and a vector version of single-precision IEEE floating-point written in T0 assembly language.
1992 At Silicon Graphics, coded IEEE-compliant quadruple-precision floating-point in MIPS assembly language.
1992 Discovered an oversight in Digital Equipment Corporation's Alpha architecture concerning floating-point subnormal numbers. The discovery resulted in a last-minute fix by Digital before the first Alpha machines were shipped.
Digital signal processing (DSP)
2003 - 2009 Optimized the processors of 3Plus1's CoolEngine for many common DSP functions (such as FFT).
1999 - 2009 For various BDTI customers, defined and coded numerous DSP functions, and also improved the performance of several DSP applications through profiling and the recoding of critical functions, usually in hand-optimized assembly language. Speed improvements in some cases were as much as a factor of ten. For 3Plus1, performed the same services to demonstrate the CoolEngine's capabilities.
1999 - 2003 At BDTI, helped evaluate the DSP performance of a number of processors, either with respect to specific customer needs or in accordance with BDTI's proprietary benchmarking methodology. Participated in ongoing efforts within BDTI to refine and extend the company's benchmarking methods.
2001 Together with a colleague at BDTI, converted a customer-supplied software MP3 decoder entirely from floating-point to fixed-point in order to port the decoder to processors supporting only integer operations in hardware.
Other areas
1996 Created a Solaris device driver to interface with ICSI's SPERT-II board, an SBus daughter card containing a T0 processor.
1989 Helped crack the encoding of Adobe Type-1 fonts, before the format was publicly documented by Adobe. My contributions included expanding the set of known byte-code operators and deducing much of the font hinting mechanism.
I have also done some unpublished research on geometric interpolation (splines), on image scaling (changing size and/or resolution), and on dithering color images to a limited palette.

Computer Skills

Operating systems: Microsoft Windows, Linux, Solaris, MS-DOS, some Mac OS, Mac OS X.
Programming APIs/libraries: Win32, POSIX.1, Single UNIX Specification, Linux, System V, Solaris DDI/DDK (for device drivers), UNIX terminfo database, TWAIN (interface to scanners/cameras on PCs), MS-DOS.
Programming languages: C, some C++, Common LISP, miscellaneous others.
Hardware description languages: Verilog.
Processor architectures and assembly languages: Intel 80x86/IA32 (with MMX), MIPS, Motorola 68000, some SPARC, PowerPC (with Altivec extensions), several DSPs, others.
Compiler development tools: lex/flex, yacc/bison.
Document definition languages: LaTeX, TeX, some HTML, PostScript.

Publications

``The SFRA: A Corner-Turn FPGA Architecture.'' Nicholas Weaver, John Hauser, and John Wawrzynek. In Proceedings of the 2004 ACM/SIGDA 12th International Symposium on Field-Programmable Gate Arrays (FPGA '04, February 2004).
``The Garp Architecture and C Compiler.'' Timothy J. Callahan, John R. Hauser, and John Wawrzynek. Computer 33:4 (April 2000).
``A Fixed-Point Recursive Digital Oscillator for Additive Synthesis.'' Todd Hodes, John Hauser, John Wawrzynek, Adrian Freed, and David Wessel. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99, March 1999).
``Garp: A MIPS Processor with a Reconfigurable Coprocessor.'' John R. Hauser and John Wawrzynek. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '97, April 1997).
``Handling Floating-Point Exceptions in Numeric Programs.'' John R. Hauser. ACM Transactions on Programming Languages and Systems 18:2 (March 1996).

Activities, Awards

2003 - 2004 Participant in the working group revising the IEEE standard for floating-point arithmetic (IEEE 754).
2000 - 2002 BDTI's representative in INCITS Technical Committee J11, the U.S.'s technical advisory group within ISO/IEC JTC1/SC22/WG14, which is the ISO working group responsible for Standard C.
1994 - 1998 Caretaker for the informal library of U.C. Berkeley's Computer Science Division.
1992 - 1997 Computer Science Division delegate to U.C. Berkeley's Graduate Assembly.
1988 - 1991 National Science Foundation Graduate Fellowship.

John Hauser, 2009 September 23