The ability to…
learn abstract concepts quickly without being formally taught (autodidacticism).
construct mathematical models and other logical abstractions.
work independently without supervision or external motivators.
clearly communicate abstract concepts and nuanced technical material to a diverse audience.
prioritize and schedule tasks to optimize productivity.
see the big picture, rather than just the details of a system (i.e., the forest and the trees).
quickly identify the root causes of technical problems.
quickly learn programming languages, APIs, frameworks, and tools.
Examples: I taught myself tensor calculus, complex analysis, galaxy dynamics, and galaxy simulation techniques, as well as every programming language, API, and software application that I know.
A large part of my formal education and research has involved the construction of mathematical models. By “other logical abstractions,” I am referring to things like physical analogs and object-oriented and functional programming abstractions.
Examples: (1) As a graduate student, I was the only person on my campus working in the field of galaxy simulation; I designed my dissertation project and performed the research independently with minimal (primarily bureaucracy-related) guidance from my advisor. (2) I have been able to solve every programming issue that I’ve ever encountered, by reading source code and documentation, performing experiments (debugging), examining logs, and searching the Web.
I have experience teaching and tutoring high school students, liberal arts majors, science majors, a computer engineering major, physics graduate students, and elementary school teachers. Additionally, I’ve written proposals targeting a variety of audiences.
Knowing which tasks to prioritize and how to schedule work efficiently has benefited me when developing software, teaching, leading the systems engineering team, and getting work done in general. Some tasks obviously must be performed before others; others can be performed in different orders, and some orderings are more efficient than others even when there are no direct resource conflicts (redundant operations can sometimes be eliminated, for example). Some tasks can be performed concurrently to improve productivity, but not every task that can be done concurrently should be. The order in which things are done can also keep members of a group from becoming confused, and avoiding confusion and improving understanding within a group is extremely beneficial.
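The dependency-ordering part of this can be expressed in code. Here is a minimal sketch (with hypothetical task names) using Python's standard-library graphlib to order tasks by prerequisite and to identify groups that could run concurrently:

```python
# Minimal sketch (hypothetical tasks): ordering dependent tasks with Python's
# standard-library graphlib, grouping tasks that could run concurrently.
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
dependencies = {
    "deploy": {"test"},
    "test": {"build"},
    "build": {"fetch_sources", "fetch_config"},
    "fetch_sources": set(),
    "fetch_config": set(),
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()
while sorter.is_active():
    ready = sorter.get_ready()   # tasks with no unfinished prerequisites
    print("can run concurrently:", ready)
    sorter.done(*ready)          # mark them finished
```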
My natural tendency is to think about systems as a whole. Over time, though, I have learned to enjoy thinking about smaller details and the interactions among the details which lead to complexities. When developing software from the ground up or analyzing data, one needs to be able to understand fine details while keeping things in perspective.
I am able to quickly diagnose technical problems because (1) I pay attention to small details, (2) I have a rather broad knowledge base, and (3) I have experience diagnosing problems; new problems are usually just variations of problems that I have already encountered.
I can typically learn the basics of a programming language, API, or tool and begin using the new technology to write simple programs in less than a day. Depending on the complexity, fluency requires days or weeks of full-time usage and experimentation.
Skills
OOP
Algorithm Design
Numerical Simulation
Multithreading
Optimization
Research
Numerical Analysis
Data Analysis
Data Visualization
Image Processing
3D Mathematics
Physics
Astronomy
Statistics
Teaching
I have been writing and using object-oriented code since I began learning C++ (circa 2004), but I only began taking OOP design principles seriously in 2010. Much of my Python code is also object oriented, to the extent that Python supports object orientation (nothing is truly private in Python classes, for example).
Designing and analyzing algorithms is fun. I was formally introduced to the topic in my numerical algorithms and computational physics courses as an undergraduate. I learned more by reading computer science textbooks and articles, working on independent projects, participating in graduate courses, and working on research projects.
My Ph.D. and master’s degree research both involved modeling and simulation. In addition to studying simulation methods and implementing those methods in code, I taught a graduate seminar in computational techniques, including simulation techniques. See the projects page for some examples.
I am familiar with both the simple form of multithreading made possible by OpenMP and the more flexible form, which involves mutexes, condition variables, futures, and promises.
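The same primitives exist in Python's standard library; here is a minimal, illustrative sketch of a mutex-protected counter and result-carrying futures:

```python
# Minimal sketch: the same concepts (mutex, future) expressed with Python's
# threading and concurrent.futures modules rather than C++/OpenMP.
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0
lock = threading.Lock()          # mutex protecting the shared counter

def increment(n):
    global counter
    for _ in range(n):
        with lock:               # acquire/release the mutex around the update
            counter += 1
    return n

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(increment, 10_000) for _ in range(4)]
    total_submitted = sum(f.result() for f in futures)  # futures deliver results

print(counter, total_submitted)  # 40000 40000
```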
Software Optimization: In addition to multithreading and multiprocessing, I am familiar with cache optimization, profiling, and the explicit use of SIMD instructions on x86-64 CPUs (via compiler intrinsics). I have experience simplifying and re-designing algorithms, introducing more efficient data structures, and using approximations when they suffice. Computational / Mathematical Optimization: I am familiar with several methods of minimizing and maximizing an objective function, given constraints.
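As a small illustration of the computational / mathematical side, here is a hedged sketch of constrained minimization with SciPy; the objective and constraint are arbitrary toy choices:

```python
# Minimal sketch: constrained minimization of an arbitrary objective with SciPy.
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # A simple quadratic bowl centered at (1, 2.5).
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2

# Inequality constraint: x0 + x1 <= 3, expressed as 3 - x0 - x1 >= 0.
constraints = [{"type": "ineq", "fun": lambda x: 3.0 - x[0] - x[1]}]
bounds = [(0.0, None), (0.0, None)]  # keep both variables non-negative

result = minimize(objective, x0=np.array([0.0, 0.0]),
                  bounds=bounds, constraints=constraints)
print(result.x, result.fun)
```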
I have approximately ten years of formal research experience and I have been doing informal research for most of my life. While working on my Ph.D., I was formally recognized as the ‘best graduate student researcher’ in my class—twice.
Numerical analysis is used whenever continuous quantities need to be approximated using the discrete mathematical operations available on a digital computer. It’s also often used when mathematical operations that are not part of the computer’s instruction set need to be implemented in software. I was first introduced to the field as a sophomore in college and I have used the techniques frequently ever since.
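A tiny, self-contained example of the idea: approximating a continuous derivative with a discrete central difference and watching the error shrink with the step size (the function is an arbitrary choice):

```python
# Minimal sketch: approximating the continuous derivative of sin(x) with a
# discrete central difference; the truncation error shrinks as h does.
import math

def central_difference(f, x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)

x = 1.0
exact = math.cos(x)                     # d/dx sin(x) = cos(x)
for h in (1e-1, 1e-2, 1e-3):
    approx = central_difference(math.sin, x, h)
    print(f"h={h:g}  approx={approx:.8f}  error={abs(approx - exact):.2e}")
```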
As a scientific researcher, much of my formal training and my research has involved data analysis and interpretation. Much of my work as a software developer has also involved analyzing data—including the performance characteristics of my code. Managing a data center has also involved analyzing data related to system performance and usage statistics.
Visualizing data is one of the easiest ways to extract insights and communicate results to others. Finding interesting ways to view data is one of the primary components of astrophysics research.
I’ve been trained in astronomical image processing and I am an amateur (digital) photographer who enjoys experimenting.
I have studied and used tensor calculus, vector calculus, complex analysis, ordinary and partial differential equations, linear algebra, and Fourier analysis extensively in graduate physics courses and in research. So, I am familiar with N-dimensional mathematics with arbitrary metrics, rather than just 3D Math.
I hold a terminal degree (Ph.D.) in physics. In addition to doing research in physics, I have taught physics at the high school and university levels. See the Education page for more details.
My Ph.D. dissertation research was in galactic astrophysics, which is a field that requires a strong foundation in astronomy. My master’s degree research involved space weather modeling, which requires an understanding of astronomy on a much smaller scale.
In addition to the statistics that all physicists and astronomers are required to learn in order to do basic research, I have studied probability and statistics independently. Notably, I’ve studied Bayesian statistics, which is quite important in machine learning and data mining.
Teaching is one of my passions. I began tutoring calculus, physics, and astronomy in 2003. In 2005, I began teaching high school physics at one of the top-ranked public high schools in the US. Then, in 2011, I taught university physics for the first time. I've also taught basic astronomy and physics to elementary and middle school teachers seeking subject-area (re-)certification, and I've taught graduate students, postdocs, and coworkers on occasion.
C++11
Python
Bash / Shell
Rust
HTML5
CSS3
JavaScript
Scheme / Guile
$\rm \LaTeX$
C++ is one of my primary programming languages (the other is Python). I like the language because of its speed and versatility, although it has accumulated some baggage over time. I wrote my first C++ programs in 2004. It was my primary programming language from 2009 – 2016. Most of my C++ projects have been primarily written in C++11. I’ve started using some newer C++20 features of the language recently (as of 2025).
I began using Python for data analysis, plotting, and scripting in 2010. In 2014, I used Boost.Python for the first time to integrate C++ and Python (now I generally use pybind11 for that purpose). Most of the software I've developed since 2017 has been Python code (ever since I started developing software for processing data from the Euclid Space Telescope).
Bash scripts are quite handy; I have written 2 – 20 Bash scripts per month since early 2008. Lately, I have been using ChatGPT to do Bash scripting for me, since it is so quick; I just tweak the output. I'm still fairly familiar with the language, though.
I learned basic ECMAScript in 2013 and adopted it as the scripting language for my GSnap project. In 2014, I learned more about how to use JavaScript in web pages, along with the Document Object Model (DOM), when I implemented Pretty Parametric Plots. I generally dislike the language, so I haven't done much with it and I've never tried producing professional-quality JavaScript. If I had to do web development as part of my primary job, I'd much prefer to write TypeScript.
I learned the fundamentals of HTML5 in early 2014 when I began working on Pretty Parametric Plots. I've used it when creating websites and when developing nrstatic.
I learned CSS3 in early 2014 when I learned JavaScript and HTML5, since the three are almost always used together.
Rust is a rather well-designed, modern language with a compiler that produces very efficient and safe / secure code. Rust clearly benefits from the lessons learned from decades of C++ programming. It incorporates a standardized build system and package manager and encourages a uniform coding style. I have been watching Rust and its ecosystem develop since 2012. As of 2025, I am beginning to actually use Rust. I am planning to use Slint to create my first Rust-based GUI application.
In 2014, I learned the basics of Scheme in order to gain experience with a functional programming language. I chose the Guile implementation of Scheme because it is the official extension language for GNU. While it is a very interesting language, I haven’t yet used it for anything other than satisfying my own curiosity.
I learned $\rm\LaTeX$ in 2007. Nearly all of my Ph.D. coursework was typeset in $\rm\LaTeX$, and every proposal, paper, formal letter, and set of presentation slides that I've written since 2007 has been typeset with $\rm\LaTeX$. My bibliographies have used $\rm Bib\TeX$. In 2013, I learned the underlying $\rm\TeX$ (by reading Knuth's detailed book) in order to customize class files.
Frameworks & Libraries
Qt Framework
NumPy
Matplotlib
Numba
SciPy
Astropy
OpenMP
OpenCL
pybind11
ZeroMQ
pytest
GoogleTest
Since 2010, I have used many GUI and non-GUI features of the Qt Framework, including the Graphics View framework and Qt Script. I've primarily used Qt with C++, but I've also used it in Python, with PyQt5.
I have used NumPy (and the rest of the SciPy ecosystem) since 2010, when I began using Python. In addition to the built-in NumPy and SciPy routines, I've used NumPy arrays within C++ code, via pybind11, and I've used the Numba JIT compiler to perform operations on NumPy arrays efficiently. I've also used NumPy along with PyOpenCL.
I have used Matplotlib for most of my plotting needs since 2010. It is my go-to plotting solution. I even incorporated it into my PyQt5-based GUI inSpector application.
Numba is a great Python package for accelerating computations without resorting to writing C or C++ code manually. For example, I used the Numba just-in-time compiler to significantly speed up the Euclid spectra decontamination module by replacing a few non-optimal SciPy and Shapely functions with Numba-accelerated hand-coded functions.
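A minimal sketch of that pattern, using an illustrative function rather than the Euclid code:

```python
# Minimal sketch of the pattern (illustrative function, not the Euclid code):
# a plain Python/NumPy loop compiled to machine code with Numba's JIT.
import numpy as np
from numba import njit

@njit(cache=True)
def pairwise_min_distance(points):
    """Brute-force minimum pairwise distance; the nested loops are fast under @njit."""
    n = points.shape[0]
    best = np.inf
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.0
            for k in range(points.shape[1]):
                diff = points[i, k] - points[j, k]
                d += diff * diff
            if d < best:
                best = d
    return np.sqrt(best)

points = np.random.default_rng(0).random((500, 2))
print(pairwise_min_distance(points))
```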
I’ve been using many features from SciPy since 2010, when I transitioned from using Matlab/Octave for interactive, exploratory scientific computing to using Python.
Astropy is a Python package that is mostly intended for use by astronomers, but it has many nice features that make it useful for data manipulation and science in general. The tables and units subpackages are particularly nice.
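A small illustration of those two features, with made-up values:

```python
# Minimal sketch (made-up values): Astropy tables with physical units attached.
import astropy.units as u
from astropy.table import QTable

catalog = QTable({
    "name": ["src-1", "src-2"],
    "flux": [1.2, 3.4] * u.mJy,
    "wavelength": [500.0, 650.0] * u.nm,
})

# Unit-aware arithmetic and conversion come for free.
catalog["flux_jy"] = catalog["flux"].to(u.Jy)
catalog["frequency"] = catalog["wavelength"].to(u.GHz, equivalencies=u.spectral())
print(catalog)
```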
I have used OpenMP to parallelize computationally expensive C++ code since 2009. It is one of the easiest ways to parallelize code. If the OpenMP-aware version of Numba ever becomes part of the standard Numba release, I’ll likely use OpenMP within Python as well.
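For comparison, standard Numba already provides OpenMP-style loop parallelism via prange (which is distinct from the explicit OpenMP support mentioned above); a minimal sketch:

```python
# Minimal sketch: Numba's prange-based parallel loop, which behaves much like a
# simple OpenMP "parallel for" (this is standard Numba, not explicit OpenMP).
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def parallel_sum_of_squares(x):
    total = 0.0
    for i in prange(x.size):   # iterations are distributed across threads
        total += x[i] * x[i]   # Numba recognizes this pattern as a reduction
    return total

x = np.random.default_rng(1).random(1_000_000)
print(parallel_sum_of_squares(x), np.sum(x * x))
```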
I have used OpenCL alongside C++ and within Python, using PyOpenCL. For instance, I used it to drastically accelerate a 2D image convolution code. Unfortunately, most of the code that I have developed for work has no guarantee of having access to OpenCL, so I haven't used OpenCL as much as I would if it were ubiquitous.
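A minimal sketch of the PyOpenCL workflow, using a simple element-wise kernel (not the actual convolution code): compile a kernel, copy buffers to the device, launch, and read the result back.

```python
# Minimal sketch of the PyOpenCL workflow: a simple element-wise kernel rather
# than the full 2D convolution.
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

a = np.random.rand(4096).astype(np.float32)
b = np.random.rand(4096).astype(np.float32)
out = np.empty_like(a)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)

program = cl.Program(ctx, """
__kernel void axpb(__global const float *a, __global const float *b,
                   __global float *out) {
    int i = get_global_id(0);
    out[i] = 2.0f * a[i] + b[i];
}
""").build()

program.axpb(queue, a.shape, None, a_buf, b_buf, out_buf)
cl.enqueue_copy(queue, out, out_buf)
print(np.allclose(out, 2.0 * a + b))
```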
As far as I am aware, pybind11 is overall the best solution for providing an interface between C++ and Python. Although I have not used it in any of my projects for work, I have experimented with it quite a bit, and I used its predecessor, Boost.Python, in my Nebulos project.
ZeroMQ is an elegant, lightweight solution for general asynchronous inter-process communication which works with many languages. It requires no separate message broker to handle messages. Coupled with an efficient serialization library, like MessagePack, it's a very powerful yet lightweight tool for developing distributed systems.
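A minimal request/reply sketch with pyzmq and msgpack; the endpoint and message contents are made up:

```python
# Minimal sketch: a ZeroMQ REQ/REP pair exchanging MessagePack-encoded payloads
# (endpoint and message contents are made up).
import threading
import msgpack
import zmq

ENDPOINT = "tcp://127.0.0.1:5555"

def server():
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.REP)
    sock.bind(ENDPOINT)
    request = msgpack.unpackb(sock.recv())            # bytes -> dict
    sock.send(msgpack.packb({"echo": request, "status": "ok"}))
    sock.close()

threading.Thread(target=server, daemon=True).start()

ctx = zmq.Context.instance()
sock = ctx.socket(zmq.REQ)
sock.connect(ENDPOINT)
sock.send(msgpack.packb({"task": "ping", "id": 42}))
print(msgpack.unpackb(sock.recv()))
sock.close()
```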
I have used pytest for writing tests in Python since 2017. As far as I am aware, it is currently the best testing solution available for Python (when combined with items from Python’s built-in unittest library, of course). It is the testing solution that we are required to use when developing Python software for Euclid; every Python class in the SpectraDecontamination module that I develop for NASA and the European Space Agency has corresponding pytest-based tests.
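A small, generic example of the style, using a hypothetical function rather than the Euclid code:

```python
# Minimal, generic pytest example (hypothetical function, not the Euclid code).
# Save as test_stats.py and run with:  pytest test_stats.py
import pytest

def robust_mean(values, clip=3.0):
    """Toy function under test: mean of the values within `clip` of the median."""
    values = sorted(values)
    median = values[len(values) // 2]
    kept = [v for v in values if abs(v - median) <= clip]
    return sum(kept) / len(kept)

def test_robust_mean_ignores_outlier():
    assert robust_mean([1.0, 2.0, 3.0, 100.0]) == pytest.approx(2.0)

@pytest.mark.parametrize("values, expected", [
    ([5.0, 5.0, 5.0], 5.0),
    ([1.0, 2.0, 3.0], 2.0),
])
def test_robust_mean_simple_cases(values, expected):
    assert robust_mean(values) == pytest.approx(expected)

def test_robust_mean_requires_values():
    with pytest.raises(IndexError):
        robust_mean([])
```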
I have toyed with a few different C++ unit testing frameworks, but GoogleTest seems to be the best overall. Unfortunately, I have less experience writing formal tests in C++ than in Python, since I have mostly been developing Python code since 2017. I’d definitely like to use GoogleTest more, though.
Tools & Utilities
GNU Binutils
GCC
Debuggers (gdb, pdb, lldb)
Perf
Git
Profilers (gprof, cProfile)
Valgrind Suite
LLVM/Clang
Doxygen
SonarQube
Singularity
Docker / Podman
GNU Binutils are utilities for creating, manipulating, and analyzing binary programs. I have used these utilities to investigate and manipulate binaries since 2010.
I have used the GNU Compiler Collection for more than a decade, though I have only been using GCC’s built-in intrinsics and attributes since 2010.
I have used debuggers periodically since 2011 in various ways: manually running the code in debugging mode, using an IDE, and attaching the debugger to a running process. I've also taught colleagues how to use debuggers, since many scientific software developers have little or no exposure to them.
On Linux, perf is a very powerful tool (or set of related tools) for doing detailed performance analysis. It’s helpful both when optimizing software and when configuring and testing systems, as a system administrator or systems engineer.
I have used Git for version control since 2011. Before that, I used Subversion a bit and I tried Mercurial; I prefer Git and I am delighted that it has essentially become the standard version control system. Of course, it’s not designed to handle large binary files, but Git LFS is a pretty good solution for that use-case.
I've been using profilers since I started working to optimize GSnap with gprof in 2011. I use profilers in Python quite often as well.
The Valgrind suite contains debugging and profiling tools that go beyond the functionality of the standard GNU utilities. I have used the memory debugger, Memcheck, and cache profiler, Cachegrind. Valgrind is particularly useful for diagnosing segmentation faults and memory leaks in unfamiliar code (code written by other people).
I have used the LLVM-based clang and clang++ compilers since 2012. Clang++ is aware of most of the non-standard, built-in functions defined by the GNU C++ compiler (g++), so code written specifically for g++ often works with clang++ without modification. Additionally, clang++ sometimes produces more efficient binaries than g++. I don't have experience with using the LLVM backend directly (yet), but that's certainly interesting.
I’ve used Doxygen for creating C++ source documentation in HTML format since 2011. I have also integrated Doxygen’s output with Qt Creator’s help system, which makes it easy for other developers to explore a project.
SonarQube is a continuous inspection tool that analyzes code quality with static code analysis and helps to enforce coding standards and encourage higher-quality code. It can also automatically run tests and report the test coverage. I have used SonarQube since 2017 when I began writing code for Euclid.
Singularity is a daemonless, rootless containerization solution originally aimed at the high-performance computing community. In 2021, it became a Linux Foundation project, named Apptainer. I have used Singularity for containerization since 2019 when I started working on the Joint Survey Processing project (we were planning to use the Open Science Grid for much of our processing; OSG uses Singularity for containerization). I also gave a tech presentation (tutorial) on Singularity at IPAC, explaining the ways in which Singularity is a better choice than Docker in many cases.
I first used Docker in 2014. The idea of containerization is great, but Docker is among my least favorite implementations. Given the choice, I use Singularity/Apptainer or Podman, rather than Docker for containerization, but I am familiar with Docker.
Systems
Linux system administration.
Familiarity with Linux system calls / system programming.
GNU/Linux has been my primary OS at home and at work since November of 2007 (primarily Ubuntu and RHEL). Over the years, I have built and managed many machines (physical and virtual). In 2014, I built a small Linux cluster for testing and developing Big Data tools; the cluster was eventually used by a few dozen people. Since 2023, I have led a small team of systems engineers as the leader of NASA's Euclid Science Data Center at IPAC, whose infrastructure consists of a Ceph storage cluster, a Slurm compute cluster, and a collection of VMs for services and individual users.