Resources for Sr Project Students
Senior Project Descriptions
Some of these projects are difficult to describe in a one-way written presentation, so you may benefit from talking to me about them. I have tried to organize them into categories so that you can browse through them more easily. That said, I highly recommend that you take a look through everything. In any event, some of my preliminary project ideas are outlined below.
Product-based Senior Projects
The following are product-based projects where I would expect you to conduct a product development cycle much like a company startup activity. That is, you would need to develop a basic market-need statement and interview prospective customers to learn their interest in, and willingness to spend money on, the product that you propose. It is expected that these potential-customer interviews would cause you to refine (and possibly completely redesign) the envisioned product.
- Stop vehicular reporting of personal location. New vehicles are collecting data on people at astonishing rates. Evidently new vehicles are collecting data on your race, weight, sexual activity, driving behaviors, location, and even more. This problem was also discussed on reddit.
How do we stop this? Is it possible? Perhaps this is a car-to-phone security device; but then, how does that help? The
phone can always report you (that said, perhaps the phone has to live within more restrictive laws than those for vehicles).
This project proposes to examine this issue and design/develop hardware, software, or user practices to stop (or at least
reduce) the amount of information collected and (more importantly) reported.
- Develop infrastructure to support locating and reserving automobile charging stations when traveling. Set up capabilities for restaurants and other services to advertise real-time charging time slots and availabilities for traveling vehicles to schedule usage.
- Build an Android app called onRoute that will locate a specified service forward of your travel on a mapped trip (mapped out by Google Maps, Waze, or OpenStreetMap [or, for that matter, any travel routing software that works on an Android device]). My family has a running joke about dad's (my) ongoing quest to locate a CiCi's pizza while we are traveling (they are relatively rare and difficult to locate). Of course, we also spend time looking for a coffee service too. Currently one of us in the car has to manually search on our phones for a suitable location of the desired service we are seeking. Ideally, I would like to see an Android app that will locate a service from the following information: (i) the location of the car, (ii) the route we are following, (iii) the service that we're seeking (generic, such as pizza, or specific, such as CiCi's pizza), and (iv) the distance we are willing to deviate from our route for said service. I have no idea if this is a feasible project, so it will require a significant amount of upfront work to determine if it can even be done with the existing APIs of the various mapping tools. Of course, more complex requests could be supported as well. For example, it would be nice to have the ability to know gas stations with gas prices (something like GasBuddy) to optimize the search for said service.
Student team size options: 2-4
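To give a concrete flavor of the geometry involved, here is a minimal sketch of the core off-route query: how far does a candidate service location lie from the mapped route polyline? The function names are my own and the coordinates are simple planar units rather than real latitude/longitude; a real app would use the mapping API's own distance primitives.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all 2-D tuples)."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:                # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line; clamp the parameter to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def deviation_from_route(route, candidate):
    """Smallest distance from a candidate service location to the route polyline."""
    return min(point_segment_distance(candidate, a, b)
               for a, b in zip(route, route[1:]))

# A straight east-bound route and a candidate 3 units north of it.
route = [(0, 0), (10, 0), (20, 0)]
print(deviation_from_route(route, (5, 3)))   # 3.0
```

Requirement (iv) above would then be a simple filter: keep only candidates whose deviation is below the user's chosen threshold.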
- James Bond Coke Machine (fight the Pepsi monopoly that has the UC campus by the throat). I hate Pepsi (doesn't any sane person?) and yet I'm forced to drink it as it's really the only option on campus. This project proposes the construction of a Coke machine hidden behind the facade of a Pepsi machine. While it would be easy to simply fill one of the slots in an existing Pepsi machine with Coke, what about Diet Coke or Sprite drinkers? I propose that the James Bond Coke machine would mimic a Pepsi machine, even delivering standard Pepsi products when used by the naive. However, the machine should be designed so that an undocumented panel can be used to invoke a transformation to a full-fledged Coke machine dispensing Coke products. After dispensing the desired Coke product, the machine will revert back to the standard (and disgusting) Pepsi setup.
Student team size options: 1-4
- Build a Smart Home/Office IoT Device. For the most part I see smart home discussions regarding the network
connection/instrumentation of major appliances (refrigerators, washer/dryer, etc) as pointless solutions to problems that do
not exist. That said, I can also see how some smart home concepts make sense (thermostats, lighting control, etc). In this
topic, I suggest the development of fringe devices that provide some capabilities for the more "involved" consumer. For
example:
- build a programmable espresso machine (developer link) for the coffee aficionado;
- build an Android app to track the cellphone's location and redirect Google Voice calls to a local landline that the user registers with the app;
- build a user aware light switching system that is programmable to have day/night operation and that will emulate
past behavior when it detects that the homeowner is not present;
- build a programmable (possibly with fractal-based generation) mood lighting solution;
- Let's make something up. I'm always up for new ideas and suggestions.
Student team size options: Depends on the project.
Research Senior Projects
Attached is a list of my currently outlined research senior project ideas. I generally revise this list in the month or two
before the fall semester so please check back for revisions to this list. For now you should consider this list as a
collection of ideas that can serve as potential projects. For the most part, my research is in High Performance Computing (HPC)
with applications to Topological Data Analysis (TDA) and Parallel Simulation (PDES). Students should expect to work with
one of my research teams and with the preparation of research publications. My work and thoughts are constantly evolving, so
please feel free to reach out to me (Philip A. Wilsey) with questions and discussions about potential projects if you want to get an early start on planning or on a project. While the
prose descriptions contain projections on sizes for the research senior projects, most of these projects can also be
individual efforts.
Each of these projects ties into one of my main research areas. You would be joining a team of students studying these, and
related, problems. It is most likely that you will also be preparing and submitting manuscripts for publication from the work
you would perform on these projects. Do not worry if the projects/concepts described below are confusing to you. I am happy
to discuss these projects more fully with you directly. In addition, if you work on one of these, you will be plugged into my
research teams, where there are multiple graduate students with whom you would be working.
Most likely the ideas described below will only make partial sense to you. Do not worry about that, it is normal. We will
provide the background and training that you will need to conduct these studies.
Related to algorithms and methods for Topological Data Analysis
Topological Data Analysis (TDA) is a method of data analysis that extracts characteristics of the topological
structure found in data. Effectively TDA treats Point Cloud Data (PCD) as a sampling of a topological space (metric
space, topological manifold, etc). TDA techniques then work to characterize the topological shapes found in the topological
space represented by the PCD. My lab works to develop techniques to optimize TDA algorithms to operate on big data and
higher-dimensional data (ℝ³ to ℝ¹⁶). To this end, we have developed
a Light-weight Homology Framework (LHF) containing the software to support our
studies.
Our work is primarily focused on a TDA method called Persistent Homology (PH). PH has exponential complexity in both time and space, and its application is generally limited to PCD containing approximately 10K points in ℝ³.
Several of my students and I have been developing methods, theories, and algorithms to partition much larger, high-dimensional
data sets so that they can be analyzed using TDA techniques. In more recent work, my students have also been developing new
TDA methods (based on the Euler-Poincare Characteristic) as well as new methods for managing and optimizing the internal
representations used to hold the PCD during the TDA algorithms. While this all may sound pretty far out there, it is quite interesting and mostly an effort to develop and implement approximate methods and algorithms in Python, C++, and other supporting languages (Julia, R, etc). Some projects also involve parallel and distributed computing as well as heterogeneous parallelism (FPGAs and/or GPGPUs).
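To give a flavor of the Euler-Poincaré characteristic mentioned above, here is a minimal illustrative sketch (my own code, not part of LHF) that computes χ as the alternating sum of simplex counts, V − E + F − …, for a tiny complex given by its maximal simplices.

```python
from itertools import combinations

def euler_characteristic(simplices):
    """chi = #vertices - #edges + #triangles - ... over the face closure."""
    # Close the input under faces so every sub-simplex is counted exactly once.
    closed = set()
    for s in simplices:
        s = tuple(sorted(s))
        for k in range(1, len(s) + 1):
            closed.update(combinations(s, k))
    # A k-simplex has k+1 vertices and contributes (-1)^k.
    return sum((-1) ** (len(face) - 1) for face in closed)

# A hollow triangle (a topological circle): 3 vertices, 3 edges, no 2-simplex.
print(euler_characteristic([(0, 1), (1, 2), (0, 2)]))   # 0  (chi of a circle)
# The filled triangle (a disk): adding the 2-simplex raises chi to 1.
print(euler_characteristic([(0, 1, 2)]))                # 1  (chi of a disk)
```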
As part of this project, I am open to a wide range of possibilities. You can work on the ideas enumerated below, or work with me and my team to develop customized TDA-related projects of your own design (you do not have to have the initial ideas; often I find that it's fun to simply begin studying/thinking and opportunities present themselves to us).
All of the prose in this section may seem quite foreign and filled with a considerable amount of math jargon. Don't worry about it; most of the work is straight-up algorithm development and implementation. In addition, my graduate students are all heavily invested in these topics, so they (and I, but they're the real brains of the team) will be available to help explain things to you (and me).
- Implement an Alpha Complex using the high-performance Delaunay triangulation algorithm developed by [Hornus and Boissonnat 2008]. The first step in the processing of point cloud data for our work is the construction of a simplicial complex. Basically, a simplicial complex (or simply complex) is a connectivity graph formed as a collection of connected triangles. There are multiple ways to build a complex; however, one of the most efficient (smallest) representations is built from a Delaunay triangulation. Unfortunately, computing a Delaunay triangulation is computationally expensive. Currently the most efficient constructions use Qhull. Hornus and Boissonnat have developed an alternate construction method that is significantly faster. I would like to include an implementation of this in our LHF code base (as the construction of an Alpha Complex).
Student team size options: 2-3
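As a rough illustration of the Alpha Complex idea (this is my own sketch, not the Hornus-Boissonnat algorithm and not LHF code), the snippet below filters the triangles of a 2-D Delaunay triangulation, here computed with scipy's Qhull wrapper, by circumradius; the project itself would replace the triangulation step with the faster construction.

```python
import numpy as np
from scipy.spatial import Delaunay

def circumradius(tri):
    """Circumradius of a 2-D triangle given as a 3x2 array of vertex coordinates."""
    a, b, c = (np.linalg.norm(tri[i] - tri[j]) for (i, j) in [(0, 1), (1, 2), (2, 0)])
    s = (a + b + c) / 2.0
    area = max(np.sqrt(s * (s - a) * (s - b) * (s - c)), 1e-12)  # Heron's formula
    return (a * b * c) / (4.0 * area)

def alpha_complex_triangles(points, alpha):
    """Keep only the Delaunay triangles whose circumradius is <= alpha."""
    dt = Delaunay(points)
    return [tuple(simplex) for simplex in dt.simplices
            if circumradius(points[simplex]) <= alpha]

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.2, 1.2]])
# These 4 points triangulate into 2 triangles, both with circumradius < 0.75.
print(len(alpha_complex_triangles(points, alpha=1.0)))   # 2
print(len(alpha_complex_triangles(points, alpha=0.6)))   # 0
```

The key property is that the result is a subcomplex of the (already small) Delaunay triangulation, which is what keeps the Alpha Complex compact relative to a Vietoris-Rips construction.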
- Study/optimize the embedding of portions of the ripserer persistent homology tool into an FPGA. The general embedding of the code into the FPGA has already been performed. However, as with most work on FPGAs/GPGPUs, the challenge is to organize the algorithm to access the data in a way that minimizes transfers between the host and the device.
Student team size options: 1-2
- Optimize and parallelize various elements of the LHF pipeline. Some of my graduate students have been developing interesting optimizations that need help to speed up the implementation. Some of these are interesting techniques to construct and optimize the graphs used to represent the topological manifold approximated by the input data. The project would involve profiling and optimizing the LHF pipeline and then studying the performance implications of your optimizations as the data size and dimension change.
Student team size options: 1-4
- Convert our MPI-based parallelization of PPH to a map-reduce implementation.
Student team size options: 1-2
- Make contributions to our Python library supporting our studies in TDA. Some possibilities:
- Use/extend Python programs to generate synthetic test data. Specifically, generating synthetic high-dimensional test data with known homologies (topological features).
Student team size options: 1-2
- Build infrastructure to package and deploy Python bindings for the LHF library. Expand our github repository with continuous integration testing, online documentation, and/or Python bindings with pip3 install support. It is possible that some support projects might be helpful.
Student team size options: 1-2
- As indicated above, the computation of persistent homology suffers from exponential complexity. We have worked out the theory and initial implementations of an approach to partition the data so that we can handle point clouds 3-5 orders of magnitude larger than current solutions can support. We are calling this Partitioned Persistent Homology (PPH). There are always opportunities to contribute to the development and optimization of PPH.
Student team size options: 1-3
- Persistent homology is frequently performed using a Vietoris-Rips Complex (VR Complex). In addition to an implementation of VR complexes, we also have implementations of an Alpha Complex and a Witness Complex, both of which have a significantly smaller memory footprint than a VR Complex. There are numerous opportunities to work on the various complexes, especially the Witness Complex. In particular, we would like to connect the selection of landmark points for a Witness Complex to the sampling techniques used by PPH, which would serve as a very interesting research senior project.
Student team size options: 1-3
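To see why the VR Complex's memory footprint is the problem, here is a brute-force sketch (illustrative only, not the LHF implementation) of a VR construction up to dimension 2: every pair of points within distance ε becomes an edge, and every pairwise-connected triple becomes a triangle. Since any clique becomes a simplex, the number of simplices explodes as ε or the point count grows.

```python
import math
from itertools import combinations

def vietoris_rips(points, eps):
    """Edges and triangles of the VR complex at scale eps (brute force)."""
    n = len(points)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if math.dist(points[i], points[j]) <= eps]
    edge_set = set(edges)
    # A VR triangle is any triple whose three edges are all present (a 3-clique).
    triangles = [(i, j, k) for i, j, k in combinations(range(n), 3)
                 if {(i, j), (j, k), (i, k)} <= edge_set]
    return edges, triangles

# Four points on a unit square (a coarse sample of a circle).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
edges, triangles = vietoris_rips(square, eps=1.0)
print(len(edges), len(triangles))   # 4 0  -> a loop: the "hole" is present at this scale
edges, triangles = vietoris_rips(square, eps=1.5)
print(len(edges), len(triangles))   # 6 4  -> the diagonals appear and fill in the hole
```

Persistent homology tracks exactly this behavior: topological features (like the loop above) that are born and die as ε sweeps from small to large.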
- I also have some strange ideas that might permit us to apply persistent homology to high-dimensional point clouds. These ideas are currently in conjecture form, with preliminary evidence supporting their utility.
Student team size options: 1-2
- We are beginning to work on Streaming Persistent Homology, that is, computing persistent homology on streaming data. This requires solutions for incremental complex construction and boundary matrix reduction. We have preliminary work completed and one publication on this topic.
Student team size options: 1-3
- Extend the LHF Python output analysis/visualization library to process the persistent homology output
results.
Student team size options: 2-4
- Develop CUDA-based implementations of our algorithms for execution on an NVIDIA GPU. Numerous parts of the algorithms in LHF are reasonably well suited to computation on a GPGPU.
Student team size options: 2-3
Related to High-Performance Parallel Simulation on Multi-core Beowulf Clusters
One of my main research areas is parallel and distributed discrete-event simulation using an optimistic synchronization strategy called the Time Warp Mechanism. Time Warp does not explicitly synchronize the asynchronous event processing. Instead, a Time Warp simulation kernel reacts whenever a causal event violation is detected and invokes some type of recovery mechanism. This is not as strange as it might sound. Basically, a Time Warp synchronized parallel simulation maintains the state information necessary to undo a premature computation and restore/recompute the (partial) simulation path for the local object with the dependent event information processed accordingly.
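The rollback idea can be sketched in a toy form (deliberately simplified: a single logical process, no anti-messages or GVT computation, and nothing resembling the WARPED2 API). The process saves its state before each event; when a straggler, an event timestamped in its past, arrives, it restores the saved state and recomputes the affected path.

```python
class LogicalProcess:
    """Toy Time Warp logical process: optimistic execution with state saving and rollback."""

    def __init__(self):
        self.state = 0      # the "model": a running sum of event values
        self.lvt = 0        # local virtual time (timestamp of last processed event)
        self.history = []   # (timestamp, value, state_before_event) per processed event

    def receive(self, timestamp, value):
        redo = []
        if self.history and timestamp < self.lvt:
            # Causal violation: a straggler arrived in our past. Roll back every
            # event processed at or after the straggler's timestamp.
            while self.history and self.history[-1][0] >= timestamp:
                t, v, state_before = self.history.pop()
                redo.append((t, v))
                self.state = state_before            # restore the saved state
            self.lvt = self.history[-1][0] if self.history else 0
        self._process(timestamp, value)
        for t, v in sorted(redo):                    # recompute the rolled-back path
            self._process(t, v)

    def _process(self, timestamp, value):
        self.history.append((timestamp, value, self.state))
        self.state += value
        self.lvt = timestamp

lp = LogicalProcess()
lp.receive(10, 1)
lp.receive(30, 100)       # processed optimistically
lp.receive(20, 10)        # straggler: rolls back the t=30 event, then replays it
print(lp.state, lp.lvt)   # 111 30
```

A real Time Warp kernel must additionally cancel the outputs of rolled-back events (anti-messages) and compute Global Virtual Time to reclaim old state snapshots; those mechanisms are what the projects below exercise.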
As part of these studies, my students and I have developed a general purpose discrete event simulation kernel and modeling API, called WARPED2, that is configurable for sequential or parallel execution and includes many configuration variables to turn various optimizations/sub-algorithms of the Time Warp setup on and off. The WARPED2 parallel simulation engine has been designed to support multi-threaded and task-level parallelism for execution on multi-core and many-core Beowulf clusters. This project also has a complementary set of discrete event simulation models. As part of this project, there are a number of possible senior project experiences that we can offer:
- Explore some of the newly emerging Ethernet communication optimizations in the Linux kernel. Discrete-event simulation is mostly composed of fine-grained communicating sequential processes. While careful partitioning and process scheduling can help hide communication latency costs, there is always a need to improve on this situation. Newer 10Gb networking cards with iWARP or RDMA over Converged Ethernet (RoCEv2) can provide lower latency; Linux kernel solutions such as kernel bypass (DPDK or the newer XDP) can also help bring down latencies. This project will explore and develop at least one of these solutions and profile its impact on WARPED2 run-time performance.
Student team size options: 2-4
- Restructure the WARPED2 simulation kernel with lightweight locking and queues to reduce overhead and lock
contention. I have a working design already in place.
Student team size options: 2-4
- Use the parallelization capabilities of C++23 for the implementation of the WARPED2 simulation kernel.
Student team size options: 1-3
- Implement the WARPED2 simulation kernel and at least two simulation models from warped2-models in the Rust or Julia programming language and profile its performance vis-à-vis the C++ kernel.
Student team size options: 2-4
- Develop and build new and, ideally, scalable simulation models to help exercise the WARPED2 simulation kernel. While I am willing to entertain ideas on what types of models you would develop, one possibility would be ocean currents and the accumulation of debris in various regions based on the current flow and possibly weather patterns. In some cases these projects may require outreach to other researchers around the world who are experts in the fields we are attempting to model.
Student team size options: 2-4