Projects


Some projects I've worked on for work, for class, and for fun.


Goldman Sachs

Operations Analytics Strats







Automated Invoice Recognition

Optical Character Recognition


This was my main project as a Summer Technology Analyst in the summer of 2015. We worked with Accounting Services to build a tool that automatically indexes certain values from images of invoices. Accounting Services received invoices by email or fax and scanned them into TIFF files. Our algorithm used templating to identify likely segment locations and segmentation of individual values to produce structured output.

The program was written in Python, using NumPy for fast array operations, scikit-image for image processing, and the open-source Tesseract engine for optical character recognition (OCR).
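
To illustrate the templating idea, here is a minimal sketch, not the firm's code. It assumes a template is just a mapping from field name to a bounding box, and it uses plain nested lists in place of the NumPy arrays the real tool worked with. The field names, coordinates, and the `crop` and `segment` helpers are all hypothetical, and the OCR step that would read each cropped region is left out.

```python
# Hypothetical template: field name -> (top, left, height, width) in pixels.
TEMPLATE = {
    "invoice_number": (0, 2, 1, 3),
    "total": (2, 0, 2, 2),
}


def crop(image, box):
    """Return the sub-grid of `image` (a list of pixel rows) covered by `box`."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def segment(image, template):
    """Cut out one region per templated field; each region would then go to OCR."""
    return {field: crop(image, box) for field, box in template.items()}


# A 4x5 stand-in "image" whose pixel values encode their own row and column.
image = [[10 * r + c for c in range(5)] for r in range(4)]
segments = segment(image, TEMPLATE)
```

Keeping the field locations in a template, separate from the cropping code, means a new invoice layout would only need a new template rather than new code.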

Due to information security constraints, relevant documents and data may not be shared outside the firm.




Smart Workflows

Mixed Integer Linear Programming


This was another project that I worked on as a Summer Technology Analyst in the summer of 2015. We worked with several Operations groups to produce a tool for automatically assigning tasks to available analysts. The tool took into account which analysts were available, their expected bandwidth for the remainder of the day, and their proficiency with incoming tasks, and it implemented a Maker-Checker process. It is meant to be run by managers, and it outputs a list of each person's assignments as well as a list of unassigned tasks with the reason each was left unassigned.

The program was written in R, using the Rglpk library for linear programming.
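
The kind of model involved can be sketched in a few lines. This is not the firm's formulation: it is a toy version in Python that swaps the MILP solver for a brute-force search over every assignment, which only works for a handful of tasks. It assumes each task goes to at most one analyst, each analyst has a fixed number of hours left, and the objective is total proficiency. The Maker-Checker constraint is omitted, and all names and numbers are made up.

```python
from itertools import product


def assign(tasks, bandwidth, proficiency):
    """Exhaustively search task -> analyst assignments.

    tasks: {task: hours required}
    bandwidth: {analyst: hours left today}
    proficiency: {(analyst, task): score}; missing pairs are not allowed
    Returns (best_score, {task: analyst or None}).
    """
    names = list(tasks)
    choices = [None] + list(bandwidth)  # None means "leave unassigned"
    best_score, best_plan = -1, None
    for combo in product(choices, repeat=len(names)):
        load = {analyst: 0 for analyst in bandwidth}
        score = 0
        feasible = True
        for task, analyst in zip(names, combo):
            if analyst is None:
                continue
            if (analyst, task) not in proficiency:
                feasible = False
                break
            load[analyst] += tasks[task]
            score += proficiency[(analyst, task)]
        # Keep the plan only if nobody is over their remaining hours.
        if feasible and all(load[a] <= bandwidth[a] for a in bandwidth):
            if score > best_score:
                best_score, best_plan = score, dict(zip(names, combo))
    return best_score, best_plan


tasks = {"reconcile": 3, "confirm": 2, "settle": 4}
bandwidth = {"ana": 4, "ben": 5}
proficiency = {
    ("ana", "reconcile"): 5, ("ana", "confirm"): 2,
    ("ben", "reconcile"): 3, ("ben", "confirm"): 4, ("ben", "settle"): 6,
}
score, plan = assign(tasks, bandwidth, proficiency)
unassigned = [task for task, analyst in plan.items() if analyst is None]
```

In a MILP the same choices become binary variables, one per analyst-task pair, with the capacity limits as linear constraints, so a solver can handle problem sizes that enumeration cannot.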

Due to information security constraints, relevant documents and data may not be shared outside the firm.


Princeton University

Operations Research and Financial Engineering







PARSNIP: Lightweight Macro Recording Utility

Chrome Extension


PARSNIP was a project that I worked on for COS 333: Advanced Programming Techniques in Spring 2015. I worked with David Zhao, Timothy Seah, and Eric Huang. PARSNIP is a Google Chrome extension for basic macro recording and playback, with optional scheduled playback and macro storage and retrieval.

PARSNIP was written in JavaScript and HTML.

Relevant documents: Technical Document, Project Report, Demo Presentation




Learning Rate Analysis

Reinforcement Learning


This was a project that I worked on as a Summer Research Intern at CASTLE Labs at Princeton University in the summer of 2014. The objective was to find optimal policies in an energy allocation problem. The model contained a battery, a power grid, wind energy, and demand, with the latter three modeled as stochastic variables. I investigated how Q-learning and SARSA performed under different learning rates.

The simulator was written in Java, building on the BURLAP Learning Algorithm Library. Graphics and metrics were produced in Excel and MATLAB.
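
For readers unfamiliar with the two algorithms, the sketch below shows the standard tabular update rules and where the learning rate enters. It is written in Python rather than Java, does not use BURLAP, and is not the simulator's code. The state and action names are hypothetical stand-ins for the battery model.

```python
def q_learning_update(q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * q.get((s_next, a_next), 0.0)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)


ACTIONS = ("charge", "discharge")
START = {("high", "charge"): 1.0, ("high", "discharge"): 3.0}

# Apply the same single transition under several learning rates.
q_results, sarsa_results = {}, {}
for alpha in (0.25, 0.5, 1.0):
    q = dict(START)
    q_learning_update(q, "low", "charge", 2.0, "high", ACTIONS, alpha, 0.5)
    q_results[alpha] = q[("low", "charge")]

    q = dict(START)
    sarsa_update(q, "low", "charge", 2.0, "high", "charge", alpha, 0.5)
    sarsa_results[alpha] = q[("low", "charge")]
```

The learning rate `alpha` controls how far each estimate moves toward its target. With `alpha = 1.0` the old estimate is discarded entirely, while small values average over many noisy observations, which matters when wind, grid, and demand are all stochastic.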

Relevant documents: Problem Model, Project Presentation




Los Angeles Freeway Pricing

Optimal Learning


This was a project that I worked on for ORF 418: Optimal Learning in Spring 2014. I worked with Max Kaplan. We created an algorithm for pricing Express Lanes on the I-110 Freeway using Optimal Learning techniques. Specifically, we tested the Knowledge Gradient, Interval Estimation, Pure Exploitation, and Constrained Exploration algorithms with linear and logistic belief models.

The code and graphics were done in MATLAB.
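
As a rough illustration of how two of these policies differ, here is a minimal Python sketch, not the project's MATLAB code. It assumes a simple lookup-table belief with an independent normal estimate per candidate price, rather than the linear and logistic belief models we actually tested. Knowledge Gradient and Constrained Exploration are omitted, and the prices, estimates, and noise level are invented.

```python
import math


def pure_exploitation(mu, sigma):
    """Pick the price with the highest estimated value, ignoring uncertainty."""
    return max(mu, key=mu.get)


def interval_estimation(mu, sigma, z=2.0):
    """Pick the price with the highest upper bound mu + z * sigma."""
    return max(mu, key=lambda x: mu[x] + z * sigma[x])


def update(mu, sigma, x, observation, noise_sd):
    """Normal-normal Bayesian update of the belief about price x."""
    prior_precision = 1.0 / sigma[x] ** 2
    noise_precision = 1.0 / noise_sd ** 2
    posterior_precision = prior_precision + noise_precision
    mu[x] = (prior_precision * mu[x]
             + noise_precision * observation) / posterior_precision
    sigma[x] = math.sqrt(1.0 / posterior_precision)


# Hypothetical beliefs about revenue at three toll prices (in dollars).
mu = {2: 10.0, 4: 12.0, 6: 9.0}
sigma = {2: 1.0, 4: 0.5, 6: 3.0}

exploit_choice = pure_exploitation(mu, sigma)
ie_choice = interval_estimation(mu, sigma)

# Try the price Interval Estimation picked, observe a low value, and update.
update(mu, sigma, ie_choice, observation=5.0, noise_sd=3.0)
ie_choice_after = interval_estimation(mu, sigma)
```

Pure Exploitation never tries the uncertain price, while Interval Estimation tries it once, learns that it performs poorly, and then moves on.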

Relevant documents: Project Report, Project Presentation

ORFE Senior Thesis

Natural Language Processing


I'm currently writing my senior thesis for the Operations Research and Financial Engineering department at Princeton University. I'm being advised by Professor Xiaoyan Li of the Computer Science department.

Currently, I'm thinking of using R and Python for the data analysis.



© SHUYANG LI 2015