do not commit


| Comments

I sit at my desk trying to find some words to preface this post. The middle finger of my right hand continually hits the “delete” key on my macbook keyboard as I crumple up thoughts into paper balls and throw them aside.

A year ago today my grandpa died.

Concrete to Abstract Semantics

| Comments


This is the crucial step: abstracting the concrete semantics away so that we can do control flow analysis (CFA). Fortunately, in small-step abstract interpretation, the concrete and abstract semantics are very similar.

In this article, I introduce the concept and need for control flow analysis and show how to convert concrete semantics from the definition of my Concrete CESK machine into abstract semantics in preparation for creating an Abstract CESK machine.

Concrete CESK for Android’s Dalvik

| Comments


At the heart of all Android applications is Dalvik byte code. It’s what everything gets compiled to and then run on the Dalvik VM. In order to do static program analysis for Android, you need to somehow interpret the byte code. That’s where the CESK machine shines.

The CESK machine, developed by Matthias Felleisen, provides a simple and powerful architecture used to model the semantics of functional, object-oriented and imperative languages and features like mutation, recursion, exceptions, continuations, garbage collection and multi-threading. It is a state-machine that takes its name from the four components of each state: Control, Environment, Store, and Kontinuation.

In this article, I implement a concrete CESK machine to interpret a dynamically typed object-oriented language abstracted from Dalvik byte code. Every byte code and its semantics have been transformed into this language.

Implementing a More Powerful Regular Expression Engine in 3 Languages

| Comments


In class today, I sat through my second lecture on the power behind Regular Expressions with derivatives. The professor live coded a Regular Expression engine in Python during class that has all of the regular language functionality and more than the ones you will find in perl, ruby, python, java, boost, etc.

Why would language implementers not use the more powerful way? Because long ago it was thought that Brzozowski’s derivative method was too costly so everyone used Thompson’s method which is fast for most operations but suffers from possible exponential blowup with some operations.

With derivatives we get those operations back (Intersection, Difference, Complement) and significantly decrease the complexity of implementing regular expressions.

I will implement all operations for regular languages in Python, Javascript and Ruby for fun and posterity.