$blog→errstr

do not commit

Concrete to Abstract Semantics

| Comments

Introduction

This is the crucial step: abstracting the concrete semantics away so that we can do control flow analysis (CFA). Fortunately, in small-step abstract interpretation, the concrete and abstract semantics are very similar.

In this article, I introduce the concept and need for control flow analysis and show how to convert concrete semantics from the definition of my Concrete CESK machine into abstract semantics in preparation for creating an Abstract CESK machine.

Concrete CESK for Android’s Dalvik

| Comments

Introduction

At the heart of all Android applications is Dalvik byte code. It’s what everything gets compiled to and then run on the Dalvik VM. In order to do static program analysis for Android, you need to somehow interpret the byte code. That’s where the CESK machine shines.

The CESK machine, developed by Matthias Felleisen, provides a simple and powerful architecture used to model the semantics of functional, object-oriented and imperative languages and features like mutation, recursion, exceptions, continuations, garbage collection and multi-threading. It is a state-machine that takes its name from the four components of each state: Control, Environment, Store, and Kontinuation.

In this article, I implement a concrete CESK machine to interpret a dynamically typed object-oriented language abstracted from Dalvik byte code. Every byte code and its semantics have been transformed into this language.

Implementing a More Powerful Regular Expression Engine in 3 Languages

| Comments

Introduction

In class today, I sat through my second lecture on the power behind Regular Expressions with derivatives. The professor live coded a Regular Expression engine in Python during class that has all of the regular language functionality and more than the ones you will find in perl, ruby, python, java, boost, etc.

Why would language implementers not use the more powerful way? Because long ago it was thought that Brzozowski’s derivative method was too costly so everyone used Thompson’s method which is fast for most operations but suffers from possible exponential blowup with some operations.

With derivatives we get those operations back (Intersection, Difference, Complement) and significantly decrease the complexity of implementing regular expressions.

I will implement all operations for regular languages in Python, Javascript and Ruby for fun and posterity.

On Static Program Analysis

| Comments

Introduction

As I stated in my previous post, my undergraduate thesis is on static program analysis of Android applications to prove malicious behavior. In this article, I try to explain what static program analysis is and isn’t and the rational behind using using it for this research. Oh, and there is code to look at, too!