As my extra credit project, I propose to examine what factors drive energy use at large apartment buildings in Washington DC.

Washington is one of a small number of cities that have energy disclosure laws. Although its dataset was introduced later than the one created under New York's Local Law 84, it provides rich data with which to analyze and benchmark energy use.

Motivation:
Washington DC has a significant number of large apartment complexes with more than 100 residential units. However, these differ in age and design characteristics: some date from 1900-1945; many others were built in 1945-1975; and a new generation of large, expensive apartment blocks is going up.
Should the city government focus on the older residential buildings, which may have poor build quality and lack good insulation, or on the new luxury buildings, which are ostensibly high-tech and energy efficient but also have swimming pools and large floor layouts? The project should help answer this question by identifying which set of buildings to prioritize based on their relative energy consumption.

Data availability:
The Washington DC Department of Energy and Environment (DOEE) publishes annual data including energy use intensity (EUI, energy use per square foot) and weather-normalized EUI.
City planning data provides floorspace, building age, and the share of the lot covered by the building.

Approach:
The project would merge and clean the relevant datasets.
A linear regression model would be constructed with building EUI as the dependent variable.
During the exploratory phase, additional data columns may be discovered and incorporated into the model.
The validity of the regression model would be tested against a held-out set of Washington DC buildings, and the model would be optimized through feature selection, as well as by checking and correcting for multicollinearity.

Expected outputs:
A simple predictive model for apartment building energy consumption in Washington DC, together with its results.
Identification of the large apartment blocks whose EUI differs most from the predicted value, visualized on a map.
Identification of features associated with high energy use (such as the energy use tendencies of 1950s, 1970s and 2000s buildings).
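
The modeling steps in the approach could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic placeholder data (not DOEE or city planning records), the column names `building_age`, `floor_area` and `lot_coverage` are hypothetical, the coefficients used to generate EUI are arbitrary, and a 75/25 split is assumed for checking the fit on buildings the model has not seen. It shows an ordinary least squares fit, a variance inflation factor (VIF) check for multicollinearity, and a ranking of buildings by how far their EUI is from the prediction.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares for y ~ X; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF per column: 1 / (1 - R^2) from regressing that column on the others."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        coef = fit_ols(others, X[:, j])
        factors.append(1.0 / (1.0 - r_squared(X[:, j], predict(coef, others))))
    return np.array(factors)

# --- Synthetic placeholder data (NOT real Washington DC records) ---
rng = np.random.default_rng(0)
n = 400
building_age = rng.integers(0, 120, n).astype(float)  # years (hypothetical)
floor_area = rng.uniform(0.5, 5.0, n)                  # in 100,000 sq ft (hypothetical)
lot_coverage = rng.uniform(0.2, 0.9, n)                # fraction of lot (hypothetical)
X = np.column_stack([building_age, floor_area, lot_coverage])
# Arbitrary made-up relationship, used only so the example has something to fit.
eui = 80 + 0.3 * building_age + 2.0 * floor_area + 10 * lot_coverage + rng.normal(0, 5, n)

# Hold out 25% of buildings to test the fitted model.
idx = rng.permutation(n)
train, test = idx[:300], idx[300:]
coef = fit_ols(X[train], eui[train])
r2_test = r_squared(eui[test], predict(coef, X[test]))

# Multicollinearity check: large VIF values suggest a feature is redundant.
vif_values = vif(X)

# Buildings whose actual EUI differs most from the predicted value.
residuals = eui - predict(coef, X)
top_outliers = np.argsort(-np.abs(residuals))[:10]

print("held-out R^2:", round(r2_test, 3))
print("VIF:", np.round(vif_values, 2))
print("largest residuals at rows:", top_outliers)
```

In the actual project the synthetic arrays would be replaced by the merged and cleaned datasets, and the rows with the largest residuals would be the candidates to plot on the map.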