Predicting commercial real estate prices with machine learning
- Description:
- In this project, I develop a machine learning application that identifies investment opportunities in the commercial real estate market in Moscow.
- The task is framed as a regression problem: estimating the price of a property from features retrieved from public online listings.
- Classical statistical models as well as machine learning methods have been tested, including linear regression, Lasso, ridge regression, SVR and gradient boosting.
- Technologies:
- Python, Pandas, Numpy, Scikit-learn, CatBoost, Scikit-Optimize, Requests, Beautiful Soup.
- Data:
- Cross-sectional dataset of commercial real estate prices together with several features.
- A web scraper has been developed for real-time streaming of data.
- Code:
- Jupyter Notebook
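The model comparison described above could be set up roughly as follows. This is a minimal sketch rather than the project's code: the data are synthetic, the three features (`area_m2`, `km_to_centre`, `floor`) and the price formula are hypothetical, and only the linear models from scikit-learn are shown.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for scraped listings (NOT real data; all names are hypothetical).
rng = np.random.default_rng(0)
n_listings = 300
area_m2 = rng.uniform(20, 500, n_listings)
km_to_centre = rng.uniform(0.5, 25, n_listings)
floor = rng.integers(1, 30, n_listings)
X = np.column_stack([area_m2, km_to_centre, floor])
# Assumed linear price relation plus Gaussian noise, for illustration only.
y = 0.25 * area_m2 - 1.5 * km_to_centre + 0.1 * floor + rng.normal(0, 5, n_listings)

models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=1.0),
}


def compare_models(models, X, y, cv=5):
    """Return the cross-validated mean absolute error for each model."""
    results = {}
    for name, estimator in models.items():
        # Scale features inside the pipeline so each fold is scaled on its own training part.
        pipeline = make_pipeline(StandardScaler(), estimator)
        scores = cross_val_score(
            pipeline, X, y, cv=cv, scoring="neg_mean_absolute_error"
        )
        results[name] = -scores.mean()  # flip sign: scikit-learn reports negative MAE
    return results


results = compare_models(models, X, y)
for name, mae in results.items():
    print(f"{name}: MAE = {mae:.2f}")
```

Keeping the scaler inside the pipeline avoids leaking information from the validation folds into the penalised models, which matters for Lasso and ridge because their penalties depend on feature scale.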
Time series forecasting with neural networks
Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows one to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scaling effectively to multiple GPUs and multiple machines.
The Gluon package is a high-level interface for MXNet designed to be easy to use, while keeping most of the flexibility of a low-level API. Gluon supports both imperative and symbolic programming, making it easy to train complex models imperatively in Python and then deploy with a symbolic graph in C++ and Scala. Based on the Gluon API specification, the Gluon API in Apache MXNet provides a clear, concise, and simple API for deep learning. It makes it easy to prototype, build, and train deep learning models without sacrificing training speed.
- Description:
- In recent years, advances in deep learning have attracted growing interest in utilising neural nets for economic time series forecasting, though there is currently no consensus on whether deep neural nets outperform traditional methods such as ARIMA and ETS.
- GluonTS is a newly released Python toolkit that aims to facilitate the study of deep learning models for time series forecasting. The package is built by machine learning scientists from Amazon around Apache MXNet (an open-source deep learning framework). Apart from several ready-to-train deep learning models, it contains convenient wrappers for other time series forecasting packages, such as the R forecast package or Prophet, which makes it easier to compare different forecasting methods.
- In this project, we will use GluonTS to study the forecasting performance of neural time series models as well as some traditional forecasting techniques on weekly Brent crude oil price data.
- Technologies:
- Python, GluonTS
- Data:
- Time series dataset of Brent oil prices.
- Code:
- Jupyter Notebook
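Comparing forecasting methods needs a common yardstick. The sketch below does not use GluonTS; it only illustrates, in plain Python, how two simple baselines (a naive forecast and a drift forecast) might be scored with the mean absolute scaled error (MASE) on a hold-out window. The series is a synthetic random walk, not Brent data, and the function names are hypothetical.

```python
import random


def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def drift_forecast(history, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * h for h in range(1, horizon + 1)]


def mase(history, actual, forecast):
    """Forecast MAE divided by the in-sample MAE of the one-step naive forecast."""
    scale = sum(
        abs(history[t] - history[t - 1]) for t in range(1, len(history))
    ) / (len(history) - 1)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


# Synthetic weekly series (random walk), standing in for a price history.
rng = random.Random(42)
series = [60.0]
for _ in range(155):
    series.append(series[-1] + rng.gauss(0, 1.5))

horizon = 12  # hold out the last 12 weeks for evaluation
train, test = series[:-horizon], series[-horizon:]

for name, method in [("naive", naive_forecast), ("drift", drift_forecast)]:
    print(name, round(mase(train, test, method(train, horizon)), 3))
```

A MASE below 1 means the forecast beats the in-sample one-step naive benchmark on average, which gives neural and classical models a shared scale.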
Stochastic frontier analysis of regional competitiveness
- Description:
- According to the OECD, a competitive region is one that can attract and maintain successful firms and maintain or increase standards of living for the region’s inhabitants. Skilled labour and investment gravitate away from uncompetitive regions towards more competitive ones.
- The extension of the competitiveness concept to the regional level is recent but is having a major influence on the direction of regional development policy. The increasing importance of competitiveness issues may be explained by the deeper economic integration and increased globalization, which require a constant increase in the competitive power of every economic entity belonging to a certain country.
- In this project, I have used stochastic frontier analysis (SFA) to measure the regional competitiveness of federal subjects in the Central Federal District of Russia in the spirit of "Stochastic frontier analysis of regional competitiveness" by Furková and Surmanová (2011).
- Technologies:
- R, Frontier.
- Data:
- A balanced panel data set of 18 regions observed over a period from 2004 to 2015, which includes 216 observations in total.
- Code:
- Jupyter Notebook
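For reference, a stochastic production frontier for panel data is commonly written as below. This is the generic textbook form, stated here as an assumption about the setup; the exact specification and the distribution chosen for the inefficiency term in the project may differ.

```latex
\ln y_{it} = \beta_0 + \sum_{k} \beta_k \ln x_{k,it} + v_{it} - u_{it},
\qquad v_{it} \sim N(0, \sigma_v^2), \quad u_{it} \ge 0,
\qquad TE_{it} = \exp(-u_{it})
```

Here y is the output of region i in year t, the x terms are inputs, v is symmetric random noise, and u is the one-sided inefficiency term. The technical efficiency TE lies between 0 and 1, and a study of this kind can use it as a score for ranking regions.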
Real options valuation
R Programming Language
The YUIMA project is an open source academic project aimed at developing a complete environment for estimation and simulation of stochastic differential equations and other stochastic processes via the R package yuima.
Python
xlwings is a Python library that makes it easy to call Python from Excel and vice versa. Statsmodels is a Python package that allows users to explore data, estimate statistical models, and perform statistical tests.
Excel
Microsoft Excel is a spreadsheet application developed by Microsoft for Windows, macOS, Android and iOS. It features calculation, graphing tools, pivot tables, and a macro programming language called Visual Basic for Applications.
- Description:
- In this set of mini-projects, I have used different technologies, such as Python, R, and Excel, to perform real options valuation (ROV) of a fictional soybean processing plant.
- Several popular ROV techniques have been implemented, such as binomial lattices and the least-squares Monte Carlo (LSM) method.
- Technologies:
- Python, R, Excel, Yuima, xlwings, Statsmodels
- Data:
- Time series dataset of soybeans, soybean oil and soybean meal prices.
- Code:
- GitHub repository
| # | Model name | Link |
|---|---|---|
| 1 | Binomial options pricing model | Jupyter Notebook |
| 2 | LSM, single stochastic factor | Jupyter Notebook |
| 3 | LSM, multiple stochastic factors | Jupyter Notebook |
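A binomial lattice of the kind listed in the table can be sketched in a few lines. The function below is a generic Cox-Ross-Rubinstein tree with an optional early-exercise check, not the notebook's model; the function name `crr_lattice_value` and all inputs (underlying value 100, exercise cost 100, 5% risk-free rate, 20% volatility, one year) are hypothetical.

```python
import math


def crr_lattice_value(s0, strike, rate, sigma, maturity, steps,
                      kind="call", american=False):
    """Value an option on a Cox-Ross-Rubinstein binomial lattice."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    discount = math.exp(-rate * dt)
    # Risk-neutral probability of an up move.
    p_up = (math.exp(rate * dt) - down) / (up - down)
    sign = 1.0 if kind == "call" else -1.0

    def payoff(level, ups):
        # Underlying value after `ups` up moves out of `level` steps.
        underlying = s0 * up ** ups * down ** (level - ups)
        return max(sign * (underlying - strike), 0.0)

    # Payoffs at maturity, indexed by the number of up moves.
    values = [payoff(steps, j) for j in range(steps + 1)]

    # Roll back through the lattice one time step at a time.
    for level in range(steps - 1, -1, -1):
        for j in range(level + 1):
            value = discount * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
            if american:
                # Exercise early when that is worth more than waiting.
                value = max(value, payoff(level, j))
            values[j] = value
    return values[0]


# Hypothetical inputs, for illustration only.
params = dict(s0=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0, steps=200)
euro_call = crr_lattice_value(kind="call", **params)
euro_put = crr_lattice_value(kind="put", **params)
amer_put = crr_lattice_value(kind="put", american=True, **params)
print(f"European call: {euro_call:.4f}")
print(f"European put:  {euro_put:.4f}")
print(f"American put:  {amer_put:.4f}")
```

In a real options setting, the underlying would be the value of the project's cash flows and the strike the cost of the investment or expansion decision, so the early-exercise check corresponds to the flexibility to act before the option expires.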