Date of Award

Fall 2018

Project Type

Thesis

Program or Major

Information Technology

Degree Name

Master of Science

First Advisor

Michael Jonas

Second Advisor

Mihaela Sabin

Third Advisor

Timothy Chadwick

Abstract

The objective of this thesis was to explore scalable web development and to answer the question, “How do you scale a website to handle more traffic at peak times without wasting resources?” This research matters to any web company whose costs rise as demand for its website increases. Every online business would be wise to prepare for more web traffic before it occurs, without spending the budget of a multi-million-user web company during low-traffic periods. An error that appears just as the customer base starts to arrive gives new users a bad first impression and results in lost revenue.

Scalable software development architectures, including microservices, big data, and Kubernetes, were studied, along with comparable web service companies including Facebook, Twitter, and Match.com. A scalable architecture was designed for a social media web service, MeAndYou, using the big data configuration with a shared Aurora database, and it was configured in Amazon Web Services (AWS) using an auto-scaling group attached to a load balancer. It was tested using a custom multithreaded, Selenium-based Python script that applied simulated user load to the servers. As the load was applied, AWS added more Elastic Compute Cloud (EC2) instances running a virtual disk image of the web server. After the load was removed, AWS terminated the extra instances automatically to save costs.
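The threaded load test described above can be illustrated with a minimal sketch. This is not the thesis's script: the function name `apply_load` and the `visit` callable are hypothetical, and `visit` stands in for whatever a Selenium-driven browser session would do against the site. The sketch only shows the general pattern of one thread per simulated user, with per-request latencies collected for later inspection.

```python
import threading
import time
from typing import Callable, List


def apply_load(visit: Callable[[int], None], users: int, visits_per_user: int) -> List[float]:
    """Run `users` threads, each calling `visit` repeatedly.

    Returns the latency of every call in seconds. `visit` is a hypothetical
    stand-in for one simulated page interaction (for example, a Selenium
    session loading a page); it receives the simulated user's id.
    """
    latencies: List[float] = []
    lock = threading.Lock()  # protects the shared latency list

    def simulate_user(user_id: int) -> None:
        for _ in range(visits_per_user):
            start = time.perf_counter()
            visit(user_id)
            elapsed = time.perf_counter() - start
            with lock:
                latencies.append(elapsed)

    threads = [threading.Thread(target=simulate_user, args=(i,)) for i in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every simulated user has finished
    return latencies
```

Raising `users` increases the number of concurrent simulated sessions, which is the kind of pressure that would cause an auto-scaling group to add instances once its configured threshold is crossed.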

Several steps were taken before testing to make the web service larger and more scalable than it originally was, including adding more fields to user profiles, adding more search types, and separating the layers of code into different Hypertext Preprocessor (PHP) files in the front-end. A version control system was configured on the servers using GitHub and rsync. The designed system architecture suggests that the Match Engine should use a stream processing message queue, which would allow the system to process searches one at a time as they are created, with horizontal scaling capabilities, rather than loading the entire database into memory. The backend Match Engine was also tested for accuracy using Structured Query Language (SQL) injection, which showed how the match algorithm should be improved in the future.
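The idea of handling searches one at a time through a queue can be sketched as follows. This is an illustration under stated assumptions, not the Match Engine's design: an in-process `queue.Queue` stands in for a real stream processing message queue, and the names `run_match_workers` and `match_one` are hypothetical. Each worker takes a single search from the queue and resolves it independently, so adding workers is a rough analogue of horizontal scaling.

```python
import queue
import threading
from typing import Any, Callable, Dict, Iterable, List, Tuple

_STOP = object()  # sentinel telling a worker to shut down


def run_match_workers(
    searches: Iterable[Tuple[int, Any]],
    match_one: Callable[[Any], List[Any]],
    workers: int = 2,
) -> Dict[int, List[Any]]:
    """Process (search_id, criteria) pairs one at a time across worker threads.

    `match_one` is a hypothetical function that resolves a single search,
    for example by issuing one targeted database query for that search.
    """
    pending: "queue.Queue[Any]" = queue.Queue()
    results: Dict[int, List[Any]] = {}
    lock = threading.Lock()  # protects the shared results dictionary

    def worker() -> None:
        while True:
            item = pending.get()
            if item is _STOP:
                break
            search_id, criteria = item
            matches = match_one(criteria)
            with lock:
                results[search_id] = matches

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for search in searches:
        pending.put(search)  # searches enter the queue as they are created
    for _ in threads:
        pending.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

In a deployed system the queue would be an external service shared by several machines; the in-process version here only demonstrates the per-search processing pattern.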
