Contributors: Lalit Singh, Ibrahim Malik, Sagar Kumar, Raghav Natarajan
Airbnb is an online marketplace that allows its users to list and rent short-term lodging facilities. The cost to be charged to the customer is decided by the owner of the property while Airbnb receives service charges per booking from each party. Founded in 2008, it is a private business headquartered in San Francisco, California. Airbnb became the market leader in its field through acquisitions of some of the biggest firms competing against it. Today, it has nineteen major offices spread all over the world. The full form of its name is AirBed and Breakfast, which is self-explanatory in terms of the main services provided. The services expanded from the provision of single rooms or beds to entire housing properties very soon.
To highlight the importance of big data to Airbnb, here is a quote from famous best-selling business author Bernard Marr: “Airbnb are a perfect example of a fast-growing company with ever-expanding Big Data needs. The ability to shift and adapt as the company have grown has, I think, been at the heart of their success. This highlights the non-static nature of Big Data and how your data strategy may need to change over time to cope with new demands.” (http://www.cloudcomputing-news.net/news/2016/may/09/airbnb-how-big-data-used-disrupt-hospitality-industry/)
Current use of big data at Airbnb
Airbnb has to sort through a lot of data: approximately 11 petabytes. This includes information on a guest’s lodging preferences, such as whether a guest likes a room with an exceptional view. Using its algorithms, Airbnb can suggest the room best suited to a particular guest. Preferences of this sort fall under four main data categories: 1. Behavioral, meaning how a customer interacts with the website; 2. Dimensional, meaning user attributes such as the device used to access the site or the user’s location; 3. Sentiment, including customer feedback, reviews, ratings and survey statistics; 4. Imputed, meaning preferences inferred about an individual guest. For example, using the algorithms in place, Airbnb can conclude whether you like to stay in rooms with a hot tub or whether you like a room with a great view.
Airbnb divides its data into “gold” and “silver”. Gold is the “source of truth” where, according to the staff, the most important data resides. Silver holds a duplicate of that data, which data scientists can query, process and mine for patterns without interfering with the real-time data.
The data Airbnb receives through its customers comes in both structured and unstructured formats, such as host photos, location data, accommodation features (WiFi, hot tubs) and customer feedback. External data can also be taken into account. For example, prices can be raised when an event is taking place nearby; for instance, the price of a room near the Olympic venues in Brazil could double during the 2016 Olympics. The data is stored in a public cloud on Amazon, in a system based on the open-source Hadoop framework.
Extension of services using big data
Lodging companies like Airbnb can extend their services by combining data from their own servers with data from the wider travel industry. Currently, Airbnb provides a Smart Pricing feature, through which hosts can let Airbnb’s system set the rental price of their apartments, houses or rooms automatically. The Smart Pricing system adjusts the price based on supply and demand in the area for the given date, and on the features, location, amenities, booking history and availability of the rental. However, it doesn’t consider the amount of air, land and water traffic expected in the area at that time. Using data on flight bookings, bus bookings and so on, it is possible to estimate the number of people who will be moving in and out of a city during a particular period. With this data, lodging companies like Airbnb can increase or decrease the number of apartments and rooms offered for rent and adjust prices accordingly. Furthermore, by analyzing the travel routes of past travellers, Airbnb can make suggestions to future travellers. For example, suppose that an analysis of the travel routes of a large number of tourists shows that a tourist visiting Paris has a high chance of making their next trip to Switzerland. Airbnb could then offer features such as combined trip planning and discounts for particular sequences of destinations, so as to provide a better experience to travellers.
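The route-analysis idea can be sketched in a few lines of Python. This is only an illustration of the general approach (counting which destination most often follows another), not a description of any system Airbnb actually runs. The function names, the 50% threshold and the sample itineraries are all assumptions invented for the example.

```python
from collections import Counter, defaultdict


def build_transition_counts(itineraries):
    """Count how often each destination is directly followed by another."""
    transitions = defaultdict(Counter)
    for trip in itineraries:
        for current, following in zip(trip, trip[1:]):
            transitions[current][following] += 1
    return transitions


def suggest_next(transitions, current, min_share=0.5):
    """Return the most common next stop after `current`, or None.

    A suggestion is made only if that stop accounts for at least
    `min_share` of all onward trips (an assumed, tunable threshold).
    """
    onward = transitions.get(current)
    if not onward:
        return None
    destination, count = onward.most_common(1)[0]
    share = count / sum(onward.values())
    return destination if share >= min_share else None


# Hypothetical itineraries, invented purely for illustration.
past_itineraries = [
    ["Paris", "Zurich"],
    ["Paris", "Zurich", "Milan"],
    ["Paris", "London"],
    ["Rome", "Paris", "Zurich"],
]

transitions = build_transition_counts(past_itineraries)
print(suggest_next(transitions, "Paris"))
```

In this made-up sample, three of the four onward trips from Paris go to Zurich, so Zurich is the suggestion. A real system would need far more data and would have to account for seasonality, trip dates and user preferences.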
Airbnb Service Without Big Data
As one of the largest marketplaces for people to find and rent accommodations, Airbnb would not be able to function without Big Data. Airbnb provides its service to over 100 million customers from around the globe, and uses data from these customers to analyze the types of accommodations they need. Without Big Data, Airbnb would face many difficulties in providing its service. As mentioned before, Airbnb uses more than 10 petabytes (10 million gigabytes) of data. Without technology, Airbnb would have to employ tens of thousands of people to analyze all the information from its customers manually. This method would be both more costly and more time-consuming. The pricing of accommodations is also set through analysis of customers’ wants and needs and the location of the accommodation. Without data mining techniques, Airbnb would have conflicts with hosts and customers over pricing, and there would be no quick and easy way to match customers with their preferred accommodation. Big Data not only caters to customer needs, it also takes the preferences of the host into account. Without this, it would be very difficult for hosts and customers to come to an agreement about the living arrangements. The use of Big Data is necessary when operating a large-scale business like Airbnb. Mining millions of pieces of data manually would take far longer than having it done automatically on data servers. It would be almost impossible for Airbnb to provide its service worldwide without Big Data.
Advantages of using big data
- Personalization of customer services
Many social media platforms allow third-party websites to access user information through their public APIs, with the user’s permission. For instance, when users connect their Airbnb account to their Facebook account, Airbnb gains access to their list of Facebook friends, their Facebook timeline and much more. It can therefore access information about the travel history of a user and their friends. If, say, ten of a user’s friends have visited a particular place and 80% of them rated their travel experience 4 or 5 stars, Airbnb can make a suggestion like “Ten of your friends recently travelled to Place X, and most of them rated their overall travel experience 4 stars or higher. Would you like to plan your trip to Place X? We have an awesome travel package with a heavy discount.” Such a personalized suggestion evokes excitement in the user and makes them more likely to accept the offer.
- Expansion of services based on future demand prediction
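The rule described above (enough friends have visited, and a large enough share rated the place highly) can be written as a small Python function. This is a sketch under stated assumptions: the function name, the minimum of five friends and the 80% threshold are illustrative choices, and no real Facebook or Airbnb API is called.

```python
def friend_suggestion(place, friend_ratings, min_friends=5, min_share=0.8):
    """Build a suggestion message if enough friends rated `place` highly.

    friend_ratings: star ratings (1 to 5) given by the user's friends.
    min_friends and min_share are assumed thresholds, not known values.
    Returns a message string, or None if the rule does not fire.
    """
    total = len(friend_ratings)
    if total < min_friends:
        return None
    high = sum(1 for rating in friend_ratings if rating >= 4)
    if high / total < min_share:
        return None
    return (
        f"{total} of your friends recently travelled to {place}, and "
        f"{high} of them rated it 4 or 5 stars. "
        f"Would you like to plan a trip to {place}?"
    )


# Hypothetical ratings, invented for illustration.
print(friend_suggestion("Place X", [5, 5, 4, 4, 5, 3]))
print(friend_suggestion("Place Y", [5, 2, 3, 4, 1]))
```

The first call produces a message because five of six friends gave 4 or 5 stars. The second returns None because only two of five did.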
Data from the travel industry has a lot to tell about the demand for lodging services. If more people are entering a city than leaving it, the demand for food and lodging in that city is likely to increase, and vice versa. By analyzing data from the past, it is possible to predict when demand for different services will be higher. Using this data, travel and lodging companies can expand or scale back their services accordingly.
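As a rough illustration of this reasoning, the Python sketch below compares a city’s net inflow of travellers (arrivals minus departures) with a typical value for the same period. The function name, the 10% tolerance band and the figures are assumptions made for the example; real demand forecasting would use much richer models.

```python
def demand_signal(arrivals, departures, typical_net_inflow, tolerance=0.10):
    """Classify expected lodging demand relative to a typical period.

    arrivals / departures: booked trips into and out of the city.
    typical_net_inflow: historical net inflow for a comparable period.
    tolerance: assumed band around the typical value treated as normal.
    """
    net = arrivals - departures
    band = abs(typical_net_inflow) * tolerance
    if net > typical_net_inflow + band:
        return "higher"
    if net < typical_net_inflow - band:
        return "lower"
    return "typical"


# Hypothetical booking counts, invented for illustration.
print(demand_signal(arrivals=12000, departures=7000, typical_net_inflow=3000))
print(demand_signal(arrivals=5000, departures=4900, typical_net_inflow=3000))
```

With these invented numbers, the first period shows a net inflow of 5000 against a typical 3000, suggesting higher demand, while the second shows a net inflow of only 100, suggesting lower demand.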
- Risk analysis can be performed
It is extremely important for a company to continuously analyze its environment and make use of the data to evolve. Big data allows Airbnb to keep up with changes in the market and technology, and also new trends. It keeps Airbnb well informed of its customers and even the slightest changes in their behaviour. This allows them to be proactive when it comes to evolution and making their services better, saving time and the costs that would have been incurred had they gone for more manual forms of market research.
- The most suitable prices can be reached
Airbnb has introduced a feature called Price Tips, which tells hosts how likely they are to receive bookings at the price they have currently set. “Hosts can glance at a calendar and see what dates are likely to be booked at their current price (green) and which aren’t (red), and they can get price suggestions as well. When hosts price themselves within 5% of the suggested price, they are “nearly four times” as likely to get a booking as when they don’t, Airbnb said.” (Forbes, 2015). These price tips help hosts propose the right amount and take advantage of every opportunity to make money. If the host’s price were far below the optimal range, they would be missing out on the chance to earn more. Conversely, if it were far above, they would receive fewer bookings and also lose income. Another advantage, specific to Airbnb itself, is that the feature effectively gives Airbnb influence over the prices that are set, since users place so much trust in its price suggestions.
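The “within 5% of the suggested price” check from the quote can be expressed directly in Python. This sketch shows only the comparison; how Airbnb computes the suggested price is not described in the source, so it is passed in as a plain number here. The function name and the wording of the returned tips are assumptions.

```python
def price_tip(host_price, suggested_price, tolerance=0.05):
    """Compare a host's price with a suggested price.

    Returns "within range" if the host price is within `tolerance`
    (5% by default) of the suggestion, otherwise a hint on direction.
    """
    if suggested_price <= 0:
        raise ValueError("suggested_price must be positive")
    if abs(host_price - suggested_price) <= tolerance * suggested_price:
        return "within range"
    if host_price > suggested_price:
        return "consider lowering"
    return "consider raising"


# Hypothetical prices, invented for illustration.
print(price_tip(104, 100))
print(price_tip(130, 100))
print(price_tip(80, 100))
```

A price of 104 against a suggestion of 100 falls inside the 5% band, while 130 and 80 fall outside it on the high and low side respectively.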
- Big data creating jobs
Given the amount of data that has to be analyzed and examined, almost 1.5 petabytes, it is nearly impossible for just a handful of people to do the job. The advantage in this case is that the sheer amount of data means many people are employed. Airbnb, for example, started off with three data scientists but had more than one hundred by the end of the year. Now, with over 2000 employees, Airbnb has secured great jobs for people around the globe.
- Making our cities smarter
Because of the vast amount of data they generate and their fast expansion, big cities like New York, Shanghai, Tokyo and Sydney need the information that big data tools provide. For example, in Oslo, Norway, the community was able to reduce street lighting energy consumption by more than 60% with the help of big data. In another case, the Memphis Police Department helped reduce the city’s crime rate by almost 30% by using predictive software coupled with big data (for example, information on where most crimes take place).
- Storage of Data is unlimited
Nowadays, Big Data can be stored online on cloud platforms. This makes it accessible from virtually anywhere and from various devices. For Airbnb, customers and hosts can access the service either through an app on the phone or through the website. As Airbnb has over 11 petabytes of data, storing it in the cloud is a great advantage. It also makes the service mobile, as customers are not restricted to just one way of accessing Airbnb’s services.
- Big Data is processed very fast
The speed at which Big Data is analyzed and processed is all due to the use of modern technology. Data is processed at much higher speeds than it was before the creation of such technology. Modern analytical methods, technologies and tools allow analysts to gain very deep insights into Big Data, which was impossible in the past with limited data volumes and weaker processing tools.
Risks and disadvantages of using big data
- Lack of privacy
Each and every online activity of the user is tracked by travel service companies, so customers might feel insecure, and this can affect how they feel about using the service. If the data (for example location, travel route or search history) is hacked and accessed by unauthorized people, it could pose a serious threat to the user as well as to the company’s reputation. Also, people might want to travel to certain places privately; however, in the process of making suggestions, travel websites might leak information about their travel history to the user’s friends.
- Poor analysis
No matter how credible the data at a company’s disposal might be, a major risk in using it is that the analysis might prove to be wrong. The company could draw causal links between factors that are not actually related, simply because they show corresponding trends in the data. Those trends may be due to a third factor that drives both, or merely to coincidence. For example, suppose the data shows that demand for accommodation in warmer locations increases in November in a certain region. The increase occurs not because November has arrived, but because November is generally cold in that region. If, due to climate change, this year’s November turns out to be warmer than usual, demand for warmer accommodation will not rise as much. In such a situation, the algorithm might still suggest a higher November price for warmer accommodation, which would result in fewer bookings than could have been achieved, and so an overall loss for hosts, guests and Airbnb alike.
- High costs
A major disadvantage of using big data is the cost the company incurs to collect the data, store it and keep it secure from breaches. Collecting data requires dedicated departments to design systems that persuade customers to provide feedback. Robust data storage systems need to be introduced and maintained to ensure the safety of the data, and these practices impose large costs on the firm.
One way to reduce these costs is to collect data at the right times and to guard against information bias, that is, to ensure that every piece of information collected is useful and actually affects decisions. This would mean cutting down the many data collection streams to only those that produce effective results, thus decreasing the costs of collection and maintenance by a significant amount.
- Keeping up with the company’s dramatic growth
Airbnb initially started with 3 employees and, by the year’s end, had more than 10 international offices with hundreds of employees. This meant that the data team could no longer partner directly with every team and regional office. As the chief of data science at Airbnb (Riley Newman) put it, “We needed to find a way to democratize our work, broadening from individual interactions, to empowering teams, the company, and even our community.” With a potential customer base of over 7 billion people, each with their own personal data, Airbnb had to invest in a much larger staff and in faster, more reliable technologies in order to keep up with the ever-growing sea of data. In short, this is costly, but there is no clear way around it.
- Time constraints
The data collected by Airbnb is processed in real time using languages such as Structured Query Language (SQL) and Python, which have to run long and tedious algorithms over very large inputs, so getting results and patterns is usually very time-consuming. The problem arises when there is a delay. In the travel industry, minutes can feel like years. If a system fails or an algorithm takes too long, customers can miss out on a booking, or Airbnb can lose potential business worth hundreds of thousands of dollars.
- Big Data isn’t always secure
Handling Big Data is a demanding job. That is why most companies that use it tend to keep the data in data warehouses, which they regard as more secure than cloud platforms. As the cloud is more susceptible to security threats, data is also kept on servers on the company’s own premises.