weight of the cloud.
what’s in it for you?
• receive a concise overview of ‘the cloud’
• understand the resources necessary for on-demand file storage
• get a carbon estimate for digital duplicates
10-12 minute read
In this article we are going to dive into our digital refuse: the information we store, stuff, and cram into the cloud. There are many wonderful articles on the history of cloud computing and the origin of the term, which was born from a need to market emerging and future technologies [1]. We will skip that in favor of answering the question: how much do our digital duplicates cost to house in the cloud?
This question struck me one day while sitting in my local park in Japan photographing birds. It was early hanami, or flower-viewing season, when people stroll about appreciating the late plum and early cherry blossoms. I was distracted because in my peripheral vision, people kept appearing in the same spot, taking roughly the same stance, and then walking away, all within seconds. These individuals were taking quick snaps of virtually the exact same clump of cherry blossoms. Now that I had recognized the pattern, I could see it all over Tokyo, and it made me wonder: if this is happening year after year, how many effortless, uncomposed, automatic, and likely forgotten photos are we uploading to ‘the cloud’? First, I had to figure out what the cloud was, because like its natural counterparts, it is as complex as it is simple.
what is ‘the cloud’?
The cloud provides on-demand resources, including services ranging from data storage, to proprietary platforms, and even entire software suites [2]. For example, almost all applications are now downloaded from the internet and have some sort of digital tether, or exist completely online like Google Suite and Microsoft 365.
There are at least three broad categories of clouds offering different “as-a-service” solutions (Figure 2), which give customers more or less customization. For example, some companies need to tailor the cloud to their own needs, whereas users of Google Drive, Dropbox, or Adobe software have no significant ability to configure those services. All cloud computing is part of the internet, and ‘the cloud’ refers specifically to different types of resources which one can access over an internet connection [2]. But unlike the clouds in the sky, the cloud is made of plastics, metals, and fiber-optic cables.
When you move a file into Google Drive, there are many exchanges that happen almost instantly before your information encounters one of Google’s behemoth data centers (Figure 3). Your data travels from your device, through your modem/router, to your internet service provider, which consults the Domain Name System to locate the cloud website, a digital interface for all of Google’s data centers. Your information is then broken into packets and sent along the very physical fiber-optic veins that make up the internet. Each of these processes and handoffs requires energy. Depending on your action, your location, and the size of your data, the request may be handled by various data centers of different scales [4]. Data centers are the cloud: real, physical spaces, humming, buzzing, and churning out information 24/7. Of course, this convenience comes at a price.
an electric addiction.
This is where the idea of the ethereal nature of clouds breaks down. Unlike clouds in the sky, cloud computing networks provided by data centers require a lot of resources to maintain their unending uptime. There are a few main resources needed to run a data center which can vary greatly by scale, from local data centers to the latest hyperscale centers. The first cost is obviously electricity. It is not only used to power the center but can also be used to keep it cool.
The International Energy Agency (IEA) estimated that 200 terawatt hours (TWh) were used in 2020 to power global data centers, which collectively use more energy than some countries [5, 6]. As you can see in Figure 4, global data center electricity consumption is situated between Spain, which had a population of 47 million in 2020, and Poland, which had a population of 38 million in 2021 [7, 8].
Electricity can come from a variety of sources: fossil fuels, such as natural gas, coal, and petroleum, or many different renewable energies such as hydroelectric, solar, wind, and ocean currents. Estimates vary, but data centers are thought to account for 1-2% of global carbon emissions annually, which could make the sector similar to the airline industry [10, 11, 12, 13, 14]. Yet a 2016 study sponsored by the U.S. government found that annual electricity use had remained relatively consistent from 2010 onward, despite the reach of data centers greatly increasing, a testament to industry efficiency gains, as shown in Figure 5 [15].
unquenchable thirst.
Water is the next most important resource because it is generally seen as the best option to keep things cool, as it is much less expensive than using electricity. Ironically, data about data centers is very hard to come by. However, sometimes these demands for water spill into the legal arena, giving us a glimpse of plans such as Google’s for a new data center in Texas.
The proposed data center is located near Red Oak, Texas, a suburb outside Dallas with a population of 17,000 people [16]. In Red Oak, the average household of 3 residents uses 60-80 gallons (227-303 liters) per person, per day [17], which is in line with a 2015 study by the U.S. Environmental Protection Agency [18]. Taking the midpoint of 70 gallons, the city of Red Oak therefore uses about 434 million gallons (1.64 billion liters) in a year. Google’s new data center has requested to use up to 1.46 billion gallons (5.53 billion liters) of water annually, which is over 3 times as much as the entire city of Red Oak uses. Ellis County, where Red Oak resides, uses 15 billion gallons (56.78 billion liters) of water annually [19]. Google, therefore, wants approval to use up to about 10% of the county’s total yearly water consumption, or the equivalent of adding about 57,000 more Red Oak citizens. To be fair to Google, this data center will serve, at minimum, the immediate area of Dallas, Ft. Worth, and Arlington, Texas, which has a combined population of 8 million people. However, it certainly puts stress on local water supplies. This is why Google and Microsoft are experimenting with sinking data centers in the ocean, which will likely have other kinds of ecological consequences we cannot cover in this article [20].
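The water comparison above can be rechecked with a few lines of arithmetic. This is only a rough sketch using the figures quoted in the article; the 70-gallon daily value is an assumed midpoint of the 60-80 gallon range, and the function and variable names are illustrative.

```python
# Rough check of the Red Oak water comparison, using figures quoted in the article.
# Assumption: 70 gallons per person per day, the midpoint of the 60-80 gallon range.

def annual_city_use_gallons(population: int, gallons_per_person_per_day: float) -> float:
    """Estimate a city's yearly residential water use in gallons."""
    return population * gallons_per_person_per_day * 365

red_oak_gallons = annual_city_use_gallons(17_000, 70)  # city population from the article
google_request_gallons = 1.46e9                        # requested annual maximum
ellis_county_gallons = 15e9                            # county-wide annual use

ratio_to_city = google_request_gallons / red_oak_gallons
share_of_county = google_request_gallons / ellis_county_gallons

print(f"Red Oak annual use: {red_oak_gallons / 1e6:.0f} million gallons")
print(f"Data center request vs. city: {ratio_to_city:.1f}x")
print(f"Data center request vs. county: {share_of_county:.1%}")
```

With these inputs the city total comes to roughly 434 million gallons, the request is a little over three times that, and it is just under 10% of the county's use. Choosing 60 or 80 gallons instead of 70 shifts the city total and the ratio accordingly.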
quantifying the consumer cloud.
Now that we know what cloud computing is, and some of its major associated resource costs, we can return to our original question. How much duplicate data exists in the consumer cloud (Google Drive, Dropbox, iCloud, etc.), and what does that cost us to maintain?
We are using information for business-to-consumer (B2C) file storage because it is the most complete data we have to work with. Figure 6 shows the available market data for the top five B2C file storage providers. Facts and figures were cross-referenced where possible to create a conservative estimate within the bounds of reason, but this is certainly where the numbers get a little noisy, because big data companies do not like it when you have information about them.
As an example, OneDrive data is from 2015 and Apple data is from 2018, and both Apple and Microsoft have grown their cloud business a lot in recent years. For free accounts, my working assumption is that customers use 2GB of data per account. As for paid accounts, my original estimate was 250GB of data stored per user, but other reports put it as high as 500GB on average [29]. Even so, the 250GB figure seems reasonable given that people are most often backing up their computers and phones and storing photos [30].
In total this is about 104 billion GB of data in free and paid cloud file storage. Using an estimated carbon rate of 0.2 tons per 100GB per year (assuming a 2014 US energy market mix) [21], maintaining this data would generate about 213 million tons of CO₂ emissions each year. This represents about 0.39% of 2021 greenhouse gas emissions for consumer file storage alone. If the total cloud energy consumption is as high as 2% of emissions, then less than one quarter is consumer storage, and over three-quarters are business and enterprise cloud services. As a reminder, none of the data we have is perfect. For example, the carbon rate we are using is 10 years old, and there have been industry efficiency gains, but these figures are still useful to establish an upper boundary for our estimate.
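The carbon estimate above follows from a single multiplication, sketched here with the article's quoted figures. Note that the rounded 104 billion GB total yields roughly 208 million tons, slightly under the 213 million tons quoted; the gap presumably comes from rounding the storage total, which is an assumption on my part. The global emissions total is not stated in the article and is only back-calculated here from the 0.39% share.

```python
# Sketch of the consumer-cloud carbon estimate, using the article's quoted figures.
# All names are illustrative; the storage total is the rounded value from the text.

def storage_emissions_tons(stored_gb: float, tons_per_100gb_year: float) -> float:
    """Annual CO2 (tons) attributed to keeping `stored_gb` gigabytes in the cloud."""
    return stored_gb / 100 * tons_per_100gb_year

stored_gb = 104e9            # about 104 billion GB of free and paid storage
carbon_rate = 0.2            # tons CO2 per 100 GB per year (2014 US energy mix)
article_tons = 213e6         # annual figure quoted in the article
article_share = 0.0039       # 0.39% of 2021 greenhouse gas emissions

estimate_tons = storage_emissions_tons(stored_gb, carbon_rate)
implied_global_tons = article_tons / article_share       # back-calculated, not a cited value
consumer_fraction_of_cloud = article_share / 0.02        # if all cloud use is 2% of emissions

print(f"Estimate from rounded total: {estimate_tons / 1e6:.0f} million tons CO2 per year")
print(f"Implied global emissions: {implied_global_tons / 1e9:.1f} billion tons")
print(f"Consumer share of cloud emissions: {consumer_fraction_of_cloud:.1%}")
```

The last line shows why consumer storage comes out at less than one quarter of the cloud's footprint under the 2% assumption.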
reflections.
My own cloud storage tally comes to around 650 GB of data, a bit higher than average, no doubt, due to my photography. While it hurts to realize, my review revealed that well over half of this does not hold value to me any longer and could safely be deleted. This will not be true for everyone, but it does give us the last piece we need to form an estimate. Therefore, using 50% as a general benchmark, we can calculate that about 0.2%, or one-fifth of a percent, of global CO₂ emissions each year are produced simply to house our digital dust. That number may seem small, but that amount of CO₂ emissions would be the equivalent of you flying around the Earth about 13 million times in a commercial jet each year [37]. If we assume the same for the corporate cloud, which could be ~3x greater, then combined there may be 0.5%-1% of total carbon emissions each year used to maintain files which no longer have value.
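The napkin math in this paragraph can be laid out step by step. The sketch below uses only figures quoted in the article; the 50% "no longer valuable" fraction and the ~3x corporate multiplier are the article's own working assumptions, and the per-flight figure is simply what the other quoted numbers imply rather than a cited value.

```python
# Sketch of the "digital dust" estimate, using figures quoted in the article.

consumer_share = 0.0039        # consumer storage as a share of global emissions (0.39%)
consumer_tons = 213e6          # annual tons of CO2 for consumer storage
valueless_fraction = 0.5       # assumed share of stored data with no remaining value
corporate_multiplier = 3       # corporate cloud assumed ~3x the consumer cloud
flights_around_earth = 13e6    # equivalent flights quoted in the article

dust_share = consumer_share * valueless_fraction
corporate_dust_share = dust_share * corporate_multiplier
combined_dust_share = dust_share + corporate_dust_share

# Tons of CO2 per circumnavigation implied by the article's comparison.
implied_tons_per_flight = consumer_tons * valueless_fraction / flights_around_earth

print(f"Consumer digital dust: {dust_share:.3%} of global emissions")
print(f"Consumer plus corporate: {combined_dust_share:.2%} of global emissions")
print(f"Implied CO2 per flight around the Earth: {implied_tons_per_flight:.1f} tons")
```

The combined share lands inside the 0.5%-1% range given above, and it moves in direct proportion to whichever fraction of data is treated as valueless.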
Admittedly, that was a lot of napkin math, but consider that we are expected to reach 175-200 zettabytes of information in the cloud by next year [38, 39]. For reference, 1 zettabyte is 1 trillion gigabytes. The roughly 104 billion GB of consumer file storage examined in this exercise amounts to only about one-tenth of a single zettabyte, or around 0.05% of that projected total.
Finally, there is one more element to this equation: the psychological toll. It’s not your fault that you don’t spend valuable time deleting closeups of your finger, out-of-focus flowers, or accidental screenshots. You were sold the idea that the cloud was free, an infinite data storage unit, a limitless space to deposit your digital lifestyle. But now you know the weight of the cloud.
our two cents
Unlike the pervasive marketing messaging that the cloud is just an unending expanse where we can ditch our digital clutter, in reality it is a collection of thousands of massive data centers scattered across the globe, each requiring a sizable city’s worth of electricity and water annually. Should we stop taking photos of cherry blossoms or anything that excites us? Not necessarily, but what we should do is ask for more transparency in the resource consumption of digital infrastructure projects. Only time will tell if industry efficiency gains and environmental efforts will keep pace with data storage projections. There is no point in building a robust online world if we are simply poisoning the one we exist in.
-
1. Regalado, Antonio. “Who Coined ‘Cloud Computing’?” MIT Technology Review, October 31, 2011. https://www.technologyreview.com/2011/10/31/257406/who-coined-cloud-computing/.
2. “IaaS vs. PaaS vs. SaaS | IBM,” n.d. https://www.ibm.com/topics/iaas-paas-saas.
3. Singh, Mahendra. Information Systems for Sustainable Business. Independently Published, 2022.
4. Costenaro, David and Anthony Duer. “The Megawatts behind Your Megabytes: Going from Data-Center to Desktop.” (2012).
5. “World Power Consumption | Electricity Consumption | Enerdata,” n.d. https://yearbook.enerdata.net/electricity/electricity-domestic-consumption-data.html.
6. IEA. “Global Data Centre Energy Demand by Region, 2010-2022 – Charts – Data & Statistics - IEA,” n.d. https://www.iea.org/data-and-statistics/charts/global-data-centre-energy-demand-by-region-2010-2022.
7. Troya, María Sosa. “Spain’s Population Falls by 106,000 People in 2020 after Four Years of Growth.” EL PAÍS English, April 21, 2021. https://english.elpais.com/society/2021-04-21/spains-population-falls-by-106000-people-in-2020-after-four-years-of-growth.html.
8. GUS (Statistics Poland). “Informacja o wynikach Narodowego Spisu Powszechnego Ludności i Mieszkań 2021 na poziomie województw, powiatów i gmin.” stat.gov.pl, n.d. https://stat.gov.pl/spisy-powszechne/nsp-2021/nsp-2021-wyniki-ostateczne/informacja-o-wynikach-narodowego-spisu-powszechnego-ludnosci-i-mieszkan-2021-na-poziomie-wojewodztw-powiatow-i-gmin,1,1.html.
9. Rooks, Timothy. “Data Centers Keep Energy Use Steady despite Big Growth.” Dw.Com, January 24, 2022. https://www.dw.com/en/data-centers-energy-consumption-steady-despite-big-growth-because-of-increasing-efficiency/a-60444548.
10. Energy.gov. “Data Centers and Servers,” n.d. https://www.energy.gov/eere/buildings/data-centers-and-servers.
11. Mytton, David. “Hiding Greenhouse Gas Emissions in the Cloud.” Nature Climate Change 10, no. 8 (July 13, 2020): 701. https://doi.org/10.1038/s41558-020-0837-6.
12. Monserrate, Steven Gonzalez. 2022. “The Cloud Is Material: On the Environmental Impacts of Computation and Data Storage.” MIT Case Studies in Social and Ethical Responsibilities of Computing, no. Winter 2022 (January). https://doi.org/10.21428/2c646de5.031d4553.
13. Masanet, Eric, Arman Shehabi, Nuoa Lei, Sarah Smith, and Jonathan Koomey. “Recalibrating Global Data Center Energy-Use Estimates.” Science 367, no. 6481 (February 28, 2020): 984–86. https://doi.org/10.1126/science.aba3758.
14. IEA. “Data Centres & Networks - IEA,” n.d. https://www.iea.org/energy-system/buildings/data-centres-and-data-transmission-networks.
15. “United States Data Center Energy Usage Report | Energy Technologies Area,” n.d. https://eta.lbl.gov/publications/united-states-data-center-energy.
16. United States Census Bureau QuickFacts. “U.S. Census Bureau QuickFacts: Red Oak City, Texas.” Census Bureau QuickFacts, n.d. https://www.census.gov/quickfacts/fact/table/redoakcitytexas/PST045222.
17. “Water | Red Oak, TX - Official Website,” n.d. https://www.redoaktx.org/1066/Water.
18. US EPA. “Statistics and Facts | US EPA,” April 24, 2023. https://www.epa.gov/watersense/statistics-and-facts.
19. Sattiraju, Nikitha (Bloomberg). “The Secret Cost of Google’s Data Centers: Billions of Gallons of Water to Cool Servers.” TIME, April 2, 2020. https://time.com/5814276/google-data-centers-water/.
20. Sattiraju, Nikitha (Bloomberg). “The Secret Cost of Google’s Data Centers: Billions of Gallons of Water to Cool Servers.” TIME, April 2, 2020. https://time.com/5814276/google-data-centers-water/.
21. Stanford Magazine. “Carbon and the Cloud,” n.d. https://stanfordmag.org/contents/carbon-and-the-cloud.
22. Curry, David. “Apple Statistics (2024).” Business of Apps. Accessed May 5, 2023. https://www.businessofapps.com/data/apple-statistics/.
23. Keizer, Gregg. “Microsoft’s OneDrive Changes: Follow the Money.” Computerworld, November 9, 2015. https://www.computerworld.com/article/3003140/microsofts-onedrive-changesfollow-the-money.html.
24. “Mega Transparency Report.” Mega, September 2021. Accessed May 5, 2023. https://web.archive.org/web/20211020234056/https://mega.io/Mega_Transparency_Report_September_2021.pdf.
25. Microsoft News Center. “Microsoft Cloud Strength Drives Fourth Quarter Results - Stories.” Stories, July 22, 2020. https://news.microsoft.com/2020/07/22/microsoft-cloud-strength-drives-fourth-quarter-results-2/.
26. Stagnitto, Jason. “37 Cloud Computing Statistics, Facts & Trends for 2024.” Cloudwards, February 2, 2024. https://www.cloudwards.net/cloud-computing-statistics/.
27. SiliconANGLE. “Dropbox Delivers Another Quarter of Steady Revenue Growth, Pleasing Investors,” May 5, 2023. https://siliconangle.com/2023/05/04/dropbox-delivers-another-quarter-steady-revenue-growth-pleasing-investors/.
28. Sebastian, Nathan. “Usage & Trends of Personal Cloud Storage: GoodFirms Research,” n.d. https://www.goodfirms.co/resources/personal-cloud-storage-trends.
29. Armstrong, Martin. “What’s in the Cloud?” Statista Daily Data, September 30, 2021. https://www.statista.com/chart/25896/gcs-cloud-storage-services-usage/.
30. Novet, Jordan. “The Case for Apple to Sell a Version of iCloud for Work.” CNBC, February 14, 2018. https://www.cnbc.com/2018/02/11/apple-could-sell-icloud-for-the-enterprise-barclays-says.html.
31. Duarte, Fabio. “Google Workspace User Stats (2024).” Exploding Topics (blog), December 6, 2023. https://explodingtopics.com/blog/google-workspace-stats.
32. Dean, Brian. “Dropbox Usage and Revenue Stats (2023).” Backlinko, August 21, 2023. https://backlinko.com/dropbox-users.
33. MacroTrends. “Dropbox Revenue 2016-2023 | DBX,” n.d. https://www.macrotrends.net/stocks/charts/DBX/dropbox/revenue.
34. Dimitrov, Ivan. “Stacks of Storage: How Much Space Does Your Data Take up? - The pCloud Blog.” The pCloud Blog, December 3, 2020. https://blog.pcloud.com/stacks-of-storage/.
35. Shen, Kai, Anand Swaminathan, Xiaoxiao Tong, and Kevin Wei Wang. “Cloud in China: The Outlook for 2025.” McKinsey & Company, July 8, 2022. https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/cloud-in-china-the-outlook-for-2025.
36. Dropbox. “Dropbox Announces Fourth Quarter and Fiscal 2022 Results | Dropbox,” n.d. https://investors.dropbox.com/news-releases/news-release-details/dropbox-announces-fourth-quarter-and-fiscal-2022-results.
37. “Myclimate – Your Partner for Effective Climate Protection,” January 30, 2024. https://www.myclimate.org/en/.
38. Daigle, Brian. “Data Centers Around the World: A Quick Look.” United States International Trade Commission. United States International Trade Commission, May 2021. https://www.usitc.gov/publications/332/executive_briefings/ebot_data_centers_around_the_world.pdf.
39. Morgan, Steve. “The 2020 Data Attack Surface Report,” n.d. https://cybersecurityventures.com/wp-content/uploads/2020/12/ArcserveDataReport2020.pdf.