Thoughts on the Internet and Climate

I think we're all aware of how power-hungry our lifestyles are. At least to some extent. Lately I've been pondering some of the ways the internet is hurting our climate. My current train of thought on this started with two things: Bitcoin/cryptocurrencies, and the energy consumption of always-on home electronics. Because the internet really is a huge amount of always-on electronics. Granted that some of it is used very efficiently, but... Are we efficiently solving problems that don't really exist?

Bitcoin is a Climate Disaster

The BTC network consumes more electricity annually than a number of whole countries (Argentina and the Netherlands are the ones I remember right now). But what does it really do? I mean, I definitely see the freedom/decentralisation arguments for cryptocurrencies, but from its inception BTC has primarily been used for financial speculation, secondarily for illicit and unethical transactions, and only to a very, very tiny extent for actual good.

"Bitcoin consumes 'more electricity than Argentina'", BBC

We've essentially found a very good way of directly translating massive power usage from mostly coal power plants into money for financial speculators and criminals. I don't see how that's good in any way. I mean, I could spend the rest of this post ranting about how horribly bad cryptocurrencies are for virtually everyone and society as a whole. But let's leave it here for now.

Data in Transit

We send a lot of data. A lot. Purportedly something like 98+% of emails sent are spam. Web crawlers are visiting my blog (and probably yours) at a rate massively disproportionate to the human visitors. This isn't just my blog, either:

Bots are responsible for more internet traffic than humans.

It's a fair assumption that most of these data transfers do not improve the lives or experiences of users at all. I haven't posted anything since February 9th; all the crawling and feed polling done on my site since then is simply wasted energy from an optimization perspective. Not to mention all the email that reaches its target only to be sent to the spam folder -- or worse, actually wastes the time and possibly money of its human recipient. A large portion of bot traffic is also nefarious: it's often used to simulate human behaviour for one reason or another (fooling ad networks into paying out for clicks that weren't generated by actual users, for example), or to look for exploits in networks, servers, and clients. Or plain attacks. One sustained large-scale DDoS attack can generate quite large amounts of traffic, and attacks of all scales take place all the time.

A lot of data in transit is also TLS certificates. These sometimes outweigh the resource they're protecting (my atom feed, for example, is smaller than my Let's Encrypt certificate). Of course not using encryption is really not an option, but it's sad that that's the case.

Data at Rest

According to a colleague of mine who used to work at Google, all my Gmail emails are stored in at least seven separate locations. Most of it is on spinning disks, because SSDs are expensive. All of it is saved.

Consider that for a moment. How many emails do you have that include the entire thread the message is a part of? (I never understood why this is the default behaviour of email clients.) How many emails do you have that are older than a couple of months but that you still need to read sometime? How many spam emails do you think have been saved over the years for machine learning spam detection to train on?

For that matter, how much data do you have on cloud drives or online photo albums? For every GB of data you see in a cloud service I guarantee you there are at least three backups. At the data centre I used to work at, one of the systems had twelve different fully provisioned environments for different purposes. Plus a backup server with a number of the latest database backups, which was itself backed up. The main database in the system was 1.5TB, and each environment had a number of servers with at least 100GB of drive apart from the database itself. Adding another 100GB to the database disk was something that truly incurred costs, but how much energy it expended was something we never looked at: both the embodied energy of the disks and server hardware, and the electricity consumed while running.
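To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The environment count, database size, and per-server disk size are the figures mentioned above. Everything else is an assumption: that every environment carries a full-size copy of the database, and that each has three servers (SERVERS_PER_ENV is a made-up placeholder, since the real number varied).

```python
# Rough storage footprint for one system with many fully provisioned environments.
# ASSUMPTIONS: every environment holds a full-size copy of the database, and
# SERVERS_PER_ENV is a hypothetical placeholder, not a measured figure.

ENVIRONMENTS = 12       # fully provisioned environments
DATABASE_GB = 1500      # the 1.5TB main database
SERVER_DISK_GB = 100    # "at least 100GB" per server, so this is a lower bound
SERVERS_PER_ENV = 3     # hypothetical


def footprint_gb(environments: int, database_gb: int,
                 servers_per_env: int, server_disk_gb: int) -> int:
    """Return total provisioned disk in GB, not counting any backups."""
    per_environment = database_gb + servers_per_env * server_disk_gb
    return environments * per_environment


total = footprint_gb(ENVIRONMENTS, DATABASE_GB, SERVERS_PER_ENV, SERVER_DISK_GB)
print(f"{total} GB provisioned (about {total / 1000:.1f} TB), before any backups")
```

Under those assumptions a single 1.5TB database turns into more than 20TB of provisioned disk, and that is before counting the backup server or the backups of the backup server.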

It adds up. This is why I host my blog and gemlog on my own hardware instead of a VPS. Sure, my Raspberry Pi consumes energy all the time, but it's a tiny bit. And most of all I only have one backup, on an SSD (actually both my main drive and backup drives are 128GB USB thumbdrives).

Machine Learning

Speaking of machine learning, the process of training any sort of machine learning system is ridiculously energy intensive. Running a trained system isn't very demanding, but training it requires a whole lot of hardware and electricity.

I Don't Have Solutions

I mean, I could go on and on here. The amount of embodied energy in redundant hardware alone is probably staggering, for example. 95% uptime is cheap and easy, but 99.999% requires an order of magnitude more resources. How do we stop and think about what we need, as opposed to what we want?
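To put those uptime percentages in perspective, here is a small Python sketch that converts an availability target into the downtime it permits per year. It assumes a 365-day year and says nothing about what the extra hardware actually costs; it only shows how little slack the higher targets leave.

```python
# Downtime allowed per year for a given availability target.
# ASSUMPTION: a 365-day year (leap years ignored).

MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year that the target permits."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (95.0, 99.0, 99.9, 99.999):
    minutes = downtime_minutes_per_year(target)
    days = minutes / (24 * 60)
    print(f"{target}% uptime allows {minutes:,.1f} minutes/year ({days:.2f} days)")
```

95% uptime leaves a bit more than 18 days of downtime per year, while 99.999% leaves a little over five minutes. Closing that gap is what all the redundant hardware is for.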

I'm no better than you here, dear reader. I own more computers than I use, I subscribe to no fewer than three streaming services, and I run a server for fun.

Stuff like this scares me; if I take virtually any aspect of modern life and zoom in it turns out to be a gargantuan ecological disaster. It's heartbreaking, and it's easy to get disillusioned and cynical. I believe we can improve as a society, though. Talking about it and thinking about it is a good first step.

-- CC0 Björn Wärmedal