Self-hosting - my journey - Part 1 of 3: Setting up a self-hosting system
I get the impression that more and more people, especially here in the /e/ community, are trying to reclaim their privacy. To me, privacy means having control of the place where I store my private data. It has been more than five years since I started to self-host my own private data, and I believe that sharing my experiences, struggles and thoughts from this process might be beneficial to this community.
Self-hosting can be approached in different ways with different levels of difficulty, from a “do it all yourself” approach to “buy a product that does the work”. This text reflects my experiences with the “do it yourself” approach, which also means that it gets technical in some places. However, I try to mention the alternatives and shortcuts that I came across.
The target audience is people who want to start self-hosting and would like some pointers on where to begin. A text like this cannot cover all the detailed work that is necessary to set up such a self-hosted system, but I believe that it can give a general guideline that helps with taking one step after the other.
Also, such a journey entails setbacks, getting stuck and all kinds of demotivating situations, so be prepared to read manuals, dig through tech forums, search for error messages or even file bug reports for the software you want to use. There may be times when it is healthy to let the system sit for a week and clear one's head.
This first part is about setup options and getting a basic system running.
The second part will cover software packages used for the required services.
The third and final part is about additional related topics and concludes this series.
Motivation
Looking back, my initial main motivation for setting up a self-hosting system was laziness.
It all started when I was about to purchase my first smartphone and thought about the process of transferring my contact data to the new phone. Entering all the data by hand seemed like a tedious task (which I had already done with my “pre-smartphone-mobiles”) and I wanted a more elegant solution, one that I would not have to repeat for every new device.
Seeing how other family members struggled to transfer their contact data every time they purchased a new phone (contacts were stored on a mixture of SIM card, phone memory and Google accounts), I decided to implement a centralized solution for myself.
I discovered CardDAV, which seemed to be a reasonable standard that also looked widely adopted.
Looking at the sibling protocol CalDAV, it also seemed like a good idea to create a centralized calendar that I could access from different devices.
A further incentive for self-hosting was to have a backup archive for my e-mail conversations, source code and important documents, which at that time got backed up once or twice a year at most.
Setup
Self-hosting in the widest interpretation can take different shapes: for example a website at a web hosting provider, a managed or unmanaged rented server, or a server that is located at home and reachable through one's own internet provider.
Having a website is nice and easy to set up and maintain, but it did not cover several of my intended goals. For example, I was unable to find a reasonable way to archive my e-mails in this scenario, so this solution was not suitable for me.
Renting my own virtual server at a hosting provider was something that I considered and that seemed to meet my needs, but I decided against it since I did not want to have my private data lying around on the servers of some company.
So my final approach was to set up a server at home, connected to my router and accessible from the internet, allowing me to reach my contacts, calendar and other services from anywhere.
Hardware
Some companies offer prebuilt and preconfigured servers for exactly this purpose. In order to decide whether the functionality they offer suits one's needs, be prepared to invest time in reading their manuals. Since I wanted to be as flexible as possible, I decided against such a preconfigured server. However, for someone who is just getting started, this might be a reasonable choice.
If the only use case is to access files remotely, then a network-attached storage (NAS) might be all one needs, and there are companies that specialize in selling NAS servers; a minimal solution in that case would be to use a router that allows remote access to an attached external hard disk.
Back when I started, smaller single-board computers were just starting to gain traction, so they were not on my radar, but nowadays they seem like a reasonable alternative.
During my research for the hardware of my first server, I had to take several aspects into account.
- Location: The server would be located at a place where it could be heard even at night, so my first requirement was that it had to be absolutely noiseless.
- Running cost: A server running 24/7 is nice in wintertime for additional heating, but it also consumes power around the clock, so I looked into hardware with minimal power consumption.
- Size: Since I did not need much extensibility for this server, a very small chassis would be my ideal setup, so that I could place it near my router.
I ended up with a Mini ITX board, a passively cooled case without fans, 2GB RAM, a 64GB SSD and a 36W power supply for around 230€.
Two years later it got upgraded to 16GB RAM, 500GB SSD + 60W power supply.
At the moment I am on my third-generation server. Unfortunately, at the time this upgrade was due, there were no suitable boards available that had low power requirements, so this time I went with a passively cooled CPU with a 65W TDP and a 120W power supply, even though it is far too powerful for my needs as a server (currently I am donating some CPU cycles to an open source project).
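To put the running-cost consideration into numbers, here is a rough Python sketch of the arithmetic for a machine that runs around the clock. The wattages are the power supply ratings mentioned above, which are upper bounds rather than measured consumption, and the electricity price is purely an assumed value that has to be replaced with one's own tariff.

```python
HOURS_PER_YEAR = 24 * 365  # a server that is never switched off


def annual_energy_kwh(average_watts: float) -> float:
    """Energy used in one year of continuous operation, in kWh."""
    return average_watts * HOURS_PER_YEAR / 1000


def annual_cost(average_watts: float, price_per_kwh: float) -> float:
    """Yearly electricity cost in the currency of price_per_kwh."""
    return annual_energy_kwh(average_watts) * price_per_kwh


if __name__ == "__main__":
    PRICE_PER_KWH = 0.30  # assumption for illustration, not a figure from this text
    for watts in (36, 60, 120):  # power supply ratings, i.e. worst-case draw
        print(f"{watts:>3} W: {annual_energy_kwh(watts):7.1f} kWh/year, "
              f"{annual_cost(watts, PRICE_PER_KWH):6.2f} per year")
```

The real consumption of an idle server is usually well below the rating of its power supply, so measuring the actual draw gives a much better estimate.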
One risk that I always try to minimize is a hard disk failure. Setting up the system again and restoring the data is a lot of work, so I replace the hard disk at least once every two years.
For setting up the system one should keep in mind that a screen, a network cable and a keyboard come in very handy, while during operation I usually have only the network cable attached, which also allows me to maintain the server remotely.
I gave some thought to Wi-Fi-vs-cable, but since cable is much more reliable, I went with that.
Operating System
Operating systems come in many different variants.
- For me Windows was out of the question, because my intention of using open source software would be much more difficult to implement on such a server.
- Since I have never tried macOS, I lack knowledge about using it as the operating system for a self-hosting system.
- From what I can tell, the BSD family of operating systems contains good candidates for such a system, but I have only come into contact with them a few times.
- Before I started the whole self-hosting process, I already had experience with Linux, so for me it was natural to install the Linux distribution of my choice on the newly acquired server.
Besides the usual aspects for choosing an operating system (familiarity, hardware support, security, stability, community & support, up-to-dateness), one additional aspect for choosing a distribution should be its software packages. Open source software is distributed in different forms, listed here from the most difficult to the easiest to install:
- source code: Installing software from the source code is usually the most difficult method, as it requires the installation of all prerequisites necessary to compile and run the software.
- binary zipped format: Installing zipped packages requires one to install the additional packages that are essential to run the software.
- binary distribution package: Some open source software products provide packages for certain OS distributions that contain metadata about which additional software is required in order to run that software. This makes an installation simpler, as the dependencies can be installed automatically. While there are several package formats, I would like to single out deb and RPM because of their widespread usage, which means that software projects are more likely to provide packages in these formats than in other formats.
- part of the package repository of the operating system: This is the easiest way to install software because it is available via the package manager of the operating system.
Choosing a distribution that has a large repository of packages will make it easier to set up the whole system, but this might come with drawbacks like outdated packages.
Usually all operating systems provide adequate information about their installation process, which will guide one through the complete installation.
At some point during the installation there usually is a question about the language that should be used, and two options are likely to cross one's mind: one's own native language and English (if they are not the same). The advantage of using one's native language is that it is easier to understand. However, there are more documentation and tutorials available in English, so it might be easier to find help when searching for English terms and error messages.
Domain names and DynDNS
An easy way to initiate communications to the server from the outside is by using a domain name.
Short excursion:
Computers talk with each other via IP addresses (like 51.15.106.51). Since these are difficult to remember, the Domain Name System (DNS) was invented to translate human-readable domain names like “community.e.foundation” to their computer-readable counterpart “51.15.106.51”. Dynamic DNS (DynDNS) extends the functionality of DNS servers and allows users to easily and quickly change the IP address of a domain name on the fly.
My dial-up connection resets once every 24 hours, which also changes the outside IP address of my server. So in order to be able to access it consistently from the outside with my chosen domain name, I have to use the services of a DynDNS provider.
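The translation from name to address can be observed with a few lines of Python. This is only a small sketch that asks the resolver of the operating system; the function name `resolve_ipv4` is made up for this example, and the addresses returned today may differ from the one quoted above.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) pair.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    try:
        print(resolve_ipv4("community.e.foundation"))
    except OSError as error:  # e.g. no network or unknown name
        print("lookup failed:", error)
```

Passing an IP address instead of a name simply returns that address, since there is nothing to translate.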
While my chosen DynDNS provider offers DynDNS free of charge for subdomains of a few selected domains, I went for the option of having my own domain name for this server.
Currently the domain name costs me about 10€ per year and the DynDNS service $30 per year.
Some routers also offer free DynDNS services, but most of them only allow predefined domain names.
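What a DynDNS client does after each reconnect can be sketched roughly as follows: compare the current outside address with the one last reported and, if it changed, call an update URL at the provider. The endpoint `https://dyndns.example/nic/update` and the parameter names `hostname` and `myip` are assumptions for illustration only; the real URL, parameters and authentication are defined by the chosen provider and have to be taken from its documentation.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlencode


def needs_update(last_reported_ip: Optional[str], current_ip: str) -> bool:
    """True when the outside address differs from the one last sent."""
    return last_reported_ip != current_ip


def build_update_url(endpoint: str, hostname: str, current_ip: str) -> str:
    """Build the request URL that tells the provider about the new address."""
    ipaddress.ip_address(current_ip)  # raises ValueError for a malformed address
    query = urlencode({"hostname": hostname, "myip": current_ip})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    # Hypothetical values: endpoint, host name and addresses are placeholders.
    endpoint = "https://dyndns.example/nic/update"
    last_ip, new_ip = "203.0.113.7", "203.0.113.42"
    if needs_update(last_ip, new_ip):
        print(build_update_url(endpoint, "home.example.com", new_ip))
```

In practice this job is usually done by the router or by a ready-made update client, so writing it by hand is rarely necessary.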
Ports & Port forwarding
Servers usually have to talk to multiple different clients at the same time. In order to make this possible, ports were invented as communication endpoints. Every IP address has ports numbered from 1 to 65535 associated with it, through which services on the computer at that address can be reached.
Some of these port numbers have a standard usage, for example port 80 is the default port for accessing unencrypted websites; when the URL of a website starts with “http://”, the browser tries to access the server on port 80. There is a list of default ports available on Wikipedia.
After setting up DynDNS for the dial-up connection (let’s assume that the domain name is example.com), try to reach the server at http://example.com with a web browser. Most likely there will be no connection at all, or a response like
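How a browser picks the port can be illustrated with a short Python sketch: an explicit port in the URL wins, otherwise the default for the scheme is used. The table below only lists the two web defaults (port 443 for “https://” is the encrypted counterpart of port 80), and the function name `effective_port` is made up for this example.

```python
from urllib.parse import urlsplit

# Default ports for the two web schemes; other protocols have their own defaults.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:  # explicit port, e.g. http://example.com:8080/
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


if __name__ == "__main__":
    for url in ("http://example.com", "https://example.com", "http://example.com:8080/"):
        print(url, "->", effective_port(url))
```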
Unable to connect to remote host: Connection refused
Let’s see what happens here:
- The web browser queries the IP address of the domain “example.com” and gets the corresponding IP address A.B.C.D as response
- The web browser tries to get a connection on port 80 on the IP address A.B.C.D, because it is the default port for “http”
So the question is: What is the server that is reachable at the IP address A.B.C.D? It is usually the router that is plugged into the connector in one's wall, and most routers in their factory settings deny access on all ports from the outside.
At this point it is necessary to forward the request of the web browser on port 80 from the router to one's server. This can be configured in the administrative panel of the router and is called port forwarding.
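Whether a port accepts connections can also be checked without a browser. The following Python sketch simply tries to open a TCP connection and reports whether that worked; the function name `port_is_open` is made up, and example.com stands for one's own domain name. Note that testing from inside the home network can give a different result than testing from the outside, depending on the router.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try to open a TCP connection; refused or timed-out attempts count as closed."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers "connection refused", timeouts and failed DNS lookups
        return False


if __name__ == "__main__":
    print("port 80 reachable:", port_is_open("example.com", 80))
```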
For port forwarding a few details are necessary:
- The external port: in the example above: 80
- Protocol: Most applications use TCP, so this should be selected and others (e.g. UDP) ignored.
- IP address of the target: internal IP address of the new server (the administrative panel of the router usually contains a list of all connected devices and their respective IP addresses)
- Port of the target: the default 80 is a reasonable choice
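The four details above can be thought of as one row in a table that the router consults for every incoming connection. The following Python sketch models such a table; it is not how any particular router stores its configuration, and the internal address 192.168.1.10 is an assumed example value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForwardRule:
    external_port: int  # port the outside world connects to on the router
    protocol: str       # "tcp" or "udp"
    target_ip: str      # internal IP address of the server
    target_port: int    # port the service listens on at the server


def find_target(rules: list[ForwardRule], protocol: str, port: int) -> Optional[tuple[str, int]]:
    """Return (ip, port) of the internal target, or None if nothing matches."""
    for rule in rules:
        if rule.protocol == protocol and rule.external_port == port:
            return (rule.target_ip, rule.target_port)
    return None  # no rule: the router does not let the connection through


if __name__ == "__main__":
    rules = [ForwardRule(80, "tcp", "192.168.1.10", 80)]  # assumed internal address
    print(find_target(rules, "tcp", 80))  # forwarded to the server
    print(find_target(rules, "tcp", 22))  # no rule for this port
```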
After setting up a web server on one's server, reloading the website in the web browser should result in the display of a welcome web page. Web servers usually provide a simple welcome website in their default configuration when accessed on port 80.
Let’s see what happens in this case:
- The web browser queries the IP address of the domain “example.com” and gets the corresponding IP address A.B.C.D as response
- The web browser tries to get a connection on port 80 on the IP address A.B.C.D, because it is the default port for “http”
- The router receives a connection on port 80 and forwards it to the server on port 80
- On port 80 of the server the web server listens and gives as answer the content of the welcome website
- The answer is transferred to the router and the router sends the answer back to the web browser
- Finally the web browser displays the website
Now one's own server is accessible from the internet.
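The steps above, apart from the DNS lookup, can be tried out on a single machine with a small Python simulation. Everything here is a stand-in and runs on 127.0.0.1: a tiny web server plays the home server, a simple TCP relay plays the router with its port forwarding, and an HTTP client plays the browser. Random free ports are used instead of port 80, because binding port 80 usually requires administrator rights. A real router forwards connections in its firmware, not like this.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

WELCOME = b"Welcome to my server"


class WelcomeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(WELCOME)))
        self.end_headers()
        self.wfile.write(WELCOME)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reports the end of the stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream on
        except OSError:
            pass


def forward_forever(listener: socket.socket, target: tuple) -> None:
    """Play the router: accept outside connections and relay them to target."""
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            return  # listener was closed
        upstream = socket.create_connection(target)
        threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()


def fetch_through_router() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), WelcomeHandler)  # the "home server"
    threading.Thread(target=server.serve_forever, daemon=True).start()

    router = socket.socket()  # the "router" with one forwarding rule
    router.bind(("127.0.0.1", 0))
    router.listen()
    threading.Thread(target=forward_forever,
                     args=(router, server.server_address), daemon=True).start()
    try:
        # The "browser" only knows the router's address and port.
        conn = http.client.HTTPConnection("127.0.0.1", router.getsockname()[1], timeout=5)
        conn.request("GET", "/")
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        router.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_through_router().decode())
```

The client never contacts the web server directly; the answer only arrives because the relay passes the bytes in both directions, which is the role the port forwarding rule gives to the router.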